
Chapter 1: Introduction to Computer Vision and Image Processing

Overview: Computer Imaging


Definition of computer imaging: the acquisition and processing of visual information by computer.

Why is it important?

The primary human sense is vision, and information is conveyed well through images (one picture is worth a thousand words). A computer is required because the amount of data to be processed is huge.

Overview: Computer Imaging


Computer imaging can be divided into two main categories:

Computer Vision: applications in which the output is for use by a computer.
Image Processing: applications in which the output is for use by humans.

These two categories are not totally separate and distinct.

Overview: Computer Imaging


They overlap each other in certain areas.

(Figure: a Venn diagram showing Computer Vision and Image Processing as overlapping regions within Computer Imaging.)

Computer Vision

Computer vision does not involve a human in the visual loop. One of the major topics within this field is image analysis (Chapter 2). Image analysis involves the examination of image data to facilitate solving a vision problem.

Computer Vision

The image analysis process involves two other topics:

Feature extraction: acquiring higher-level image information (e.g., shape and color).
Pattern classification: using that higher-level image information to identify objects within the image.

Computer Vision

Most computer vision applications involve tasks that:

Are tedious for people to perform.
Require work in a hostile environment.
Require a high processing rate.
Require access to and use of a large database of information.

Computer Vision

Examples of computer vision applications:

Quality control (e.g., inspecting circuit boards).
Hand-written character recognition.
Biometric verification (fingerprint, retina, DNA, signature, etc.).
Satellite image processing.
Skin tumor diagnosis.
And many, many others.

Image Processing

Processed images are to be used by humans. Therefore, image processing requires some understanding of how the human visual system operates.

Among the major topics are:

Image restoration (Chapter 3).
Image enhancement (Chapter 4).
Image compression (Chapter 5).

Image Processing

Image restoration:

The process of taking an image with some known, or estimated, degradation and restoring it to its original appearance. Done by applying the reverse of the degradation process to the image. Example: correcting the distortion in the optical system of a telescope.

Image Processing

An Example of Image Restoration

Image Processing

Image enhancement:

Improving an image visually by taking advantage of the human visual system's response. Examples: contrast improvement, image sharpening, and image smoothing.

Image Processing

An Example of Image Enhancement

Image Processing

Image compression:

Reducing the amount of data required to represent an image by:

Removing data that are visually unnecessary.
Taking advantage of the redundancy inherent in most images.

Examples: JPEG, MPEG, etc.
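As a toy sketch of the redundancy idea (this is plain run-length encoding, far simpler than JPEG or MPEG; the function name and data are illustrative only):

```python
def rle_encode(row):
    """Run-length encode one row of pixel values as [value, count] pairs."""
    runs = []
    for pixel in row:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([pixel, 1])   # start a new run
    return runs

# A row with long runs of identical pixels compresses very well.
row = [0] * 100 + [255] * 28
print(rle_encode(row))   # [[0, 100], [255, 28]] -- 128 values become 2 runs
```

Real compressors combine redundancy coding like this with transforms and perceptual models.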

Computer Imaging Systems


Computer imaging systems comprise both hardware and software. The hardware components can be divided into three subsystems:

The computer.
Image acquisition: camera, scanner, video recorder.
Image display: monitor, printer, film, video player.

Computer Imaging Systems


The software is used for the following tasks:

Manipulating the image and performing any desired processing on the image data.
Controlling the image acquisition and storage process.

The computer system may be a general-purpose computer with a frame grabber or image digitizer board in it.

Computer Imaging Systems


A frame grabber is a special-purpose piece of hardware that digitizes a standard analog video signal. Digitization of the analog video signal is important because computers can only process digital data.

Computer Imaging Systems


Digitization is done by sampling the analog signal, that is, instantaneously measuring the voltage of the signal at fixed intervals in time. The value of the voltage at each instant is converted into a number and stored. That number represents the brightness of the image at that point.
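A rough sketch of this sample-and-quantize idea (the signal, the 640 samples per line, and the 8-bit range are illustrative assumptions, not the behavior of any particular frame grabber):

```python
import numpy as np

# Hypothetical analog line signal: voltage as a function of time (0..1 volts).
def voltage(t):
    return 0.5 + 0.5 * np.sin(2 * np.pi * 5 * t)

# Sample at fixed intervals in time, then quantize each sample
# to an 8-bit brightness value (0..255).
sample_times = np.arange(0, 1, 1 / 640)          # 640 samples per line
samples = voltage(sample_times)                  # instantaneous voltages
brightness = np.round(samples * 255).astype(np.uint8)
print(brightness[:10])
```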

Computer Imaging Systems


The grabbed image is now a digital image and can be accessed as a two-dimensional array of data.

Each data point is called a pixel (picture element).

The following notation is used to express a digital image:

I(r,c) = the brightness of the image at point (r,c), where r = row and c = column.
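In code, the I(r,c) notation maps directly onto 2-D array indexing. A minimal sketch using NumPy (the pixel values are made up):

```python
import numpy as np

# A tiny 3x4 digital image: rows x columns of brightness values.
I = np.array([[ 10,  20,  30,  40],
              [ 50,  60,  70,  80],
              [ 90, 100, 110, 120]], dtype=np.uint8)

r, c = 1, 2
print(I[r, c])    # brightness at row 1, column 2 -> 70
print(I.shape)    # (rows, columns) -> (3, 4)
```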

The CVIPtools Software


The CVIPtools software contains C functions that perform all the operations discussed in the textbook. It also comes with a GUI application that lets you apply various operations to an image.

No coding is needed. Users may vary all the parameters, and results can be observed in real time.

The CVIPtools Software


It is available from:

The CD-ROM that comes with the book.
http://www.ee.siue.edu/CVIPtools

Human Visual Perception


Human perception encompasses both physiological and psychological aspects. We will focus more on the physiological aspects, which are more easily quantified and hence analyzed.

Human Visual Perception


Why study visual perception?

Image processing algorithms are designed based on how our visual system works.
In image compression, we need to know what information is not perceptually important and can be ignored.
In image enhancement, we need to know what types of operations are likely to improve an image visually.

The Human Visual System


The human visual system consists of two primary components, the eye and the brain, which are connected by the optic nerve.

Eye: receiving sensor (like a camera or scanner).
Brain: information processing unit (like a computer system).
Optic nerve: connection cable (like a physical wire).

The Human Visual System


This is how the human visual system works:

Light energy is focused by the lens of the eye onto sensors in the retina.
The sensors respond to the light with an electrochemical reaction that sends an electrical signal to the brain (through the optic nerve).
The brain uses the signals to create neurological patterns that we perceive as images.

The Human Visual System


Visible light is an electromagnetic wave with wavelengths ranging from about 380 to 825 nanometers. However, the response above 700 nanometers is minimal.

We cannot see many parts of the electromagnetic spectrum.

The Human Visual System


The visible spectrum can be divided into three bands:

Blue (400 to 500 nm).
Green (500 to 600 nm).
Red (600 to 700 nm).

The sensors are distributed across the retina.

The Human Visual System


There are two types of sensors: rods and cones.

Rods:

For night vision.
See only brightness (gray levels), not color.
Distributed across the retina.
Medium- and low-level resolution.

The Human Visual System


Cones:

For daylight vision.
Sensitive to color.
Concentrated in the central region of the eye.
High resolution capability (can differentiate small changes).

The Human Visual System


Blind spot:

Has no sensors; it is the place where the optic nerve exits the eye.
We do not perceive it as a blind spot because the brain fills in the missing visual information.

Why must an object be in the center of the field of vision for us to perceive it in fine detail?
Because that is where the cones are concentrated.

The Human Visual System


Cones have higher resolution than rods because each cone has an individual nerve tied to it, whereas rods have multiple sensors tied to each nerve. Rods react even in low light, but they see only a single spectral band and cannot distinguish colors.

The Human Visual System


There are three types of cones, each responding to different wavelengths of light energy. The colors that we perceive are the combined result of the responses of the three cone types.

Spatial Frequency Resolution


To understand the concept of spatial frequency, we must first understand the concept of resolution.

Resolution: the ability to separate two adjacent pixels. If we can see two adjacent pixels as being separate, then we can say that we resolve the two.

Spatial Frequency Resolution


Spatial frequency: how rapidly the signal changes in space.

Spatial Frequency Resolution


If we increase the frequency, the stripes get closer until they finally blend together.
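One way to see this is to synthesize a vertical stripe (sinusoidal grating) pattern and raise its spatial frequency; a sketch, with the image sizes and cycle counts chosen arbitrarily:

```python
import numpy as np

def grating(width, height, cycles):
    """Vertical sinusoidal stripes; 'cycles' sets the spatial frequency."""
    x = np.arange(width)
    row = 127.5 + 127.5 * np.sin(2 * np.pi * cycles * x / width)
    return np.tile(row.astype(np.uint8), (height, 1))

low  = grating(256, 64, cycles=4)    # wide, easily resolved stripes
high = grating(256, 64, cycles=100)  # stripes near the limit of resolution
```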

Spatial Frequency Resolution

The distance between the eye and the image also affects resolution: the farther away the image, the worse the resolution.

Why is this important? The number of pixels per square inch on a display device must be large enough for us to see an image as realistic; otherwise we end up seeing blocks of color. Consequently, there is an optimum distance between the viewer and the display device.

Spatial Frequency Resolution


Limitations of the visual system's resolution are due to both optical and neural factors:

We cannot resolve things smaller than an individual sensor.
The lens has a finite size, which limits the amount of light it can gather.
The lens is slightly yellow (a tint that progresses with age), which limits the eye's response to certain wavelengths of light.

Spatial Frequency Resolution


Spatial resolution is affected by the average background brightness of the display. In general, we have higher spatial resolution at brighter levels. The visual system also has less spatial resolution for color information that has been decoupled from the brightness information.

Brightness Adaptation
The visual system responds to a wide range of brightness levels. The perceived brightness (subjective brightness) is a logarithmic function of the actual brightness.

However, it is limited by the dark threshold (too dark) and the glare limit (too bright).
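A small sketch of the logarithmic relationship (the base-10 logarithm and unit constant are arbitrary choices; this is only the qualitative model, not a calibrated psychophysical formula):

```python
import numpy as np

# Subjective brightness ~ k * log(actual brightness), within the usable range.
actual = np.array([1, 10, 100, 1000], dtype=float)   # relative intensities
perceived = np.log10(actual)
print(perceived)   # [0. 1. 2. 3.] -- equal ratios appear as equal steps
```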

Brightness Adaptation
We cannot see across the entire range at any one time, but our visual system adapts to the existing light conditions. The pupil varies its size to control the amount of light entering the eye.

Brightness Adaptation
It has been experimentally determined that we can detect only about 20 changes in brightness in a small area within a complex image. However, for an entire image, about 100 gray levels are necessary to create a realistic image.

This difference is due to the brightness adaptation of our visual system.

Brightness Adaptation
If fewer gray levels are used, we observe false contours (bogus lines). These result from gradually changing light intensity not being accurately represented.
Brightness Adaptation

Image with 8 bits/pixel (256 gray levels, no false contours).

Image with 3 bits/pixel (8 gray levels, contains false contours).
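The 3 bits/pixel version can be sketched by discarding the low-order bits of each 8-bit pixel (the ramp image here is a made-up stand-in; on a real photograph the banding appears in smoothly shaded regions):

```python
import numpy as np

def quantize(img, bits):
    """Keep only the top 'bits' bits of each 8-bit pixel."""
    shift = 8 - bits
    return ((img >> shift) << shift).astype(np.uint8)

# A smooth horizontal ramp shows the banding clearly.
img = np.tile(np.arange(256, dtype=np.uint8), (64, 1))
img3 = quantize(img, 3)    # 8 gray levels -> visible false contours
print(np.unique(img3))     # [  0  32  64  96 128 160 192 224]
```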

Brightness Adaptation
An interesting phenomenon that our visual system exhibits, related to brightness, is called the Mach band effect. It creates an optical illusion: when there is a sudden change in intensity, our visual system's response overshoots at the edge.

Brightness Adaptation
This accentuates edges and helps us to distinguish and separate objects within an image. Combined with our brightness adaptation response, it allows us to see outlines even in dimly lit areas.

Brightness Adaptation
An illustration of the Mach band effect. Observe the edges between the different brightness levels; they seem to stand out slightly compared to the rest of the image.

Temporal Resolution
Temporal resolution relates to how we respond to visual information as a function of time.

It is useful when considering video and motion in images.
It can be measured using flicker sensitivity.

Flicker sensitivity refers to our ability to observe a flicker in a video signal displayed on a monitor.

Temporal Resolution
The cutoff frequency is about 50 hertz (cycles per second).

We will not perceive any flicker in a video signal above 50 Hz. TV uses a refresh frequency of around 60 Hz.

The brighter the lighting, the more sensitive we are to changes.

Image Representation
A digital image I(r, c) is represented as a two-dimensional array of data, where each pixel value corresponds to the brightness of the image at the point (r, c). This image model is for monochrome (one-color, or black-and-white) image data.

Image Representation
Multiband images (color, multispectral) can be modeled by a separate I(r, c) function for each band of brightness information. The types of images we will discuss are:

Binary
Gray-scale
Color
Multispectral

Binary Images
Binary images take only two values:

Black and white (0 and 1).
Require only 1 bit/pixel.

They are used when the only information required is shape or outline information. For example:

To position a robotic gripper to grasp an object.
To check a manufactured object for deformations.
For facsimile (FAX) images.

Binary Images

Binary images are often created from gray-scale images via a threshold operation:

White (1) if the pixel value is greater than the threshold.
Black (0) if it is less than or equal to the threshold.
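A minimal sketch of the threshold operation (the threshold value 128 and the tiny input are arbitrary choices):

```python
import numpy as np

def threshold(gray, t):
    """Map a gray-scale image to a binary image: 1 above t, 0 otherwise."""
    return (gray > t).astype(np.uint8)

gray = np.array([[ 12, 200],
                 [130,  40]], dtype=np.uint8)
print(threshold(gray, 128))   # [[0 1]
                              #  [1 0]]
```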

Gray-Scale Images
Gray-scale images are also referred to as monochrome or one-color images. They contain only brightness information and no color information. They typically contain 8 bits/pixel of data, which corresponds to 256 (0 to 255) different brightness (gray) levels.

Gray-Scale Images
Why 8 bits/pixel?

It provides more than adequate brightness resolution.
It provides a noise margin by allowing approximately twice as many gray levels as required.
The byte (8 bits) is the standard small unit in computers.

Gray-Scale Images
However, there are applications, such as medical imaging and astronomy, that require 12 or 16 bits/pixel.

The extra depth is useful when a small section of the image is enlarged, allowing the user to repeatedly zoom in on a specific area of the image.

Color Images
Color images are modeled as three-band monochrome image data, where the values correspond to the brightness in each spectral band. Typical color images are represented as red, green, and blue (RGB) images.

Color Images
Using the 8-bit standard model, a color image has 24 bits/pixel: 8 bits for each of the three color bands (red, green, and blue).

Color Images

For many applications, RGB is transformed to a mathematical space that decouples (separates) the brightness information from the color information. The transformed image has:

A 1-D brightness, or luminance, component.
A 2-D color, or chrominance, component.

This creates a more people-oriented way of describing colors.

Color Images
One example is the hue/saturation/lightness (HSL) color transform:

Hue: the color (green, blue, orange, etc.).
Saturation: how much white is in the color (pink is red with more white, so it is less saturated than pure red).
Lightness: the brightness of the color.

Color Images
Most people can relate to this method of describing color.

A deep, bright orange has a high lightness (bright), a hue of orange, and a high saturation (deep). It is easy to picture this color in the mind. If we instead define the color by its RGB components, R = 245, G = 110, B = 20, we have no idea what it looks like.
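Python's standard colorsys module can convert such an RGB triple into hue/lightness/saturation, letting us check the intuition numerically (colorsys works on 0-1 floats and returns the components in H, L, S order):

```python
import colorsys

r, g, b = 245 / 255, 110 / 255, 20 / 255
h, l, s = colorsys.rgb_to_hls(r, g, b)
print(f"hue={h * 360:.0f} deg, lightness={l:.2f}, saturation={s:.2f}")
# hue = 24 deg (orange), saturation = 0.92 (a deep, vivid color)
```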

Color Images
In addition to HSL, various other formats are used for representing color images:

YCrCb
SCT (Spherical Coordinate Transform)
PCT (Principal Component Transform)
CIE XYZ
L*u*v*
L*a*b*

Color Images
One color space can be converted to another using transformation equations. Example: converting the RGB color space to the YCrCb color space, as sketched below.
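As an illustrative sketch, here is the widely used ITU-R BT.601 full-range form of this conversion; the textbook's exact equations may differ in scaling or offsets:

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """ITU-R BT.601 full-range RGB -> YCbCr; rgb holds floats in 0..255."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    cb = 128 + 0.564 * (b - y)               # 0.564 = 0.5 / (1 - 0.114)
    cr = 128 + 0.713 * (r - y)               # 0.713 = 0.5 / (1 - 0.299)
    return np.stack([y, cb, cr], axis=-1)

print(rgb_to_ycbcr(np.array([245.0, 110.0, 20.0])))
```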

Multispectral Images
Multispectral images typically contain information outside the normal human perceptual range:

Infrared, ultraviolet, X-ray, acoustic, or radar data.

They are not really images in the usual sense (they may not represent a scene from the physical world, but rather information such as depth). The values are represented in visual form by mapping the different spectral bands to RGB.
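A sketch of that mapping idea (the band arrays here are random stand-ins for co-registered sensor data; which band is assigned to which display channel is an arbitrary "false color" choice):

```python
import numpy as np

# Hypothetical co-registered bands from a multispectral sensor.
near_ir = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
red     = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
green   = np.random.randint(0, 256, (64, 64), dtype=np.uint8)

# False-color display: map near-infrared to the red channel, and so on.
false_color = np.dstack([near_ir, red, green])   # shape (64, 64, 3), RGB
```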

Multispectral Images
Sources include satellite systems, underwater sonar systems, airborne radar, infrared imaging systems, and medical diagnostic imaging systems. The number of bands into which the data are divided depends on the sensitivity of the imaging sensor.

Multispectral Images
Most satellite images contain two to seven spectral bands:

One to three in the visible spectrum.
One or more in the infrared region.

The newest satellites have sensors that collect image information in 30 or more bands. Due to the large amount of data involved, compression is essential.

Digital Image File Formats


There are many different types of image file formats, because:

There are many different types of images and applications with varying requirements.
There is a lack of coordination within the imaging industry.

Images can be converted from one format to another using image conversion software.

Digital Image File Formats


Types of image data are divided into two categories:

Bitmap (raster) images: the pixel data and the corresponding brightness values are stored in some file format.
Vector images: lines, curves, and shapes are represented by storing only their key points. The process of turning the key points into an image is called rendering.

Digital Image File Formats


Most of the file formats discussed here fall under the category of bitmap images. Some of the formats are compressed:

The I(r, c) values are not available until the file is decompressed.

Bitmap image files must contain both header information and the raw pixel data.

Digital Image File Formats


The header contains information regarding:

The number of rows (height).
The number of columns (width).
The number of bands.
The number of bits per pixel.
The file type.
The type of compression used (if applicable).

Digital Image File Formats


BIN format:

Contains only the raw data I(r, c) and no header.
Users must know the necessary parameters beforehand.

PPM format:

Contains raw image data with a simple header.
Variants: PBM (binary), PGM (gray-scale), PPM (color), and PNM (handles any of the other types).
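A simplified sketch of reading such a header-plus-raw-data file, using the binary PGM (P5) flavor (this version assumes the three header fields sit on separate lines and that there are no '#' comment lines, which the full format permits):

```python
import numpy as np

def read_pgm(path):
    """Read a binary (P5) PGM file: small ASCII header, then raw pixels."""
    with open(path, "rb") as f:
        assert f.readline().strip() == b"P5"             # magic number
        width, height = map(int, f.readline().split())   # columns, rows
        maxval = int(f.readline())                       # usually 255
        assert maxval < 256                              # 1 byte per pixel
        pixels = np.frombuffer(f.read(), dtype=np.uint8)
    return pixels.reshape(height, width)                 # the I(r, c) array
```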

Digital Image File Formats


GIF (Graphics Interchange Format):

Commonly used on the WWW.
Limited to a maximum of 8 bits/pixel (256 colors); the bits are used as an index into a color lookup table.
Allows a type of compression called LZW.
The image header is 13 bytes long.

Digital Image File Formats


TIFF (Tagged Image File Format):

Allows a maximum of 24 bits/pixel.
Supports several types of compression: RLE, LZW, and JPEG.
The header is of variable size and is arranged in a hierarchical manner.
Designed to allow users to customize it for specific applications.

Digital Image File Formats


JFIF (JPEG File Interchange Format):

Allows images compressed with the JPEG algorithm to be used on many different computer platforms.
Contains a Start of Image (SOI) marker and an application (APP0) marker that serve as a file header.
Used extensively on the WWW.

Digital Image File Formats


Sun Raster file format:

Defined to allow any number of bits per pixel.
Supports RLE compression and color lookup tables.
Contains a 32-byte header, followed by the image data.

Digital Image File Formats


SGI file format:

Handles up to 16 million colors.
Supports RLE compression.
Contains a 512-byte header, followed by the image data.
The majority of the bytes in the header are unused, presumably reserved for future extensions.

Digital Image File Formats


EPS (Encapsulated PostScript):

Not a bitmap image; the file contains text. PostScript is a language that supports more than just images.
Commonly used in desktop publishing.
Directly supported by many printers (in the hardware itself).
Commonly used for data interchange across hardware and software platforms.
The files are very large.
