
1

One purpose of this summit is to understand how vision solutions can make industrial automation smarter. Most pressing needs to automate fall into one of these buckets:
1. Quality inspection
1. You need to improve the quality of product leaving the assembly line, whether the quality concern is packaging, a food product, or a medicine pack.
2. Equipment automation
1. If you work on an automotive assembly line or a product packaging line, you understand the need to automate equipment intelligently so that output is consistent without manual intervention.
3. Identification and traceability
1. Compliance regulations in most countries mandate traceability methods, such as barcodes, for electronics, food, and medical products.
4. Consistency and accuracy
1. In every system, the measurements made must be precise and accurate.
NI Vision provides many algorithms serving a variety of purposes; I am going to focus on the algorithms essential to these four categories.

Example use cases for Quality inspections and Defect elimination


-Identifying defects in metal and glass is very difficult; as you can see, the algorithm must be capable of finding cracks as small as 100 µm.
-For the pharmaceutical industry, wrong packaging is a nightmare.

10

11

12

13

14

15

16

17

18

19

20

21

22

23

Pattern matching gives you information about the presence or absence, number, and
location of the model within an image. For example, you can search an image containing a
printed circuit board for one or more alignment marks, which are called fiducials. The
positions of the fiducials are used to align the board for placement of the chips by a chip
mounting device. You also can use pattern matching to locate key components in gauging
applications. In gauging applications, pattern matching locates key components and then
gauges the distance or angle between these objects. If the measurement falls within a
tolerance range, the part is considered good. If it falls outside the tolerance, the
component is rejected. In many applications, searching and finding a feature is the key
processing task that determines the success of the application.
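The search-then-gauge flow described above can be sketched in plain Python. This is not the NI Vision API; `match_template`, the fiducial layout, and the tolerance values are all invented for illustration. A brute-force normalized cross-correlation locates the model, and a gauging check compares a derived distance against a tolerance range.

```python
import numpy as np

def match_template(image, template):
    """Return (row, col) of the best normalized cross-correlation match.

    Brute-force search: slides the template over every position and
    scores the normalized correlation with the underlying patch.
    """
    th, tw = template.shape
    t = template - template.mean()
    best_score, best_pos = -np.inf, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            patch = image[r:r + th, c:c + tw]
            p = patch - patch.mean()
            denom = np.sqrt((p * p).sum() * (t * t).sum())
            score = (p * t).sum() / denom if denom else 0.0
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos

# Gauging: place one fiducial-like blob, find it, and check a tolerance.
image = np.zeros((40, 40))
image[10:14, 25:29] = 1.0          # bright 4x4 "fiducial"
template = np.zeros((6, 6))
template[1:5, 1:5] = 1.0           # model of the fiducial with a border
row, col = match_template(image, template)
distance = np.hypot(row, col)      # e.g. distance from image origin
print((row, col), "PASS" if 20 <= distance <= 40 else "FAIL")
```

A production pattern matcher would use pyramid search or geometric features rather than this exhaustive scan, but the pass/fail logic on the measured distance is the same idea.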

24

25

26

27

Change brute force to something else


Scale images to match.
Show example
In demo display template.

28

29

30

Add animation to add.

31

32

33

34

National Instruments Leadership Seminar

National Instruments CONFIDENTIAL

April, 2002

35

36


37

38

39

40

41

42

43

Get the text at the bottom

44

45

46

47

Avoid the term perspective

48

49

50

51

52

53

54

55

56

57

58

59

Image credit: http://en.wikipedia.org/wiki/File:Pi_30K.gif


Note the difficulty of programming software to use multiple threads/processes:
You have to send data between threads (which incurs a memory copy and synchronization jitter).
It is difficult to process the edges, because you have to special-case them. For small data sets, the overhead of processing the edges between processors is significant.
It is generally a manual process to take an existing single-threaded image processing algorithm and make it multicore friendly. Many software packages (including the Vision Development Module) do this for you.
Note also that some problems divide across multiple cores poorly (like computing pi iteratively), while others are easily massively parallel (like computing pi by generating random complex numbers in [-1-i, +1+i] and seeing how many fall within the unit circle).
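The contrast between the two pi computations can be made concrete with a small sketch in plain Python (function names are invented; this is unrelated to any vision package). The series-based estimate is inherently sequential, since each partial sum depends on the previous one, while every Monte Carlo sample is independent and splits trivially across workers.

```python
import random

def pi_iterative(n_terms):
    """Leibniz series for pi/4: inherently sequential, because each
    partial sum builds on the last, so it divides poorly across cores."""
    total = 0.0
    for k in range(n_terms):
        total += (-1) ** k / (2 * k + 1)
    return 4 * total

def pi_monte_carlo(n_samples, seed=0):
    """Dart-throwing estimate: every sample is independent, so the work
    splits trivially across any number of workers."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_samples):
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_samples

print(pi_iterative(100000), pi_monte_carlo(100000))
```

To parallelize the Monte Carlo version, each worker simply runs `pi_monte_carlo` with a different seed and the results are averaged; no data needs to cross between workers during the computation.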

60

FPGA: field-programmable gate array. Fundamentally, an FPGA is a semiconductor device which contains a large quantity of gates (logic devices) that are not interconnected, and whose function is determined by a wiring list downloaded to the FPGA. The wiring list determines how the gates are interconnected, and this interconnection is performed dynamically by turning semiconductor switches on or off to enable the different connections.
An analogy can be made to a printed circuit board with a large number of devices on it that are not connected. The wiring list then determines which nodes are connected. Traditionally, this could be done using physical wires soldered to the pins, or a wire-wrapping tool to connect the devices. If the wiring is embedded in the printed circuit board, it is fixed, i.e. it cannot be programmed.
Programming an FPGA requires a software tool in which the logic functions are defined. An analysis tool then verifies the logical functions and the expected timing of the signals in the device. A layout tool physically maps the logical devices to specific elements on the chip and determines their actual wiring.
FPGAs are attractive because the function of a piece of hardware can be updated in the field. Whereas microprocessors have a fixed hardware structure which then executes the software, on an FPGA the actual hardware is changed. It is fair to say that for each program written for an FPGA, a new piece of dedicated hardware is designed to execute that specific function. For this reason, FPGAs are highly efficient, especially when performing repetitive instructions such as those required for DSP or imaging. Another advantage is that FPGAs permit the design of a large number of parallel data paths, which can greatly increase performance for DSP and data-streaming applications. Current FPGAs have millions of gates, with up to 50 million gates per chip expected before 2005.

61

Software-Defined Hardware = FPGA

62

63

64

Latency: FPGAs for image processing have incredibly low latency (on the order of microseconds) when they are already in the image path. This allows for extremely tight control loops (laser tracking, in-flight defect rejection systems).
Jitter: FPGAs are extremely deterministic, since they don't have the overhead of other threads, an operating system, or interrupts. For many algorithms, you can know the exact execution time down to nanoseconds.
Raw computation power: For massively parallel computation or heavily pipelined math, the raw computation power of an FPGA can be an advantage over a CPU-based system.
Pipelining: For pixel-by-pixel operations (kernel operations, dilate, erode, edge finding, etc.), you can stack algorithms back-to-back with only marginal latency added.
Security: The image stays within the FPGA.
Weight / power / heat: An FPGA may consume 1-10 watts of power, while a CPU may easily consume 50-200 watts. This is particularly useful for extreme conditions (space, air, and underwater).
Complexity: Hardware programming is a significant departure from traditional software programming, and there is a non-trivial learning curve.
Clock rates are significantly slower on an FPGA (100-200 MHz) compared to CPUs (3.0+ GHz).
Floating point is difficult to do on an FPGA. This is mitigated somewhat by using fixed point, but that is beyond the scope of this presentation.

65

Image credit: http://www.eecg.toronto.edu/~vaughn/challenge/fpga_arch.html


Latency: FPGAs for image processing have incredibly low latency (on the order of microseconds), and you can be quite sure how long the processing will take each time. This is important because latency determines the time until a decision is made based on the image data. This allows for extremely tight control loops (laser tracking, in-air sorting machines, visual servoing).
Host-based vision starts processing after the image is transmitted.
FPGA vision can start on the first pixel and finish shortly after the last pixel.
Latency is reduced by almost one frame period.
Start of exposure to last result available: ~600 µs.

66


67

68

69


70

Visualization: The FPGA takes in an image, changes it (highlights edges or features of interest, masks out some features), and outputs it in hardware (reduced latency and jitter, low complexity in the deployed system, security).
High-speed control: The FPGA takes in an image and, based on some computational result, uses I/O to communicate with the outside world. With hardware in the loop, the FPGA computation reduces the cycle time and jitter (latency, jitter).
Image preprocessing: The FPGA takes in an image, changes it (highlights edges or features, runs a kernel through it, does bandwidth reduction, etc.), and DMAs it back to the host. This takes advantage of reduced jitter, raw processing power, and almost zero latency cost.
Co-processing: The host can use DMA to move an image from memory to the FPGA to take advantage of its raw processing power (but beware of memory / PCIe bandwidth).

71

Now let's discuss the three main use cases for FPGA vision. Keep in mind that this is intended to be a general overview of where we're going with FPGA vision; PXI and FlexRIO will not be appropriate for all of these applications.
The first use case is visualization, where the FPGA processes an incoming image in real time with the goal of enhancing it for display to human eyes. The FPGA can either output the data directly to a monitor or send the enhanced image up to the host for display.

72

Many of the application areas for real-time image enhancement are found in medical, military, and commercial settings. For example, security cameras may compress and encrypt the images they're sending back to the operators to reduce data and ensure the information stays private.
Instead of sending images to a lab for analysis, doctors could use LabVIEW FPGA in their medical devices to highlight certain medical features, like irregular cells, bone outlines, or tissue conditions, to diagnose and treat patients in real time.
Finally, new algorithms are emerging, such as the Retinex algorithm, that can drastically improve the visibility and quality of an image. Imagine a truck driver or a pilot who is able to view an enhanced image from an FPGA that seemingly removes the fog or other noise from his view.

73

Moving to an application area that's a little closer to home, image processing on LabVIEW
FPGA is particularly suited for high-speed, low-latency control applications. In these
applications, the time between when an image is acquired and an action is taken needs to
be fast and consistent. Often all the inspection and decision-making can be accomplished
on the FPGA with little or no CPU intervention. FlexRIO is particularly suited for these
types of applications and, as you will see later in the presentation, the I/O on the Camera
Link Adaptor Module for FlexRIO is designed with control in mind.

74

Applications for high-speed control include high-speed alignment, where one object needs to stay within a given position relative to another; more commonly, keeping a laser at a desired position.

Another example, which I mentioned previously, is high-speed sorting. From food products and rocks to manufactured goods and recycled garbage, there is a huge bottleneck in efficiently and quickly sorting items based on color, shape, size, texture, etc. Being able to acquire an image, process it, and output a result within the FPGA can speed up this process and result in more accurate sorting, so fewer good parts are rejected and fewer bad parts are accepted.

75

76

77

78

In visual servo control, the vision system not only provides guidance to the motion system but also provides continuous feedback to it. This continuous feedback eliminates the limitations of a vision-guided motion system. Remember, in a vision-guided motion system the vision system only provides guidance at the beginning of a task and no feedback after that, so the accuracy of the task depends entirely on the motion hardware. Hence, a highly precise application requires very expensive, highly precise and accurate motion hardware.

Let's take a look at how a visual servo control system might work.
*Click for Animation* In this animation, the vision system continuously captures images of the actuator and the targeted part during the move until the move is complete. These captured images are used to provide feedback on the success of the move. With this feedback you can improve the accuracy and precision of your existing automation, and you can use lower-cost robots for high-accuracy applications.

To explain how visual servo control works, let's go back to our centralized processing architecture for vision-guided motion. *Click for Animation* Here, if we remove the trajectory generator and have the vision system provide position setpoints directly to the position loop, we have a basic visual servo implementation.

This approach, where the vision system generates the position setpoints for the motion system, is called the dynamic look-and-move approach. The performance requirements of such an implementation can be met using fast real-time processors and/or an FPGA implementation. It is recommended to customize the implementation to your application; customization removes the performance overhead of a generic solution.

Let's also look at another, more advanced type of visual servo control. Here the standard feedback, for example encoder feedback, is replaced by visual feedback. Typically you have two cameras. One is an overhead camera that takes images of the actuator and the target and uses them to generate the position setpoints. The second camera is mounted on the actuator; it is also called an eye-in-hand camera. This camera and the visual processing path are part of the feedback loop. Hence, customization and optimization are key to getting the performance needed to close the loop. It is recommended to focus the camera on a smaller area to reduce the number of pixels that must be processed, and to optimize by looking for standard image features. High-speed real-time processing and FPGA solutions truly make these implementations viable.
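The dynamic look-and-move idea can be illustrated with a toy simulation in plain Python. Everything here is hypothetical — the gain, the tracking-error model, and the function names are invented, and a real system would run the vision measurement and the position loop on separate hardware. The point is only that per-frame visual feedback shrinks the residual error even when the motion hardware tracks setpoints imperfectly.

```python
def visual_servo_step(measured_pos, target_pos, gain=0.5):
    """One 'dynamic look-and-move' iteration: the vision system measures
    where the actuator actually is and emits the next position setpoint
    for the motion controller's position loop."""
    error = target_pos - measured_pos
    return measured_pos + gain * error

# Simulate: the actuator tracks setpoints imperfectly, but visual feedback
# keeps correcting, so the residual error shrinks every camera frame.
pos, target = 0.0, 10.0
for frame in range(20):
    setpoint = visual_servo_step(pos, target)
    pos = pos + 0.9 * (setpoint - pos)   # motion hardware with 10% tracking error
print(abs(target - pos))
```

Without the visual feedback, the 10% tracking error in this toy model would persist in the final position; with it, the error decays geometrically toward zero over successive frames.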

The third application category for processing images on FPGAs involves image preprocessing and offloading. In this category, the FPGA works in conjunction with a CPU to process images. When preprocessing images, the image data travels through the FPGA, which modifies or enhances the data before sending it to the host for further processing and analysis.
Co-processing implies that the image data is sent to the FPGA from the CPU instead of a camera. This scenario is most common for post-processing large batches of images once they have been acquired. In this case the FPGA acts in conjunction with the CPU to quickly process images.
Of the two scenarios, image preprocessing will be the most common approach for LabVIEW FPGA for the near future.

84

Some examples of image preprocessing can be found in both the medical and manufacturing industries.
One of the most exciting examples is using FPGAs to boost the speed and efficiency of Optical Coherence Tomography (OCT). OCT is a technique for obtaining sub-surface images of translucent or opaque materials at a resolution equivalent to a low-power microscope. It is effectively optical ultrasound, imaging reflections from within tissue to provide cross-sectional images. OCT is attracting interest in the medical community because it provides tissue morphology imagery at much higher resolution (better than 10 µm) than other imaging modalities such as MRI or ultrasound. A typical OCT system uses a linescan camera and a special light source that sweep across tissue and image the surface beneath, one line at a time. As each line is acquired, the data is scaled and converted to the frequency domain, where it is further manipulated and combined with other lines to reveal a high-resolution 3D picture of the tissue. Several large NI Vision customers are already using LabVIEW to map human tissue (especially the retina in human eyes) and have expressed interest in speeding up their processes with FPGAs.
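The per-line processing just described — scale, transform to the frequency domain, manipulate — can be sketched roughly as follows. This is a simplified NumPy illustration, not an actual OCT implementation; the windowing choice and scaling factor are assumptions, and real systems do dispersion compensation and resampling that are omitted here.

```python
import numpy as np

def process_oct_line(raw_line, scale=1.0):
    """Per-line OCT pipeline sketch: scale the spectrometer samples,
    transform to the frequency domain, and keep the magnitude, which
    corresponds to reflectivity versus depth."""
    scaled = raw_line * scale
    spectrum = np.fft.rfft(scaled * np.hanning(len(scaled)))  # window, then FFT
    return np.abs(spectrum)

# A synthetic interference fringe: one dominant frequency should produce
# one dominant peak (a single reflecting layer at one depth).
n = 256
line = np.cos(2 * np.pi * 20 * np.arange(n) / n)
depth_profile = process_oct_line(line)
print(int(np.argmax(depth_profile)))  # peak at frequency bin 20
```

Because every line is processed identically and independently, this per-line FFT pipeline is exactly the kind of repetitive, streaming work that maps well onto an FPGA.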
Moving to industrial inspection, many applications today use brute force to check for defects over large and continuous areas. Such a system can be seen here from Basler; it inspects sheets of glass for flaws. The system first corrects for lighting variations across the glass, then thresholds the image to extract any particles on the sheets. It then analyzes the particles to determine the type and severity of the defects. To keep up with production, the system uses 9 linescan cameras going to 9 individual PCs. Trying to program, synchronize, and coordinate one inspection over 9 PCs is a challenge that could be avoided by simply connecting all 9 cameras to 9 FPGA-enabled plug-in boards inside 1 PC or PXI chassis.

85

Just to set the record straight, not every vision function is appropriate for an FPGA, and even the ones that can be ported to an FPGA will not be available for a while. This means we need to make sure the applications we're targeting lend themselves to the FPGA IP and capabilities that are fairly straightforward with LabVIEW FPGA. These functions are often used to preprocess an image, enhancing it before sending it to the host CPU. The list includes:
Pixel-based operations like thresholding, color analysis, and shading correction;
Line-based operations like 1D FFTs and edge detection;
Region-based operations like Bayer decoding and filtering; and finally,
Image-based operations such as warp, flip, rotate, add, and subtract.

86

The FPGA is harder to use for these applications.

Brute-force pattern matching is possible, but geometric matching is less likely: the low-level stages that extract contours or geometric curves can be implemented, but not the higher-level logic that reasons about how these curves form a shape and searches for them.

There can still be benefits to preprocessing images on the FPGA and using them in conjunction with CPUs or other processors that are better suited for these high-level tasks.

87

It turns out that the graphical system design approach is ideal for programming FPGAs as well. Their inherent ability to execute code in parallel and their dependence on the flow of data make representation in a graphical dataflow system an intuitive and powerful tool. For example, commonly required FPGA application elements such as event counters, analog I/O, and DMA are easily reduced to simple diagram nodes. This drastically improves productivity.

88

89

I'll start by saying that much of the functionality you take for granted with a standard framegrabber must be created explicitly when programming an FPGA.
This slide shows some of the logic that drives the acquisition of pixel data. The upper data path maps bits on the Camera Link connectors to pixel data. This example outputs one pixel per clock cycle, but it can be adapted to output multiple pixels per clock for cameras with multiple taps. The CL to PIX VI sends data on every clock cycle, even when valid pixels are not present.
The lower data path in this VI is responsible for controlling the timing of image acquisition. The acquisition state machine monitors the camera timing signals and generates image timing to be used by subsequent VIs. The state machine signals the start of each frame and line and when pixel data is actually valid, and it ensures processing does not start in the middle of a frame. The acquisition window input allows the user to select only a portion of the image sent from the camera.
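The behavior of such an acquisition state machine can be approximated in a host-side sketch. This is hypothetical Python, not LabVIEW FPGA code, and the signal names are invented: the machine waits for a clean inter-frame gap so that processing never starts mid-frame, then passes through only the pixels flagged valid.

```python
def run_acquisition(signals):
    """signals: iterable of (frame_valid, line_valid, pixel) tuples,
    one per clock cycle. Yields pixels only after a clean frame start
    has been observed, mimicking the acquisition state machine."""
    state = "WAIT_FRAME_GAP"
    for fval, lval, pixel in signals:
        if state == "WAIT_FRAME_GAP":
            if not fval:                     # wait until we are between frames
                state = "WAIT_FRAME_START"
        elif state == "WAIT_FRAME_START":
            if fval:                         # a frame begins cleanly here
                state = "IN_FRAME"
        if state == "IN_FRAME":
            if not fval:
                state = "WAIT_FRAME_START"   # frame ended
            elif lval:
                yield pixel                  # valid pixel within a line

# A stream that starts mid-frame: those first pixels must be discarded.
stream = [(1, 1, 9), (1, 1, 9),              # mid-frame garbage
          (0, 0, 0),                         # inter-frame gap
          (1, 1, 1), (1, 0, 0), (1, 1, 2),   # clean frame: pixels 1, 2
          (0, 0, 0)]
print(list(run_acquisition(stream)))
```

On the FPGA this logic runs once per clock cycle in hardware; the sketch just makes the state transitions and the "never start mid-frame" guarantee explicit.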

90

Let's jump ahead for a moment and talk about transferring image data to the host. Although the focus of this presentation is on FPGA processing, not using an FPGA as a basic framegrabber, many applications will require at least some image data to be transferred to the host.
This example shows one method for transferring data. The image transfer VI receives pixel data and timing information from the acquisition VI mentioned earlier. The data is sent to the host system via a DMA FIFO, which we monitor for overflow. For cameras that send only 1 or 2 pixels per clock, we can utilize the DMA transfer efficiently by packing multiple pixels into larger data words over several clock cycles.
Notice also that this loop must run at the camera pixel rate or faster to ensure the acquisition FIFO does not overflow.
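The pixel-packing trick can be illustrated with a small sketch, with plain Python standing in for the FPGA logic. The word width, pixel depth, and LSB-first packing order are assumptions for illustration, not a description of the actual VI.

```python
def pack_pixels(pixels, pixels_per_word=4, bits_per_pixel=8):
    """Pack narrow pixels into wider words so each DMA transfer carries
    more data -- the technique described for 1- and 2-pixel-per-clock
    cameras. Pixels are placed LSB-first within each word."""
    words, acc, count = [], 0, 0
    for p in pixels:
        acc |= (p & ((1 << bits_per_pixel) - 1)) << (count * bits_per_pixel)
        count += 1
        if count == pixels_per_word:
            words.append(acc)
            acc, count = 0, 0
    if count:                       # flush a partial word at end of line
        words.append(acc)
    return words

# Four 8-bit pixels fill one 32-bit word; the fifth starts a new word.
print([hex(w) for w in pack_pixels([0x11, 0x22, 0x33, 0x44, 0x55])])
```

Packing four 8-bit pixels per 32-bit word means one DMA write every four clock cycles instead of every cycle, which is what lets the transfer loop keep up with the camera pixel rate.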

91

Thresholding is a very common vision operation. The function looks at each pixel in the image. All pixels within a user-defined range are set to 1; these are generally objects of interest. All pixels outside the range are set to 0 (background).
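As a minimal illustration of that rule, here is a NumPy sketch — not the NI Vision threshold function, just the same in-range/out-of-range logic applied to a toy image:

```python
import numpy as np

def threshold(image, lo, hi):
    """Binary threshold: pixels inside [lo, hi] become 1 (objects of
    interest); everything else becomes 0 (background)."""
    return ((image >= lo) & (image <= hi)).astype(np.uint8)

image = np.array([[10, 120, 200],
                  [90, 130, 40]])
print(threshold(image, 100, 180))   # only 120 and 130 survive
```

Because the decision for each pixel is independent of every other pixel, thresholding maps naturally onto an FPGA pipeline that processes one pixel per clock.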

92

Stepping up a level in complexity are 2D kernel operations. Common examples are filtering
and edge extraction. For these operations, the value of a pixel depends on the values of
other neighboring pixels. Here we have one example, which happens to be a smoothing or
low pass filter.
The pixel being processed is at the center of the M x N kernel, or 3 x 3 in this case. The
sum of all pixels is computed, with each pixel weighted by the corresponding value in the
kernel. Finally, the pixel sum is divided by the sum of the elements in the kernel.
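A rough host-side sketch of this kernel operation (pure NumPy, border handling simplified to "leave untouched"; this is not the actual VI):

```python
import numpy as np

def convolve_3x3(image, kernel):
    """Weighted-neighborhood sum for each interior pixel, divided by the
    sum of the kernel elements -- the low-pass filter described above.
    Border pixels are left untouched for simplicity."""
    out = image.astype(float).copy()
    ksum = kernel.sum()
    for r in range(1, image.shape[0] - 1):
        for c in range(1, image.shape[1] - 1):
            patch = image[r - 1:r + 2, c - 1:c + 2]
            out[r, c] = (patch * kernel).sum() / ksum
    return out

kernel = np.ones((3, 3))            # simple smoothing (box) kernel
image = np.zeros((5, 5))
image[2, 2] = 9.0                   # one bright pixel
print(convolve_3x3(image, kernel)[2, 2])   # 9 / 9 = 1.0: the spike is smoothed
```

On an FPGA the same computation is pipelined with line buffers so that a new output pixel is produced every clock cycle once the pipeline is full.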

93

Is IP readily available, or can you generate it yourself?

94

95

Now that you're somewhat familiar with the applications and opportunities that FPGA vision opens up, let's talk about the actual products we are building to address these applications.
The first product to release will be a Camera Link Adaptor Module for FlexRIO. We are showing it at NIWeek and plan to release it later this year. The module will provide 2 Camera Link connectors, which means you can connect a single base, medium, or full-configuration Camera Link camera and reach data rates into the FPGA of 850 MB/s. Keep in mind that the current FlexRIO module is based on PXI, not PXIe, so you won't be able to stream images to the host at those speeds.
With that in mind, early applications will probably either reduce the data before sending it to the host (preprocessing) or analyze and output a result entirely from the FPGA (high-speed control). Laser alignment and tracking are good examples of applications where FlexRIO seems especially suited.
Along with the hardware, the FlexRIO Adaptor Module will include a few LabVIEW FPGA examples showing how to acquire data from a few different cameras, manipulate the data into pixels, lines, etc., process the images, and access the I/O.
It's very important to realize that FlexRIO and FPGA-based image processing are not a drop-in replacement for NI-IMAQ framegrabbers and NI Vision.

96

97

98
