
IMPLEMENTATION OF MINI-PACS IN HEALTH CARE SYSTEMS

Chapter 1

LITERATURE SURVEY
An intensive study of related areas was undertaken to gain a clear understanding of the entire project work.

1.1 Motivation

The survey of the following concepts has made working on the project much easier. It has generated a great deal of enthusiasm for learning about a system that has gained importance in the field of medical imaging. The project is all the more inspiring since it is linked to the lives of many patients, provides real-time collaboration among physicians, and markedly enhances productivity and patient coordination.

1.2 Medical Images

In this project the PACS deals mainly with medical image management and with displaying an image at the destination system exactly as it was acquired at the place of acquisition. To transfer large medical images, each image must first be compressed and then transferred using a suitable protocol. Hence there is a need to study the storage size of medical images, the different compression standards that reduce large images, and the protocol that supports the format.

The size of large medical images often exceeds the display area of the user's output device. To present such images appropriately, sophisticated image browsing techniques have to be developed. Digital images are described as bitmaps formed of individual pixels. The semantic content, or structural information, is not preserved in this representation; as a result, images cannot be revised. Digital images result from either real-world capture or computer generation. They can be captured from the real world


through scanning or the use of a digital camera. Computer generation can be performed

with the use of a paint program, screen capture, or the conversion of a graphic into a

bitmap image.

The motivation for the compression of medical images is illustrated through the use of Table 1.2.1, which shows the storage size, transmission bandwidth, and

transmission time needed for various types of uncompressed images. It is clear from these values that images require considerable storage space, large transmission bandwidths, and long

transmission times. With the present state of technology, the only solution is to compress

images before their storage and transmission. Then, at the receiver end, the compressed

images can be decompressed.

Type         Image Size    Bits/Pixel   Uncompressed Size   Transmission Bandwidth   Transmission Time (approx.)
Gray scale   512 x 512     8 bpp        262 KB              2.1 Mb/image             1 min 13 s
Color        512 x 512     24 bpp       786 KB              6.29 Mb/image            3 min 39 s
Medical      2048 x 2048   12 bpp       5.16 MB             41.3 Mb/image            23 min 54 s

Table 1.2.1: Storage size and transmission bandwidth for various images
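As a check on the arithmetic behind these figures, the sketch below (Python, written for this discussion) computes the uncompressed size and transmission time for each image type. The 28.8 kbit/s line rate is an assumption that reproduces the quoted times for the gray scale and color rows; the medical row of the table evidently uses slightly different size assumptions.

```python
# Sketch of the arithmetic behind Table 1.2.1. The 28.8 kbit/s line rate is an
# assumption; it reproduces the quoted times for the gray scale and color rows.
def image_cost(width, height, bpp, line_bps=28_800):
    bits = width * height * bpp          # uncompressed size in bits
    kbytes = bits / 8 / 1024             # ... in kilobytes
    seconds = bits / line_bps            # transmission time over the line
    return kbytes, bits, seconds

for name, w, h, bpp in [("Gray scale", 512, 512, 8),
                        ("Color", 512, 512, 24),
                        ("Medical", 2048, 2048, 12)]:
    kb, bits, sec = image_cost(w, h, bpp)
    print(f"{name:10s} {kb:7.0f} KB  {bits / 1e6:5.2f} Mbit  "
          f"{int(sec // 60)} min {sec % 60:.0f} s")
```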

The JPEG standard has been in use for almost a decade now. It has proved a

valuable tool during all these years, but it cannot fulfill the advanced requirements of

today. JPEG uses the Discrete Cosine Transform (DCT)-based method. With the

continual expansion of multimedia and Internet applications, the needs and requirements

of the technologies used grew and evolved. Today’s digital imagery is extremely

demanding, not only from the quality point of view, but also from the image size aspect.

Current medical image sizes cover several orders of magnitude, ranging from less than 100 KB to high-quality scanned images of approximately 40 GB. The JPEG 2000

international standard represents advances in image compression technology where the



image coding system is optimized not only for efficiency, but also for scalability and

interoperability in network and mobile environments. Digital imaging has become an

integral part of the Internet, and JPEG2000 is a powerful new tool that provides advanced capabilities for designers and users of networked image applications. After compression there is a need to transfer the image; the JPEG2000 standard uses the JPIP protocol for image browsing. As the PACS system involves compression of medical images and their transmission over the Web, the JPEG2000 technique fulfills the needs of the system.

1.3 Preamble

In Medical Imaging, PACS (Picture Archiving and Communication System) is an

integrated system of digital products and technology allowing for acquisition, storage,

retrieval, and display of radiographic images.

The key components of PACS are modality interfaces, a network backbone, a

database management system, an image management system, a long-term archive, and

diagnostic and clinical workstations. A PACS includes interfaces with the hospital

information system (HIS) and radiology information system (RIS). A web server,

allowing Internet access, is also a strategic component of PACS.

The medical images are stored in a vendor-independent format. The most common

format for image storage is DICOM (Digital Imaging and Communications in Medicine).

Digital Imaging and Communications in Medicine (DICOM) is a comprehensive set of

standards for handling, storing and transmitting information in medical imaging. It

includes a file format definition and a network communication protocol.

The goal of the project is to compress the medical images using the JPEG2000 format. JPEG2000 uses the wavelet transform for compression and retrieval; the compressed image is stored on the server, and the compressed images are transferred over the network for viewing through the JPIP protocol.

PACS replaces hard-copy based means of managing medical images, such as film archives. It expands on the possibilities of such conventional systems by providing capabilities for off-site viewing and reporting (tele-education, tele-diagnosis). Additionally, it enables practitioners at various physical locations to peruse the same information simultaneously. With the ever-decreasing price of digital storage, PACS systems are overwhelmingly cost-effective.

1.4 Problem Formulation

Despite the increased use of imaging modalities that produce cross-sectional images, such as computed tomography (CT), ultrasound (US) and magnetic resonance (MR), which generally supply images in digital format, conventional radiology examinations still represent 70% of the examinations carried out in a radiology department. With such a system there is no reduction in time or work; the existing problems persist, and a greater risk of errors is associated with them.

The success of any healthcare service is dependent on the efficient use and sharing of patient information. The problem statement is: “Diagnostic images are frequently lost, misplaced, and unread. Diagnostic images and information are not available everywhere, and the cost involved is high, which leads to a decrease in customer service and efficiency.” There is a requirement for image processing, handling, and display, in particular preserving the original color and motion of diagnostic images.

In pursuance of the above goals, we have decided to implement a system in which the medical images are compressed, manipulated, and metadata are added to the diagnosed portion of the image before storage; the images are then transferred from the place of acquisition, via the storage system, to the station for further diagnosis, under “Implementation of mini-PACS in Healthcare Systems”.


1.5 Scope of the Project

Medical imaging is important and widespread in the diagnosis of disease. In

certain situations, however, the particular manner in which the images are made available

to physicians and their patients introduces obstacles to timely and accurate diagnosis of

disease. These obstacles generally relate to the fact that each manufacturer of a medical

imaging system uses different and proprietary formats to store the images in digital form.

This means, for example, that images from a scanner manufactured by General Electric

Corp. are stored in a different digital format compared to images from a scanner

manufactured by Siemens Medical Systems. Further, images from different imaging

modalities such as ultrasound and MRI are stored in formats different from each other. In

practice, viewing of medical images typically requires a different proprietary

"workstation" for each manufacturer and for each modality.

In principle, medical images could be converted to Internet web pages for

widespread viewing. Several technical limitations of current Internet standards, however,

create a situation where straightforward processing of the image data results in images

which transfer across the Internet too slowly, lose diagnostic information, or both. One

such limitation is the bandwidth of current Internet connections, which, because of the large size of medical images, results in transfer times that are unacceptably long. The

problem of bandwidth can be addressed by compressing the image data before transfer,

but compression typically involves loss of diagnostic information. In addition, due to the

size of the images the time required to process image data from an original format to a

format which can be viewed by Internet browsers is considerable, meaning that systems

designed to create web pages "on the fly" introduce a delay of seconds to minutes while

the person requesting to view the images waits for the data to be processed. Workstations

allow images to be reordered or placed "side-by-side" for viewing but again an Internet


system would have to create new web pages "on the fly" which would introduce further

delays. Finally, diagnostic interpretation of medical images requires that the images be presented with appropriate brightness and contrast. On proprietary workstations these parameters can be adjusted by the person viewing the images, but control of image brightness and contrast are not features of current Internet standards (HTTP or HTML).

Chapter 2


PICTURE ARCHIVING AND COMMUNICATION


SYSTEMS (PACS)

2.1 Introduction to PACS

PACS (Picture Archiving and Communication Systems) are high-speed, graphical,

computer network systems for the storage, retrieval, and display of radiologic images.

PACS are electronic medical image management systems. They consist of image display

systems, archiving systems, networks, and interfaces, presenting one unified system to the

user.

Picture - Digital diagnostic image (radiological)

Archiving - Electronic storage & retrieval

Communication - Computer network (multiple access)

Systems - Control of the processes (integrated technology)

Figure 2.1.1: PACS system

The images are acquired, compressed, archived and retrieved over a network for

diagnosis and review by physicians. These images can be interpreted and viewed at

workstations, which can also double as archive stations for image storage. The

introduction of client/server computing, improved digital imaging and computer network


technologies, along with the advancement of the DICOM and HL7 standards, have put PACS alongside radiology information systems (RIS) as an ideal solution for managing radiological images.

One of the main benefits that PACS provides is timely and efficient access to images, interpretations and related data throughout the

organization. This helps to ease consultations between physicians who can now

simultaneously access the same images over networks, leading to a better diagnosis

process. It is also beneficial to physicians in emergency situations, as they need not wait

for long periods in order to view a patient’s radiological images as these are instantly

available on the network when ready. Another feature of PACS is the ability to digitally

enhance the images, providing more detailed and sharper images. This improves

diagnostic capabilities at radiological examinations.

The high costs of PACS have led to vendors offering mini-PACS, a cheap alternative for organizations that cannot afford the cost of a full PACS system or that seek to implement some form of digital image management system but would rather start off with something small. While PACS are considered to be at a minimum hospital-wide, mini-PACS usually tend to be departmental-based (radiology, emergency room, or orthopedics). Mini-PACS are easy to maintain and cheap to repair, and they can gradually be upgraded to a fully functioning hospital-wide PACS.

2.2 Types of PACS

Full PACS handle images from various modalities, such as ultrasonography,

radiography, magnetic resonance imaging, positron emission tomography, computed

tomography and plain X-rays. Small-scale systems that handle images from a single


modality (usually connected to a single acquisition device) are sometimes called mini-PACS.

Figure 2.2.1: The entire PACS system. Acquisition modalities (CT, MRI, ultrasound, laser scanner) feed a central image database, which serves workstations and review stations.


Typically a PACS network consists of a central server which stores a database containing the images. This server is connected to one or more clients via a local area network (LAN). The PACS is connected to an interface engine and receives orders for diagnostic studies, which it then matches to image sets coming into the PACS from the digital modalities (CT, CR, MR, etc.) via DICOM (a digital imaging communications


standard) in order to ensure that all images are associated with the right patient. To

process these order messages successfully, the PACS must receive from the RIS

admission/discharge/transfer messages about patients. Finally, the PACS receives electronically signed reports from the RIS, which it then archives with the images so that reports and images may be retrieved and displayed concurrently. PACS workstations are used for scanning image films into the system, printing image films from the system, and the interactive display of digital images; they offer means of manipulating the images (crop, rotate, zoom, brightness, contrast and others).

2.3 Storage

Image storage and communication can be based on either a centralized or a distributed architecture. In a centralized storage system, all the acquired images are forwarded to a central archive system to which every modality or workstation is attached on a point-to-point basis. A distributed architecture, by contrast, is composed of linked local storage subsystems or file servers. Each server has its own short-term storage unit (usually a small RAID), one or more image acquisition modalities, and several diagnostic/review workstations. Each of these architectures has its own advantages and disadvantages. However, distributed storage architecture has been found suitable for large-scale PACS and centralized architecture for mini-PACS.

Storage Media: PACS storage devices should hold gigabytes of data with relatively efficient access times. Research continues to improve PACS by providing storage media that can hold many images and offer quick access times. A PACS needs at least two levels of archive (short-term and long-term). Images should be retrievable from the short-term archive in 2 seconds; images from the long-term archive should take no more than 3 minutes to retrieve.


Figure 2.3.1: Storage subsystem in a distributed large scale PACS

Examples of storage media that can be used for PACS archiving include:

• Redundant array of inexpensive disks (RAID) for immediate access to current images.

• Magnetic disks for faster retrieval of cached images.

• Erasable magneto-optical disks for temporary long-term archive.

• Write once read many (WORM) disks in the optical disk library, which constitute the permanent archive.

• Recently developed digital versatile discs (DVD-ROM) for low-cost permanent archive.

• Digital linear tapes for backup storage.


2.4 Multiple Modality Interface

First generation modality interfaces were achieved primarily through video acquisition techniques, whereas second generation interfaces made use of ACR-NEMA V2, proprietary digital and early DICOM storage interfaces. Such interfaces allowed for the exportation of image information into the PACS. The “double entry” problem, that of entering one set of patient information into the RIS and an incorrect variant of that same information into the modality, was not significantly addressed by first or second generation systems. Some second generation modality interface gateways (“protocol converters”) made use of DICOM modality worklist interfaces to rationalize the RIS and modality patient information. However, such interfaces were the exceptions to the rule.

Modality image acquisition, as a technical problem in PACS, has largely been

solved by the widespread implementation of the DICOM standard and the availability of

image “protocol converters,” used to interface pre-DICOM modalities. The modality

interface problem facing third generation PACS is that of truly solving the “double entry”

problem and improving workflow. Although most, if not all, modalities manufactured

today support the DICOM standard, they support only the DICOM storage SOP portion

of the standard. A smaller number of modalities support DICOM worklist management

SOP or DICOM detached study management SOP, which are required to solve the

“double entry” problem; even fewer modalities manufactured today support the DICOM

modality performed procedure step SOP, which is used to track the status of image

acquisition workflow.

The next generation modality interfaces (“protocol converters”) will have to

address these informatics problems. Such devices will have to provide for the acquisition

of images, the rationalization of the patient information with the RIS patient information,

and the reporting of modality workflow status. These interfaces will provide a complete


DICOM interface to the modality. In the case of some modalities, these interfaces will be

supplied as a part of the modality itself.

The best examples of such modality interfaces can be found in the world of computed radiography. These interfaces present a modality worklist to the operator, associate records from the worklist with the digitized images, provide image quality control, transmit the digitized images to multiple destinations, print images to both local and network printers, and report modality workflow status.

2.5 Application Level

The Integrated Medical Application (IMA) gives the physician access to medical

services and functionality from a single graphical desktop. The desktop offers services for

the local and remote access to electronic patient records of hospitals, specialist clinics,

and general practitioners. The distributed patient records within a hospital or practice are

logically combined by means of a meta-patient record. A meta-patient record provides

information about the local multimedia patient documents. In addition to basic data such

as document type, location and creation date, the record provides information concerning

the document structure. The information supplied by the meta-record from each practice

and hospital can be combined to form the complete, virtual patient record. The

management of and access to each record and each document are carried out by the Open

Distributed Management System. An object-oriented model has been used throughout.

Each local implementation of the record may be different, depending on the facilities of

the local environment, but each presents the same external interface to the outside. This

mechanism enables the scalable integration over a wide area of all patient documents.

The central components of the IMA provide functionality for the transparent

access of local and remote documents obtained through selected meta-records services,

the navigation in the patient record, the visualization of multimedia documents, and the


processing of images. Other tools that have been integrated into the graphical desktop include advanced image processing capabilities, e.g. for the quantification of stenosis, communications services such as email, text processing, desktop conferencing, and access to external information sources such as the World Wide Web.

The applications of PACS are as follows:

• Rapid access to critical information to decrease exam-to-diagnosis time. This is

especially useful in emergency and operating rooms.

• Elimination of film, handling and storage costs.

• Images can be easily shared between reading radiologists, other physicians and

medical records.

• Radiologists can access soft-copy images instantly after acquisition to expedite diagnosis and reporting at almost any available workstation.

• Web servers can be used to cost-effectively share images with other departments and even referring physicians across town. They can access the images using the Internet or the local intranet.

• Hardcopy films or paper printouts can be made when needed for traditional

archiving or the provision of images to other departments.

Images can be archived at secure locations using database servers that manage the transfer, retrieval and storage of images and relevant information; the archive provides permanent image storage.


2.6 Network

Topology refers to the way the network is laid out physically or logically. Two or more devices connect to a link, and two or more links form a topology. Five basic topologies are possible: bus, star, tree, mesh and ring.

PACS communication networks enable the movement of medical data between

modality imaging devices, gateway computers, PACS server, display workstations,

remote locations for diagnosis and consultation and other Hospital information systems

like HIS/RIS.

The most commonly used network technologies in building PACS networks are:

• Ethernet, based on the IEEE 802.3 standard and the Carrier Sense Multiple Access with Collision Detection (CSMA/CD) protocol. It is suitable for LANs and can operate at 10 Mbits/sec on half-inch coaxial cable, twisted pair wire or fiber optic cables.

• FDDI, which can be used for medium speed communication. It runs on optical fiber at 100 Mbits/sec over a distance of 200 km with up to 1000 stations connected.

• ATM, which can be used to combine LAN and WAN applications. ATM is a virtual circuit-oriented packet switching network with transmission speeds ranging from 51.84 Mbits/sec to 2.5 Gbits/sec.

Conceptually three main types of networks are used to transport radiology images:

• A LAN linking imaging devices, data storage units and display devices within one

departmental area.

• A larger LAN for intra-hospital transport linking departments,

• A tele-radiology network for transmission of images to other hospitals in the

region or to and from remote sites for diagnosis at a distance.


2.7 Digital Imaging and Communications in Medicine

2.7.1 Medical Imaging Standard

The Digital Imaging and Communications in Medicine (DICOM) standard was

created by the National Electrical Manufacturers Association (NEMA) to aid the

distribution and viewing of medical images, such as CT scans, MRIs, and ultrasound.

Digital Imaging and Communications in Medicine (DICOM) is a comprehensive

set of standards for handling, storing and transmitting information in medical imaging. It

includes a file format definition and a network communications protocol. This protocol is

an application protocol; it uses TCP/IP to communicate between systems. DICOM files

can be exchanged between two entities that have the capability to receive the information

- image and patient data - in DICOM format.

DICOM was developed to enable integration of scanners, servers, workstations

and network hardware from multiple vendors into a picture archiving and communication

system. The different machines, servers and workstations come with DICOM

conformance statements which clearly state the DICOM classes supported by them.

DICOM has been widely adopted by hospitals and is making inroads in smaller

applications like dentists' and doctors' offices. DICOM is categorized into two different transmissions: DICOM Store and DICOM Print. DICOM Store is a format to send to a

PACS System or Workstation. DICOM Print is a format to send to a DICOM Printer,

normally to print an "X-Ray" film. Most vendors require individual licenses to perform

these types of transmissions. This standard provides the sender image quality control over

the image being sent.

Despite the diversity of sources for digital medical imaging (CR, digital X-ray detectors, MRI, CT, PET, SPECT, ultrasound, etc.), all modalities are usually encoded in DICOM format. DICOM stands for Digital Imaging and Communications in Medicine, and it is a format accepted worldwide by medical users and vendors of medical image

computer applications for the processing of medical image studies, and has eased the set

up of collaborative environments, where medical images, preserving their clinical value,

can be understood by different computer applications.

2.7.2 Why was DICOM created?

Equipment makers used to feel confident in providing data interchange and communication support because they pushed clients to buy equipment from the same company.

Imagine that we wanted to exchange a picture from a computerized tomography (CT) scanner with another user who was using a radiotherapy planning system. We might then have had to rewrite all the software code in the planning system so that it would be able to read the picture. The same would happen if we wanted to update the CT system with a picture from the planning system. DICOM was created to solve these problems of compatibility.

2.7.3 DICOM Standard

This Standard, now designated Digital Imaging and Communications in Medicine

(DICOM), embodies a number of major enhancements -

a. It is applicable to a networked environment. DICOM supports operation in

a networked environment using industry standard networking protocols

such as OSI and TCP/IP.

b. It specifies how devices claiming conformance to the Standard react to

commands and data being exchanged. DICOM specifies, through the


concept of Service Classes, the semantics of commands and associated

data.

c. It specifies levels of conformance. DICOM explicitly describes how an

implementer must structure a Conformance Statement to select specific

options.

d. It is structured as a multi-part document. This facilitates evolution of the

Standard in a rapidly evolving environment by simplifying the addition of

new features. ISO directives which define how to structure multi-part

documents have been followed in the construction of the DICOM

Standard.

e. It introduces explicit Information Objects not only for images and graphics

but also for studies, reports, etc.

f. It specifies an established technique for uniquely identifying any

Information Object. This facilitates unambiguous definitions of

relationships between Information Objects as they are acted upon across

the network.

2.7.4 Overview of DICOM Standard

The contents of the DICOM standard go far beyond a definition of an exchange

format for medical image data. DICOM defines

• Data structures (formats) for medical images and related data,

• Network oriented services, e. g.

o image transmission,

o query of an image archive (PACS),


o print (hardcopy), and

o RIS - PACS - modality integration

• Formats for storage media exchange, and

• Requirements for conforming devices and programs.

2.7.5 DICOM data structures

A DICOM image consists of a list of data elements (so-called attributes) which

contain a multitude of image related information:

• Patient information (name, sex, identification number),

• Modality and imaging procedure information (device parameters, calibration,

radiation dose, contrast media), and

• Image information (resolution, windowing).

For each modality, DICOM precisely defines the data elements that are required, optional (i.e. may be omitted) or required under certain circumstances (e.g. only if contrast media was used). This powerful flexibility is, at the same time, one crucial weakness of the DICOM standard, because practical experience shows that image objects are frequently incomplete. In such objects, required fields are missing or contain incorrect values, which can lead to subsequent problems when exchanging data.
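To make this attribute structure concrete, the sketch below uses the pydicom library (a third-party tool, not part of the standard itself) to build a data set containing a few of the elements listed above; every value is invented for the example.

```python
# Illustrative only: a DICOM data set as a list of attributes (data elements),
# built with the pydicom library. All values below are invented.
from pydicom.dataset import Dataset

ds = Dataset()
ds.PatientName = "DOE^JANE"     # patient information
ds.PatientSex = "F"
ds.PatientID = "000123"
ds.Modality = "CT"              # modality / imaging procedure information
ds.Rows = 512                   # image information (resolution)
ds.Columns = 512
ds.WindowCenter = 40            # windowing
ds.WindowWidth = 400

# Printing the data set shows each element with its (group, element) tag,
# value representation and value.
print(ds)
```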

2.7.6 DICOM network services

The DICOM network services are based on the client/server concept. In case two

DICOM applications want to exchange information, they must establish a connection and

agree on the following parameters:


• which application is the client and which is the server,

• which DICOM services are to be used, and

• in which format data is transmitted (e.g. compressed or uncompressed).

Only if both applications agree on a common set of parameters can the connection be established. In addition to the most basic DICOM service, "image transmission" (in DICOM terminology, the "Storage Service Class"), there are a number of advanced services, e.g.:

• The DICOM image archive service ("Query/Retrieve Service Class") allows images in a PACS archive to be searched by certain criteria (patient, time of creation of the images, modality, etc.) and selectively downloaded from the archive.

• The DICOM print service ("Print Management Service Class") allows laser cameras or printers to be accessed over a network, so that multiple modalities and workstations can share one printer.

• The DICOM modality worklist service allows up-to-date worklists, including a patient's demographic data, to be downloaded automatically from an information system (HIS/RIS) to the modality.
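As a concrete illustration of this negotiation, the sketch below uses the pynetdicom library to request the simplest DICOM service, Verification (C-ECHO). The host, port and AE title are placeholder assumptions, and a Verification SCP is assumed to be listening at the other end.

```python
# A minimal sketch of a DICOM association using the pynetdicom library.
# Host, port and AE title are assumptions; a Verification SCP must be listening.
from pynetdicom import AE

ae = AE(ae_title="DEMO_SCU")               # we act as the client (SCU)
# Propose the Verification SOP Class (UID 1.2.840.10008.1.1) for negotiation.
ae.add_requested_context("1.2.840.10008.1.1")

assoc = ae.associate("127.0.0.1", 11112)   # connect and agree on parameters
if assoc.is_established:
    status = assoc.send_c_echo()           # the most basic DICOM request
    print("C-ECHO status: 0x%04X" % status.Status)   # 0x0000 means success
    assoc.release()                        # close the association cleanly
else:
    print("Association rejected, aborted or never connected")
```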

2.7.7 Media exchange

In addition to the exchange of medical images over a network, media exchange has become another focus which has been integrated into the DICOM standard. Fields of application are, for example, the storage of cardiac angiography films in cardiology or the storage of ultrasound images. In order to make sure that DICOM storage media are really interchangeable, the standard defines so-called "application profiles" which explicitly define

• which images, from which modalities, may be present on the medium (e.g. only X-ray angiography images),

• which encoding formats and compression schemes may be used (e.g. only uncompressed or lossless JPEG), and

• which storage medium is to be used (e.g. CD-R with an ISO file system).

Apart from the image files, each DICOM medium contains a so-called "DICOM

directory". This directory contains the most important information (patient name,

modality, unique identifiers etc.) for all images which are captured on the medium. With

the help of this directory, it is possible to quickly browse or search through all images on the medium without having to read the complete image files, which would, for instance, take a couple of minutes when reading from a CD.

2.7.8 DICOM File Format

A single DICOM file contains both a header (which stores information about the

patient's name, the type of scan, image dimensions, etc.), as well as all of the image data

(which can contain information in three dimensions). DICOM image data can be

compressed (encapsulated) to reduce the image size. Files can be compressed using lossy

or lossless variants of the JPEG format.

DICOM files consist of a header with standardized as well as free-form fields and

a body of image data. A single DICOM file can contain one or more images, allowing

storage of volumes and/or animations. Image data can be compressed using the JPEG standard.

DICOM differs from other data formats in that it groups information together into

a data set. That is, an X-Ray of your chest is in the same file as your patient ID, so that


the image is never mistakenly separated from your information. It also mandates the

presence of a media directory, the DICOMDIR file that provides index and summary

information for all the DICOM files on the media.

DICOM restricts the filenames on DICOM media to 8 character names

(sometimes 8.3). This is a common source of problems with media created by developers

that did not read the specifications carefully. This is a historical requirement to maintain

compatibility with older existing systems. The DICOMDIR information provides

substantially greater information about each file than any filename could, so there is less

need for meaningful file names.
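To make the header-plus-pixel-data layout concrete, the sketch below reads a DICOM file with the pydicom library; the filename is hypothetical, and decoding the pixel data additionally requires NumPy.

```python
# A minimal sketch of reading a DICOM file with pydicom. "image.dcm" is a
# hypothetical filename; ds.pixel_array additionally requires NumPy.
import pydicom

ds = pydicom.dcmread("image.dcm")

# Header elements: patient and scan information stored alongside the image.
print(ds.PatientName, ds.PatientID, ds.Modality)
print(ds.Rows, "x", ds.Columns, "at", ds.BitsAllocated, "bits")

# Image data: decoded (decompressed, if encapsulated) into a NumPy array.
pixels = ds.pixel_array
print(pixels.shape, pixels.dtype)
```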


Chapter 3

WAVELET TRANSFORM

3.1 Wavelet Transform

The transform of a signal is just another form of representing the signal. It does

not change the information content present in the signal. The Wavelet Transform provides

a time-frequency representation of the signal. It was developed to overcome the shortcoming of the Short Time Fourier Transform (STFT), which can also be used to analyze non-stationary signals. While the STFT gives a constant resolution at all frequencies, the Wavelet Transform uses a multi-resolution technique by which different frequencies are

analyzed with different resolutions.

A wave is an oscillating function of time or space and is periodic. In contrast,

wavelets are localized waves. They have their energy concentrated in time or space and

are suited to analysis of transient signals. While Fourier Transform and STFT use waves

to analyze signals, the Wavelet Transform uses wavelets of finite energy.

Figure 3.1.1: (a) a wave (b) a wavelet

The wavelet analysis is done in a manner similar to the STFT analysis. The signal to be

analyzed is multiplied with a wavelet function just as it is multiplied with a window


function in STFT, and then the transform is computed for each segment generated.

However, unlike STFT, in Wavelet Transform, the width of the wavelet function changes

with each spectral component. The Wavelet Transform, at high frequencies, gives good

time resolution and poor frequency resolution, while at low frequencies, the Wavelet

Transform gives good frequency resolution and poor time resolution.

3.2 The Continuous Wavelet Transform and the Wavelet Series

The Continuous Wavelet Transform (CWT) is given by equation (3.2.1), where x(t) is the signal to be analyzed and y(t) is the mother wavelet or the basis function. All the wavelet functions used in the transformation are derived from the mother wavelet through translation (shifting) and scaling (dilation or compression).

X_WT(τ, s) = (1/√|s|) ∫ x(t) · y*((t − τ)/s) dt        (3.2.1)

The mother wavelet used to generate all the basis functions is designed based

on some desired characteristics associated with that function. The translation parameter τ

relates to the location of the wavelet function as it is shifted through the signal. Thus, it

corresponds to the time information in the Wavelet Transform. The scale parameter s is

defined as |1/frequency| and corresponds to frequency information. Scaling either dilates

(expands) or compresses a signal. Large scales (low frequencies) dilate the signal and provide global information about the signal, while small scales (high frequencies) compress the signal and reveal the detailed information hidden in the signal. The Wavelet

Transform merely performs the convolution operation of the signal and the basis function.

The above analysis becomes very useful as in most practical applications, high

frequencies (low scales) do not last for a long duration, but instead, appear as short bursts,

while low frequencies (high scales) usually last for entire duration of the signal.
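A direct numerical evaluation of equation (3.2.1) makes the roles of the translation τ and the scale s concrete. The sketch below is a plain Riemann-sum implementation written for this discussion (not an optimized CWT), using the Mexican hat function as the mother wavelet y(t); since it is real-valued, the conjugation in (3.2.1) can be omitted.

```python
import numpy as np

def cwt(x, t, wavelet, scales, taus):
    """Direct Riemann-sum evaluation of equation (3.2.1):
    X_WT(tau, s) = 1/sqrt(|s|) * integral of x(t) * y((t - tau)/s) dt,
    for a real-valued mother wavelet y."""
    dt = t[1] - t[0]
    out = np.empty((len(scales), len(taus)))
    for i, s in enumerate(scales):
        for j, tau in enumerate(taus):
            out[i, j] = np.sum(x * wavelet((t - tau) / s)) * dt / np.sqrt(abs(s))
    return out

def mexican_hat(t):
    # Second derivative of a Gaussian; a real-valued mother wavelet.
    return (1.0 - t**2) * np.exp(-t**2 / 2.0)

t = np.linspace(0.0, 1.0, 1000)
x = np.sin(2*np.pi*5*t) + np.sin(2*np.pi*40*t)   # two-tone test signal
scales = np.geomspace(0.005, 0.2, 30)            # small scales = high frequencies
coeffs = cwt(x, t, mexican_hat, scales, t[::20])
print(coeffs.shape)                              # (30, 50): scale x translation
```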


The Wavelet Series is obtained by discretizing CWT. This aids in computation of

CWT using computers and is obtained by sampling the time-scale plane. The sampling

rate can be changed accordingly with scale change without violating the Nyquist

criterion. The Nyquist criterion states that the minimum sampling rate that allows reconstruction of the original signal is 2ω radians per second, where ω is the highest frequency in the

signal. Therefore, as the scale goes higher (lower frequencies), the sampling rate can be

decreased thus reducing the number of computations.

3.3 The Discrete Wavelet Transform

The Wavelet Series is just a sampled version of CWT and its computation may

consume a significant amount of time and resources, depending on the resolution required.

The Discrete Wavelet Transform (DWT), which is based on sub-band coding, is found to

yield a fast computation of Wavelet Transform. It is easy to implement and reduces the

computation time and resources required.

The foundations of DWT go back to 1976 when techniques to decompose discrete

time signals were devised. Similar work was done in speech signal coding which was

named sub-band coding. In 1983, a technique similar to sub-band coding was

developed which was named pyramidal coding. Later many improvements were made to

these coding schemes which resulted in efficient multi-resolution analysis schemes.

In CWT, the signals are analyzed using a set of basis functions which relate to

each other by simple scaling and translation. In the case of DWT, a time-scale

representation of the digital signal is obtained using digital filtering techniques. The

signal to be analyzed is passed through filters with different cutoff frequencies at different

scales.

3.4 Classification of wavelets

We can classify wavelets into two classes: (a) orthogonal and (b) biorthogonal.

Based on the application, either of them can be used.

(a) Features of orthogonal wavelet filter banks –

The coefficients of orthogonal filters are real numbers. The filters are of the same length

and are not symmetric. The low pass filter, G0, and the high pass filter, H0, are related to

each other by

H0(z) = z^(-N) G0(-z^(-1))        (3.4.1)

The two filters are alternating flips of each other. The alternating flip automatically gives double-shift orthogonality between the low pass and high pass filters, i.e., the scalar product of the filters for a shift by two is zero:

∑ G[k] H[k − 2l] = 0,

where k, l ∈ Z. Perfect reconstruction is possible with the alternating flip. Also, for perfect

reconstruction, the synthesis filters are identical to the analysis filters except for a time

reversal. Orthogonal filters offer a high number of vanishing moments. This property is

useful in many signal and image processing applications. They have regular structure

which leads to easy implementation and scalable architecture.
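These relations are easy to verify numerically. The sketch below takes the Daubechies-4 low pass coefficients, derives the high pass filter by the alternating flip (the time-domain form of equation 3.4.1), and checks double-shift orthogonality; the helper function is written for this illustration.

```python
import numpy as np

# Daubechies-4 (db2) low pass analysis coefficients.
g = np.array([0.48296291, 0.83651630, 0.22414387, -0.12940952])
N = len(g) - 1

# Alternating flip: h[k] = (-1)^k * g[N - k].
h = np.array([(-1) ** k * g[N - k] for k in range(N + 1)])

def double_shift_dot(a, b, l):
    """sum_k a[k] * b[k - 2l], with out-of-range coefficients taken as zero."""
    return sum(a[k] * b[k - 2 * l]
               for k in range(len(a)) if 0 <= k - 2 * l < len(b))

# Double-shift orthogonality: the scalar product vanishes for every shift l.
for l in (-1, 0, 1):
    print(l, round(double_shift_dot(g, h, l), 12))   # prints 0.0 for each l
```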

(b) Features of biorthogonal wavelet filter banks -

In the case of the biorthogonal wavelet filters, the low pass and the high pass filters do not

have the same length. The low pass filter is always symmetric, while the high pass filter

could be either symmetric or anti-symmetric. The coefficients of the filters are either real

numbers or integers.

For perfect reconstruction, biorthogonal filter bank has all odd length or all even

length filters. The two analysis filters can be symmetric with odd length or one symmetric

and the other anti-symmetric with even length. Also, the two sets of analysis and

synthesis filters must be dual. The linear phase biorthogonal filters are the most popular

filters for data compression applications.


3.5 Wavelet Families

There are a number of basis functions that can be used as the mother wavelet for

Wavelet Transformation. Since the mother wavelet produces all wavelet functions used in

the transformation through translation and scaling, it determines the characteristics of the

resulting Wavelet Transform. Therefore, the details of the particular application should be

taken into account and the appropriate mother wavelet should be chosen in order to use

the Wavelet Transform effectively. Figure 4.5.1 illustrates some of the commonly used

wavelet functions. Haar wavelet is one of the oldest and simplest wavelet. Therefore, any

discussion of wavelets starts with the Haar wavelet. Daubechies wavelets are the most

popular wavelets. They represent the foundations of wavelet signal processing and are

used in numerous applications. These are also called Maxflat wavelets as their frequency

responses have maximum flatness at frequencies 0 and π. This is a very desirable property

in some applications.

Figure 3.5.1: Wavelet families (a) Haar (b) Daubechies4 (c) Coiflet1 (d) Symlet2 (e) Meyer

(f) Morlet (g) Mexican Hat.


The Haar, Daubechies, Symlets and Coiflets are compactly supported orthogonal

wavelets. These wavelets along with Meyer wavelets are capable of perfect

reconstruction. The Meyer, Morlet and Mexican Hat wavelets are symmetric in shape.

The wavelets are chosen based on their shape and their ability to analyze the signal in a

particular application.

3.6 DWT and Filter Banks

3.6.1 Multi-Resolution Analysis using Filter Banks:

Filters are one of the most widely used signal processing functions. Wavelets can be realized by iteration of filters with rescaling. The resolution of the signal, which is a measure of the amount of detail information in the signal, is determined by the filtering operations, and the scale is determined by upsampling and downsampling (subsampling) operations. The DWT is computed by successive low pass and high pass filtering of the discrete time-domain signal, as shown in Figure 3.6.1. This is called the Mallat algorithm or Mallat-tree decomposition. Its significance is in the manner in which it connects the continuous-time multiresolution to discrete-time filters. In the figure, the signal is denoted by the sequence x[n], where n is an integer. The low pass filter is denoted by G0 while the high pass filter is denoted by H0. At each level, the high pass filter produces detail information, d[n], while the low pass filter associated with the scaling function produces coarse approximations, a[n].

At each decomposition level, the half band filters produce signals

spanning only half the frequency band. This doubles the frequency

resolution as the uncertainty in frequency is reduced by half. In



accordance with Nyquist's rule, if the original signal has a highest frequency of ω, which requires a sampling frequency of 2ω radians, then after half band low pass filtering it has a highest frequency of ω/2 radians. It can now be sampled at a frequency of ω radians, thus discarding half the samples

with no loss of information.

Figure 3.6.1: Three-level wavelet decomposition tree. At each level the input is filtered by the high pass filter H0, producing detail coefficients d1[n], d2[n], d3[n], and by the low pass filter G0, producing a coarser approximation; each filter output is decimated by 2.

This decimation by 2 halves

the time resolution as the entire signal is now represented by only half

the number of samples. Thus, while the half band low pass filtering

removes half of the frequencies and thus halves the resolution, the

decimation by 2 doubles the scale.

With this approach, the time resolution becomes arbitrarily good at high

frequencies, while the frequency resolution becomes arbitrarily good at low frequencies.

The filtering and decimation process is continued until the desired level is reached. The

maximum number of levels depends on the length of the signal. The DWT of the original


signal is then obtained by concatenating all the coefficients, a[n] and d[n], starting from

the last level of decomposition. Figure 3.6.2 shows the reconstruction of the original signal from the wavelet coefficients.

Figure 3.6.2: Three-level wavelet reconstruction tree. The detail coefficients d1[n], d2[n], d3[n] and the approximation a3[n] are upsampled by 2, passed through the synthesis filters H1 and G1, and summed at each level to recover x[n].

Basically, the reconstruction is the reverse process of

decomposition. The approximation and detail coefficients at every level are upsampled by

two, passed through the low pass and high pass synthesis filters and then added. This

process is continued through the same number of levels as in the decomposition process

to obtain the original signal. The Mallat algorithm works equally well if the analysis

filters, G0 and H0, are exchanged with the synthesis filters, G1 and H1.
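The decomposition and reconstruction trees of Figures 3.6.1 and 3.6.2 can be demonstrated in a few lines. The sketch below implements the Mallat algorithm with the Haar filter pair, chosen here only for brevity, and verifies that three levels of analysis followed by synthesis recover the original signal.

```python
import numpy as np

s = 1.0 / np.sqrt(2.0)
g0, h0 = np.array([s, s]), np.array([s, -s])   # Haar low pass / high pass

def dwt_level(x):
    """One Mallat analysis step: filter, then decimate by 2 (Figure 3.6.1)."""
    a = np.convolve(x, g0)[1::2]   # coarse approximation a[n]
    d = np.convolve(x, h0)[1::2]   # detail d[n]
    return a, d

def idwt_level(a, d):
    """One synthesis step, inverting the Haar analysis step exactly."""
    x = np.empty(2 * len(a))
    x[0::2] = s * (a - d)          # even samples
    x[1::2] = s * (a + d)          # odd samples
    return x

x = np.random.default_rng(0).standard_normal(64)

a, details = x, []
for _ in range(3):                 # three-level decomposition tree
    a, d = dwt_level(a)
    details.append(d)
# The DWT: the last approximation followed by d3, d2, d1.
coeffs = np.concatenate([a] + details[::-1])

for d in reversed(details):        # three-level reconstruction tree
    a = idwt_level(a, d)
print(np.allclose(a, x))           # True: the original signal is recovered
```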

3.6.2 Conditions for Perfect Reconstruction

In most Wavelet Transform applications, it is required that the original signal be

synthesized from the wavelet coefficients. To achieve perfect reconstruction the analysis

and synthesis filters have to satisfy certain conditions. Let G0(z) and G1(z) be the low pass

analysis and synthesis filters, respectively and H0(z) and H1(z) the high pass analysis and

synthesis filters respectively. Then the filters have to satisfy the following two conditions

G0(-z) G1(z) + H0(-z) H1(z) = 0        (3.6.1)

G0(z) G1(z) + H0(z) H1(z) = 2z^(-d)        (3.6.2)


The first condition implies that the reconstruction is aliasing-free and the second

condition implies that the amplitude distortion has amplitude of one. It can be observed

that the perfect reconstruction condition does not change if we switch the analysis and

synthesis filters.
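Both conditions can be checked numerically by polynomial arithmetic: writing each filter as its coefficient sequence in z^(-1), products become convolutions, and forming P(-z) negates every odd-indexed coefficient. The sketch below verifies the conditions for the Haar filter bank; the synthesis pair is one valid choice made for this illustration.

```python
import numpy as np

# Polynomials in z^-1 as coefficient arrays [c0, c1, ...].
s = 1.0 / np.sqrt(2.0)
G0, H0 = np.array([s, s]), np.array([s, -s])    # Haar analysis pair
G1, H1 = np.array([s, s]), np.array([-s, s])    # a matching synthesis pair

def neg_z(p):
    """p(z) -> p(-z): negate every odd-indexed coefficient."""
    return p * (-1.0) ** np.arange(len(p))

anti_alias = np.convolve(neg_z(G0), G1) + np.convolve(neg_z(H0), H1)
no_distort = np.convolve(G0, G1) + np.convolve(H0, H1)

print(anti_alias)   # [0. 0. 0.]  -> condition (3.6.1): aliasing cancelled
print(no_distort)   # [0. 2. 0.]  -> condition (3.6.2): 2z^-1, a pure delay (d = 1)
```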

There are a number of filters which satisfy these conditions. But not all of them

give accurate Wavelet Transforms, especially when the filter coefficients are quantized.

The accuracy of the Wavelet Transform can be determined after reconstruction by

calculating the Signal to Noise Ratio (SNR) of the signal. Some applications like pattern

recognition do not need reconstruction, and in such applications, the above conditions

need not apply.

3.7 Applications of Wavelets

There is a wide range of applications for Wavelet Transforms. They are applied in

different fields ranging from signal processing to biometrics, and the list is still growing.

One of the prominent applications is the FBI fingerprint compression standard.

Wavelet Transforms are used to compress the fingerprint pictures for storage in their data

bank. The previously chosen Discrete Cosine Transform (DCT) did not perform well at

high compression ratios. It produced severe blocking effects which made it impossible to

follow the ridge lines in the fingerprints after reconstruction. This did not happen with

Wavelet Transform due to its property of retaining the details present in the data.

In DWT, the most prominent information in the signal appears in high amplitudes

and the less prominent information appears in very low amplitudes. Data compression can

be achieved by discarding these low amplitudes. The wavelet transform enables high compression ratios with good quality of reconstruction. At present, the application of wavelets for image compression is one of the hottest areas of research. The Wavelet


Transforms have been chosen for the JPEG 2000 compression standard. Figure 3.7.1

shows the application of wavelets in signal processing.

Figure 3.7.1: Signal processing application using the Wavelet Transform (input signal → wavelet transform → processing → inverse wavelet transform → output signal).

Chapter 4

IMPLEMENTATION OF JPEG2000

4.1 Embedded Zero-Tree Wavelet Transform

EZW (Embedded Zerotrees of Wavelet Transforms) is a lossy image compression

algorithm. At low bit rates (i.e. high compression ratios) most of the coefficients

produced by a sub-band transform (such as the wavelet transform) will be zero, or very

close to zero. This occurs because "real world" images tend to contain mostly low

frequency information (highly correlated). However, where high frequency information does occur (such as at edges in the image), it is particularly important in terms of human

perception of the image quality, and thus must be represented accurately in any high

quality coding scheme.

By considering the transformed coefficients as a tree (or trees) with the lowest

frequency coefficients at the root node and with the children of each tree node being the

spatially related coefficients in the next higher frequency sub-band, there is a high

probability that one or more sub-trees will consist entirely of coefficients which are zero

or nearly zero; such sub-trees are called zero-trees. Because of this tree structure, we use the terms node and

coefficient interchangeably, and when we refer to the children of a coefficient, we mean

the child coefficients of the node in the tree where that coefficient is located. We use

children to refer to directly connected nodes lower in the tree and descendants to refer to

all nodes which are below a particular node in the tree, even if not directly connected.

In zero-tree based image compression schemes such as EZW and SPIHT, the intent

is to use the statistical properties of the trees in order to efficiently code the locations of

the significant coefficients. Since most of the coefficients will be zero or close to zero, the

spatial locations of the significant coefficients make up a large portion of the total size of

a typical compressed image. A coefficient (likewise a tree) is considered significant if its

magnitude (or magnitudes of a node and all its descendants in the case of a tree) is above

a particular threshold. By starting with a threshold which is close to the maximum

coefficient magnitudes and iteratively decreasing the threshold, it is possible to create a

compressed representation of an image which progressively adds finer detail. Due to the

structure of the trees, it is very likely that if a coefficient in a particular frequency band is

insignificant, then all its descendants (the spatially related higher frequency band

coefficients) will also be insignificant.

EZW uses four symbols to represent (a) a zero-tree root, (b) an isolated zero (a

coefficient which is insignificant, but which has significant descendants), (c) a significant

positive coefficient and (d) a significant negative coefficient. The symbols may be thus

represented by two binary bits. The compression algorithm consists of a number of

iterations through a dominant pass and a subordinate pass; the threshold is updated

(reduced by a factor of two) after each iteration. The dominant pass encodes the

significance of the coefficients which have not yet been found significant in earlier

iterations, by scanning the trees and emitting one of the four symbols. The children of a

coefficient are only scanned if the coefficient was found to be significant, or if the

coefficient was an isolated zero. The subordinate pass emits one bit (the most significant

bit of each coefficient not so far emitted) for each coefficient which has been found

significant in the previous significance passes. The subordinate pass is therefore similar to

bit-plane coding.

There are several important features to note. Firstly, it is possible to stop the

compression algorithm at any time and obtain an approximation of the original image; the

greater the number of bits received, the better the image.

Figure 4.1.1: Block diagram of EZW. The acquired image is assigned an initial threshold T0; dominant and subordinate passes alternate, and the threshold is halved after each iteration until the final threshold is reached, after which the result is displayed.

Secondly, due to the way in

which the compression algorithm is structured as a series of decisions, the same

algorithm can be run at the decoder to reconstruct the coefficients, but with the decisions

being taken according to the incoming bit stream. In practical implementations, it would

be usual to use an entropy code such as arithmetic coding to further improve the

performance of the dominant pass. Bits from the subordinate pass are usually random

enough that entropy coding provides no further coding gain.

4.1.1 Implementation of EZW Algorithm



The Embedded Zerotree algorithm is a simple yet powerful algorithm having the property that the bits in the stream are generated in order of their importance. The first step in this algorithm is setting up an initial threshold.

Any coefficient in the wavelet decomposition is said to be significant if its absolute value is

greater than the threshold. In a hierarchical sub-band system, every coefficient is spatially

related to a coefficient in the lower band. Such coefficients in the higher bands are called

‘descendants’. This is shown in Figure 4.1.2.

Figure 4.1.2: Hierarchical Sub-Band System

If a coefficient is significant and positive, then it is coded as ‘positive significant’

(ps). If a coefficient is significant and negative, then it is coded as ‘negative significant’

(ns). If a coefficient is insignificant and all its descendants are insignificant as well, then

it is coded as ‘zero tree root’ (ztr). If a coefficient is insignificant but not all of its descendants are insignificant, then it is coded as ‘isolated zero’ (iz). The algorithm involves

two passes – Dominant pass and Subordinate pass.

In the dominant pass, the initial threshold is set to one half of the maximum pixel

value. Subsequent passes have threshold values one half of the previous threshold. The

coefficients are then coded as ps, ns, iz or ztr according to their values. The important part


is that if a coefficient is a zerotree root, then the descendants need not be encoded. Thus

only the significant values are encoded.

In the subordinate pass, those coefficients which were found significant in the

dominant pass are quantized based on the pixel value. In the first pass, the threshold is

half of the maximum magnitude, so the interval is divided into two and the subordinate

pass codes a 1 if the coefficient is in the upper half of the interval and a 0 if the coefficient is in the lower half of the interval. Thus, if the number of passes is increased, the precision of the coefficient is increased. This is the first algorithm that is implemented in

the compression coding.
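The dominant-pass classification can be sketched directly from these definitions. The code below, written for this discussion, classifies coefficients of a toy 8 x 8 array as ps, ns, ztr or iz. It uses the simple quadtree rule in which the children of (i, j) sit at (2i, 2j), (2i, 2j+1), (2i+1, 2j) and (2i+1, 2j+1); full EZW treats the coarsest LL band specially, which this sketch glosses over.

```python
import numpy as np

def descendants(i, j, n):
    """All descendants of (i, j) in an n x n array under a simplified EZW
    quadtree (full EZW handles the coarsest LL band differently)."""
    out = []
    for ci in (2 * i, 2 * i + 1):
        for cj in (2 * j, 2 * j + 1):
            if ci < n and cj < n and (ci, cj) != (i, j):  # (0,0) maps to itself
                out.append((ci, cj))
                out.extend(descendants(ci, cj, n))
    return out

def dominant_symbol(c, i, j, T):
    """Classify one coefficient for the dominant pass at threshold T."""
    if abs(c[i, j]) >= T:
        return "ps" if c[i, j] >= 0 else "ns"         # significant
    if any(abs(c[di, dj]) >= T for di, dj in descendants(i, j, c.shape[0])):
        return "iz"                                   # isolated zero
    return "ztr"                                      # zero-tree root

coeffs = np.random.default_rng(1).integers(-63, 64, size=(8, 8)).astype(float)
T = np.abs(coeffs).max() / 2    # initial threshold: half the maximum magnitude
for pos in [(0, 0), (0, 1), (1, 1), (4, 4)]:
    print(pos, dominant_symbol(coeffs, pos[0], pos[1], T))
```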

4.2 Huffman Coding

In computer science and information theory, Huffman coding is an entropy

encoding algorithm used for lossless data compression. The term refers to the use of

a variable-length code table for encoding a source symbol (such as a character in a file)

where the variable-length code table has been derived in a particular way based on the

estimated probability of occurrence for each possible value of the source symbol. It was

developed by David A. Huffman while he was a Ph.D. student at MIT, and published in

the 1952 paper "A Method for the Construction of Minimum-Redundancy Codes".

Huffman coding uses a specific method for choosing the representation for each

symbol, resulting in a prefix code (sometimes called "prefix-free codes", that is, the bit

string representing some particular symbol is never a prefix of the bit string representing

any other symbol) that expresses the most common characters using shorter strings of bits

than are used for less common source symbols. Huffman was able to design the most

efficient compression method of this type: no other mapping of individual source symbols

to unique strings of bits will produce a smaller average output size when the actual


symbol frequencies agree with those used to create the code. A method was later found to

do this in linear time if input probabilities (also known as weights) are sorted.

Figure 4.2.1: Huffman Coding

4.2.1 Basic Technique

The technique works by creating a binary tree of nodes. These can be stored in a
regular array, the size of which depends on the number of symbols, n. A node can be
either a leaf node or an internal node. Initially, all nodes are leaf nodes, which contain
the symbol itself, the weight (frequency of appearance) of the symbol and optionally, a
link to a parent node which makes it easy to read the code (in reverse) starting from a leaf
node. Internal nodes contain symbol weight, links to two child nodes and the optional link
to a parent node. As a common convention, bit '0' represents following the left child and
bit '1' represents following the right child. A finished tree has up to n leaf nodes and n − 1 internal nodes. A Huffman tree that omits unused symbols produces optimal code lengths. The process essentially begins with the leaf nodes containing the probabilities of the symbols they represent. A new node whose children are the two nodes with the smallest probabilities is then created, such that the new node's probability is equal to the sum of its children's probabilities. With the previous two nodes merged into one (and thus no longer considered) and the new node now under consideration, the procedure is repeated until only one node remains: the root of the Huffman tree.

The simplest construction algorithm uses a priority queue where the node with lowest
probability is given highest priority:

1. Create a leaf node for each symbol and add it to the priority queue.

2. While there is more than one node in the queue:

i. Remove the two nodes of highest priority (lowest probability) from


the queue


ii. Create a new internal node with these two nodes as children and
with probability equal to the sum of the two nodes' probabilities.

iii. Add the new node to the queue.

3. The remaining node is the root node and the tree is complete.

Since efficient priority queue data structures require O(log n) time per insertion, and a
tree with n leaves has 2n−1 nodes, this algorithm operates in O(n log n) time.
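As a concrete illustration, the following MATLAB sketch builds a Huffman code for the nine-symbol alphabet used later in this chapter. For simplicity a node is represented only by the set of leaf indices beneath it, and the priority queue is replaced by re-sorting on every merge; the resulting code lengths are nevertheless those of a true Huffman code.

    % Sketch: Huffman code construction by repeatedly merging the two
    % lightest nodes. A node is the list of leaf indices beneath it.
    symb  = {'SPACE','A','B','E','G','I','L','S','T'};
    p     = [1 1 1 1 1 1 2 1 1] / 10;              % symbol probabilities
    codes = repmat({''}, 1, numel(symb));          % code string per symbol
    nodes = num2cell(1:numel(symb));               % initial leaf nodes

    while numel(nodes) > 1
        [p, order] = sort(p);  nodes = nodes(order);       % lightest first
        for k = nodes{1}, codes{k} = ['0' codes{k}]; end   % left branch bit
        for k = nodes{2}, codes{k} = ['1' codes{k}]; end   % right branch bit
        nodes = [{[nodes{1} nodes{2}]}, nodes(3:end)];     % merge the pair
        p     = [p(1) + p(2), p(3:end)];
    end

    for k = 1:numel(symb), fprintf('%-5s : %s\n', symb{k}, codes{k}); end

The average code length produced this way is within one bit of the source entropy, as is guaranteed for any Huffman code.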

If the symbols are sorted by probability, there is a linear-time (O(n)) method to create
a Huffman tree using two queues, the first one containing the initial weights (along with
pointers to the associated leaves), and combined weights (along with pointers to the trees)
being put in the back of the second queue. This assures that the lowest weight is always
kept at the front of one of the two queues:

1. Start with as many leaves as there are symbols.

2. Enqueue all leaf nodes into the first queue (by probability in increasing order, so that the least likely item is at the head of the queue).

3. While there is more than one node in the queues:

i. Dequeue the two nodes with the lowest weight by examining the
fronts of both queues.

ii. Create a new internal node, with the two just-removed nodes as
children (either node can be either child) and the sum of their weights as
the new weight.

iii. Enqueue the new node into the rear of the second queue.

4. The remaining node is the root node; the tree has now been generated.
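The queue discipline itself can be seen in the short MATLAB sketch below, which tracks only the node weights; a full implementation would carry pointers to the associated subtrees alongside each weight, as described above. Because merged weights are appended to the second queue in nondecreasing order, the two lightest nodes are always found at the fronts of the two queues.

    % Two-queue sketch (weights only, leaves pre-sorted).
    w  = sort([1 1 1 1 1 1 2 1 1] / 10);   % sorted leaf weights
    q1 = w;                                % queue 1: unmerged leaves
    q2 = [];                               % queue 2: merged weights
    while numel(q1) + numel(q2) > 1
        pair = zeros(1, 2);
        for j = 1:2                        % dequeue the smaller front, twice
            if isempty(q2) || (~isempty(q1) && q1(1) <= q2(1))
                pair(j) = q1(1);  q1(1) = [];
            else
                pair(j) = q2(1);  q2(1) = [];
            end
        end
        q2(end+1) = pair(1) + pair(2);     % enqueue the merged node's weight
    end
    disp(q2)                               % single remaining weight: 1.0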

It is generally beneficial to minimize the variance of codeword length. For example, a


communication buffer receiving Huffman-encoded data may need to be larger to deal with especially long codewords if the tree is especially unbalanced. To minimize variance,
simply break ties between queues by choosing the item in the first queue. This
modification will retain the mathematical optimality of the Huffman coding while both
minimizing variance and minimizing the length of the longest character code.

4.2.2 Main Properties

The probabilities used can be generic ones for the application domain that are
based on average experience, or they can be the actual frequencies found in the text being
compressed. (This variation requires that a frequency table or other hint as to the


encoding must be stored with the compressed text; implementations employ various tricks
to store tables efficiently.)

Huffman coding is optimal when the probability of each input symbol is a


negative power of two. Prefix codes tend to have slight inefficiency on small alphabets,
where probabilities often fall between these optimal points. "Blocking", or expanding the
alphabet size by coalescing multiple symbols into "words" of fixed or variable-length
before Huffman coding, usually helps, especially when adjacent symbols are correlated
(as in the case of natural language text). The worst case for Huffman coding can happen
when the probability of a symbol exceeds 2^−1 = 0.5, making the upper limit of inefficiency
unbounded. These situations often respond well to a form of blocking called run-length
encoding; for the simple case of Bernoulli processes, Golomb coding is a provably
optimal run-length code.

Arithmetic coding produces slight gains over Huffman coding, but in practice
these gains have seldom been large enough to offset arithmetic coding's higher
computational complexity and patent royalties.

4.2.3 Applications

Arithmetic coding can be viewed as a generalization of Huffman coding; indeed,


in practice arithmetic coding is often preceded by Huffman coding, as it is easier to find
an arithmetic code for a binary input than for a nonbinary input. Also, although arithmetic
coding offers better compression performance than Huffman coding, Huffman coding is
still in wide use because of its simplicity, high speed and lack of encumbrance by patents.

Huffman coding today is often used as a "back-end" to some other compression


method. DEFLATE and multimedia codecs such as JPEG and MP3 have a front-end
model and quantization followed by Huffman coding.

4.3 Arithmetic Coding

Arithmetic coding is a compression technique that encodes data (the data string)

by creating a code string which represents a fractional value on the number line between

0 and 1. The coding algorithm is symbol wise recursive; i.e., it operates upon and encodes

(decodes) one data symbol per iteration or recursion. On each recursion, the algorithm


successively partitions an interval of the number line between 0 and 1, and retains one of

the partitions as the new interval. Thus, the algorithm successively deals with smaller

intervals, and the code string, viewed as a magnitude, lies in each of the nested intervals.

The data string is recovered by using magnitude comparisons on the code string to

recreate how the encoder must have successively partitioned and retained each nested

subinterval. Arithmetic coding differs considerably from the more familiar compression

coding techniques, such as prefix (Huffman) codes.

4.3.1 Compression systems

The notion of compression systems captures the idea that data may be transformed

into something which is encoded, then transmitted to a destination, then transformed back

into the original data. Any data compression approach, whether employing arithmetic

coding, Huffman codes, or any other coding technique, has a model which makes some

assumptions about the data and the events encoded.

The code itself can be independent of the model. Some systems which compress

waveforms (e.g., digitized speech) may predict the next value and encode the error. In this

model the error and not the actual data is encoded. Typically, at the encoder side of a

compression system, the data to be compressed feed a model unit. The model determines

1) the event to be encoded, and 2) the estimate of the relative frequency (probability) of

the events. The encoder accepts the event and some indication of its relative frequency

and generates the code string.

A simple model is the memoryless model, where the data symbols themselves are

encoded according to a single code. Another model is the first-order Markov model,

which uses the previous symbol as the context for the current symbol. Consider, for

example, compressing English sentences. If the data symbol (in this case, a letter) “q” is

the previous letter, we would expect the next letter to be “u.” The first-order Markov


model is a dependent model; we have a different expectation for each symbol (or in the

example, each letter), depending on the context. The context is, in a sense, a state

governed by the past sequence of symbols. The purpose of a context is to provide a

probability distribution, or statistics, for encoding (decoding) the next symbol.

Corresponding to the symbols are statistics. To simplify, consider a single-context model,

i.e., the memoryless model. Data compression results from encoding the more frequent symbols with short code-string length increases, and the less frequent symbols with long code-string length increases.

Most of the data compression methods in common use today fall into one of two

camps: dictionary based schemes and statistical methods. In the world of small systems,

dictionary based data compression techniques seem to be more popular at this time.

However, by combining arithmetic coding with powerful modeling techniques, statistical

methods for data compression can actually achieve better performance.

4.3.2 Arithmetic Coding: how it works

It has only been in the last ten years that a respectable candidate to replace

Huffman coding has been successfully demonstrated: Arithmetic coding. Arithmetic

coding completely bypasses the idea of replacing an input symbol with a specific code.

Instead, it takes a stream of input symbols and replaces it with a single floating point

output number. The longer (and more complex) the message, the more bits are needed in

the output number. It was not until recently that practical methods were found to

implement this on computers with fixed sized registers. The output from an arithmetic

coding process is a single number less than 1 and greater than or equal to 0. This single number can be uniquely decoded to create the exact stream of symbols that went into its construction. In order to construct the output number, the symbols being encoded have to have a set of probabilities assigned to them. For example, if we are going to encode the random message "BILL GATES" we would have a probability distribution that looks like this:

Character Probability
--------- -----------
SPACE     1/10
A         1/10
B         1/10
E         1/10
G         1/10
I         1/10
L         2/10
S         1/10
T         1/10

Once the character probabilities are known, the individual symbols need to be

assigned a range along a "probability line", which is nominally 0 to 1. It does not matter

which characters are assigned which segment of the range, as long as it is done in the

same manner by both the encoder and the decoder. The nine-character symbol set used here

would look like this:

Character Probability Range


--------- ----------- -----------
SPACE 1/10 0.00 - 0.10
A 1/10 0.10 - 0.20
B 1/10 0.20 - 0.30
E 1/10 0.30 - 0.40
G 1/10 0.40 - 0.50
I 1/10 0.50 - 0.60
L 2/10 0.60 - 0.80
S 1/10 0.80 - 0.90
T 1/10 0.90 - 1.00

Each character is assigned the portion of the 0-1 range that corresponds to its

probability of appearance. Note also that the character "owns" everything up to, but not

including the higher number. So the letter 'T' in fact has the range 0.90 - 0.9999....

The most significant portion of an arithmetic coded message belongs to the first

symbol to be encoded. When encoding the message "BILL GATES", the first symbol is

"B". In order for the first character to be decoded properly, the final coded message has to

be a number greater than or equal to 0.20 and less than 0.30. What we do to encode this

number is keep track of the range that this number could fall in. So after the first character

is encoded, the low end for this range is 0.20 and the high end of the range is 0.30.

After the first character is encoded, we know that our range for the output number

is now bounded by the low number and the high number. What happens during the rest of

the encoding process is that each new symbol to be encoded will further restrict the

possible range of the output number. The next character to be encoded, 'I', owns the range

0.50 through 0.60. If it was the first number in the message, we would set low and high

range values directly to those values. But 'I' is the second character. So what we do

instead is say that 'I' owns the range that corresponds to 0.50-0.60 in the new subrange of

0.2 - 0.3. This means that the new encoded number will have to fall somewhere in the

50th to 60th percentile of the currently established range. Applying this logic will further

restrict our number to the range 0.25 to 0.26. The algorithm to accomplish this for a

message of any length is shown below:

Set low to 0.0
Set high to 1.0
While there are still input symbols do
    get an input symbol
    range = high - low
    high = low + range * high_range(symbol)
    low = low + range * low_range(symbol)
End of While
output low

Following this process through to its natural conclusion with our chosen message looks

like this:


New Character Low value High Value


------------- --------- ----------
0.0 1.0
B 0.2 0.3
I 0.25 0.26
L 0.256 0.258
L 0.2572 0.2576
SPACE 0.25720 0.25724
G 0.257216 0.257220
A 0.2572164 0.2572168
T 0.25721676 0.2572168
E 0.257216772 0.257216776
S 0.2572167752 0.2572167756

So the final low value, 0.2572167752, will uniquely encode the message "BILL

GATES" using the present encoding scheme.
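The encoding loop can be transcribed directly into MATLAB as a short sketch. The names symb, low_rng and high_rng below are introduced here simply to restate the probability-line table, and the printed result matches the final low value above.

    % Arithmetic encoder sketch for the "BILL GATES" example.
    symb     = {' ','A','B','E','G','I','L','S','T'};
    low_rng  = [0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.8 0.9];
    high_rng = [0.1 0.2 0.3 0.4 0.5 0.6 0.8 0.9 1.0];

    msg = 'BILL GATES';
    low = 0.0;  high = 1.0;
    for ch = msg
        i     = find(strcmp(symb, ch));    % the symbol's slot on the line
        range = high - low;
        high  = low + range*high_rng(i);   % shrink the working interval
        low   = low + range*low_rng(i);
    end
    fprintf('%.10f\n', low)                % prints 0.2572167752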

Given this encoding scheme, it is relatively easy to see how the decoding process

will operate. We find the first symbol in the message by seeing which symbol owns the

code space that the encoded message falls in. Since the number 0.2572167752 falls

between 0.2 and 0.3, we know that the first character must be "B". We then need to

remove the "B" from the encoded number. Since we know the low and high ranges of B,

we can remove their effects by reversing the process that put them in. First, we subtract

the low value of B from the number, giving 0.0572167752. Then we divide by the range

of B, which is 0.1. This gives a value of 0.572167752. We can then calculate where that

lands, which is in the range of the next letter, "I".

The algorithm for decoding the incoming number looks like this:

get encoded number
Do
    find symbol whose range straddles the encoded number
    output the symbol
    range = symbol high value - symbol low value
    subtract symbol low value from encoded number
    divide encoded number by range
until no more symbols

Note that we have conveniently ignored the problem of how to decide when there

are no more symbols left to decode. This can be handled by either encoding a special

EOF symbol, or carrying the stream length along with the encoded message.

The decoding algorithm for the "BILL GATES" message will proceed something like

this:

Encoded Number Output Symbol Low High Range


-------------- ------------- --- ---- -----
0.2572167752 B 0.2 0.3 0.1
0.572167752 I 0.5 0.6 0.1
0.72167752 L 0.6 0.8 0.2
0.6083876 L 0.6 0.8 0.2
0.041938 SPACE 0.0 0.1 0.1
0.41938 G 0.4 0.5 0.1
0.1938 A 0.1 0.2 0.1
0.938 T 0.9 1.0 0.1
0.38 E 0.3 0.4 0.1
0.8 S 0.8 0.9 0.1
0.0

In summary, the encoding process is simply one of narrowing the range of

possible numbers with every new symbol. The new range is proportional to the

predefined probability attached to that symbol. Decoding is the inverse procedure, where

the range is expanded in proportion to the probability of each symbol as it is extracted.
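Continuing the sketch above, decoding is the same loop run in reverse, reusing symb, low_rng and high_rng from the encoder. One numerical caveat: the final low value 0.2572167752 sits exactly on the lower boundary of the last subinterval, so in double-precision arithmetic any value strictly inside the final interval, such as 0.2572167754, is the safer input for a toy decoder. The known message length stands in for an explicit EOF symbol.

    % Arithmetic decoder sketch. Each step finds the interval containing
    % the code value, outputs that symbol and divides its effect back out.
    code = 0.2572167754;          % any value inside the final interval
    out  = '';
    for n = 1:10                  % message length in place of an EOF symbol
        i    = find(code >= low_rng & code < high_rng, 1);
        out  = [out symb{i}];     % append the decoded symbol
        code = (code - low_rng(i)) / (high_rng(i) - low_rng(i));
    end
    disp(out)                     % BILL GATES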


Chapter 5

SOFTWARE IMPLEMENTATION

5.1 Introducing Mex-Files

We can call C or Fortran subroutines from MATLAB as if they were built-in

functions. MATLAB callable C and Fortran programs are referred to as MEX-files.

MEX-files are dynamically linked subroutines that the MATLAB interpreter can

automatically load and execute.

MEX-files have several applications:

• Large pre-existing C and Fortran programs can be called from MATLAB without

having to be rewritten as M-files.

• Bottleneck computations (usually for-loops) that do not run fast enough in

MATLAB can be recoded in C or Fortran for efficiency.

5.1.1 Using Mex-Files


MEX-files are subroutines produced from C or Fortran source code. They behave

just like M-files and built-in functions. While M-files have a platform-independent

extension, .m, MATLAB identifies MEX-files by platform-specific extensions.

We can call MEX-files exactly as we would call any M-function. For example, a

MEX-file called conv2.mex on your disk in the MATLAB datafun toolbox directory

performs a 2-D convolution of matrices. conv2.m only contains the help text

documentation. If we invoke the function conv2 from inside MATLAB, the interpreter

looks through the list of directories on MATLAB’s search path. It scans each directory

looking for the first occurrence of a file named conv2 with either the platform-specific MEX extension or .m. When it finds one, it loads the file and executes it. MEX-files take precedence over M-files when like-named files exist in the same directory. However, help text documentation is still read from the .m file.
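This precedence can be verified from the MATLAB prompt: the built-in which command with the -all flag lists every file of a given name on the search path, and the platform-specific MEX-file appears (and is dispatched) ahead of the conv2.m help stub:

>> which conv2 -all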

5.1.2 Running Mex Files

The following commands are used to set up a compiler and build the MEX-files used in the MATLAB code. First, a compiler is selected:

>> mex -setup

Please choose your compiler for building external interface (MEX) files:
Would you like mex to locate installed compilers [y]/n? y
Select a compiler:
[1] Digital Visual Fortran version 6.0 in C:\Program Files\Microsoft Visual Studio
[2] Lcc C version 2.4 in C:\MATLAB7\sys\lcc
[3] Microsoft Visual C/C++ version 6.0 in C:\Program Files\Microsoft Visual Studio
[0] None

Compiler: 2

Please verify your choices:
Compiler: Lcc C 2.4
Location: C:\MATLAB7\sys\lcc
Are these correct?([y]/n): y

Try to update options file: C:\Documents and Settings\Administrator\Application
Data\MathWorks\MATLAB\R14\mexopts.bat
From template: C:\MATLAB7\BIN\WIN32\mexopts\lccopts.bat

Done . . .

To build a MEX-file, the source file name is preceded by the keyword mex and the appropriate extension is used. For example, to compile the MEX source file dominant_pass_c.c we use the following command:

>> mex dominant_pass_c.c

The compiled MEX-file can then be called from MATLAB like any other function.

5.2 Flowchart for Compression:

5.3 Embedded Zero-Tree Wavelet:

5.4 Dominant Pass:

5.5 Subordinate Pass:

5.6 Huffman Coding:

5.7 Arithmetic Coding:

5.8 Decompression:

Chapter 6

RESULT
In this project we have compressed an image using the JPEG 2000 compression technique. This process involves obtaining the wavelet coefficients of the image, encoding the resulting coefficients using the EZW algorithm, and then Huffman and arithmetic encoding the resulting symbol sequence.

The compression ratios obtained when given a threshold of 2048 are:

Huffman coding: 7281.8
Arithmetic coding: 5041.2

The figure below compares the uncompressed original image with the compressed output image obtained after execution of the MATLAB code.


Figure 6.1: (a) Original Image (b) Compressed Image


Chapter 7
CONCLUSION
Transform coding forms an integral part of compression techniques. The

advantage of using a transform is that it packs the data into a smaller number of

coefficients. The main purpose of using the transform is thus to achieve energy

compaction. The type of transform used varies between different codecs.

The Discrete Cosine Transform (DCT) is commonly used in different compression techniques. The DCT closely resembles the optimal transform in terms of performance, and DCT-based compression techniques are efficient. However, the DCT has its shortcomings: while this method achieves good compression, it cannot achieve high compression ratios with low distortion. To achieve this, wavelet transforms are used in the latest standards.

In this project, an Embedded Zero-tree Wavelet encoder was developed for JPEG 2000 image compression and its performance was studied. The EZW scheme shows marginally better performance than the DCT-based, JPEG-like encoder.


Chapter 8
FUTURE SCOPE
It can be seen that for very small and very large block sizes the visual distortions are more pronounced. In the case of smaller blocks, the distortions may be due to the size of the header details that must be added for each block, which is significant compared to the information content of the block itself. In the case of large block sizes, the initial threshold is high, and hence many passes are needed to achieve an acceptable level of visual quality.

The embedded block coding algorithm achieves excellent compression performance, usually higher than that of EZW with arithmetic coding, and in some cases substantially higher. The algorithm utilizes the same low-complexity binary arithmetic coding engine. Together with careful design of the bit-plane coding primitives, this enables execution speed comparable to that observed with the simpler EZW without arithmetic coding. The coder offers additional advantages, including memory locality, spatial random access and ease of geometric manipulation.

