
LESION DETECTION AND CLASSIFICATION OF

MAMMOGRAM BASED ON ADAPTIVE


THRESHOLD AND DISCRIMINANT ANALYSIS

A thesis submitted in partial fulfillment of the requirements for

The award of the degree of

M.Tech.

in

COMMUNICATION SYSTEMS

By

PUJARI SUJAY GIRISH

DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING


NATIONAL INSTITUTE OF TECHNOLOGY
TIRUCHIRAPPALLI – 620 015.

DECEMBER 2010
BONAFIDE CERTIFICATE

This is to certify that the project titled “LESION DETECTION AND


CLASSIFICATION OF MAMMOGRAM BASED ON ADAPTIVE
THRESHOLD AND DISCRIMINANT ANALYSIS” is a bonafide record of the
work done by

PUJARI SUJAY GIRISH (208109013)

in partial fulfillment of the requirements for the award of the degree of Master of
Technology in Communication Systems of the NATIONAL INSTITUTE OF
TECHNOLOGY, TIRUCHIRAPPALLI, during the year 2010-2011.

S. DEIVALAKSHMI

Guide Head of the Department

Project Viva-voce held on _____________________________

Internal Examiner External Examiner


ABSTRACT

Early detection of breast cancer improves the survival rate and widens the
treatment options. One of the most powerful techniques for early detection of breast
cancer is digital mammography. To detect breast cancer, the radiologist usually
searches the mammograms visually for specific abnormalities. However, visual analysis
of mammograms is a difficult task for radiologists. Computer-Aided Diagnosis (CADx)
technology helps to identify abnormalities and assists the radiologist in making the
final decision.

The proposed CADx system involves three major steps called Lesion Detection,
Feature extraction and Classification.

In Lesion Detection, the digital mammogram is pre-processed and the suspicious Region
of Interest (ROI) is then detected using an adaptive threshold based on
Multiresolution analysis. Shape based features are chosen because they can be defined
and measured using simple formulae. Thirteen such shape based features are extracted
from the ROI and also from each of the four channels obtained after wavelet
decomposition of the ROI. This gives five feature sets.

Malignant and benign masses are abnormal (tumour) cells present in the breast.
Malignant masses are cancerous tumours, while benign masses are non-cancerous.
Classification of each lesion as Benign or Malignant is performed using Canonical
Discriminant analysis on all five feature sets, and the classification rates are
compared. In short, five classification schemes are discussed.

The proposed method can allow the radiologist to focus rapidly on the relevant parts
of the mammogram and it can increase the effectiveness and efficiency of radiology
clinics.

Keywords: adaptive threshold based on Multiresolution analysis, CADx, Discriminant
analysis, Shape features

ACKNOWLEDGEMENTS

I take this opportunity to express my sincere thanks and deep sense of
gratitude to my project guide Ms S. Deivalakshmi, Assistant Professor, Department
of Electronics and Communication Engineering, National Institute of Technology,
Tiruchirappalli, for her guidance, much-needed suggestions, moral support, constant
encouragement and kind co-operation.

My sincere and heartfelt thanks to Dr. B. Venkataramani, Professor & Head
of the Department, Electronics and Communication Engineering, National Institute
of Technology, Tiruchirappalli, for his indispensable help throughout the course of
this project work.

Finally, I would like to thank all the teaching staff, my classmates and the
computer support group staff for their sincere help, without which I would have been
unable to complete this project.

TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ABBREVIATIONS

CHAPTER 1 INTRODUCTION
1.1 Motivation
1.2 Objectives and Approach
1.3 Study Outline

CHAPTER 2 LITERATURE REVIEW

CHAPTER 3 LESION DETECTION
3.1 Pre-Processing
3.2 Lesion Detection
3.3 Region selection

CHAPTER 4 FEATURE EXTRACTION & CLASSIFICATION
4.1 Feature extraction
4.2 Feature classification

CHAPTER 5 RESULTS AND DISCUSSION
5.1 Lesion detection
5.2 Feature extraction and classification

CHAPTER 6 CADx USER INTERFACE

CHAPTER 7 CONCLUSION & FURTHER WORK

REFERENCES

LIST OF FIGURES

Figure No Title

1.1 Structure of Breast
1.2 CC view and MLO view
1.3 Two basic views of mammographic image: (a) CC view, (b) MLO view
3.1 Main steps involved in the computer aided detection
3.2 The proposed method for lesion detection
3.3 Steps involved in Pre-processing
3.4 Images with salt and pepper noise and after passing through Median filter
3.5 Original image and BW1, BW2 (Binary versions with different Threshold)
3.6 Skin line of given mammogram
3.7 Skin Line and RMLO images
3.8 LMLO and skin line images
3.9 Masking to remove background - Mask1 & after masking
3.10 Before removing rib & after removing rib
3.11 Rib portion (RIB Part), after removing Rib (Pre-processed mammogram) and Given mammogram
3.12 Preprocessed image and Segmented Portion (Lesion)
3.13 Block diagram for adaptive segmentation method adapted by Zhang
3.14 Dashed line indicates the PDF pI(x). Two solid lines indicate P(Cb)pb(x) and P(Ct)pt(x), respectively
3.15 Bayes threshold λ1 and the proposed candidate threshold λ3 are indicated
3.16 Implemented algorithm for Segmentation Part
3.17 Segmented Portion and Lesion Part (Selected Region)
4.1 Abstract Flow of Proposed method
4.2 One stage, 2D DWT Decomposition with db4 family
4.3 Steps to determine CDF for all four feature sets extracted from respective channels
4.4 Steps to determine CDF for feature set 1 extracted from Lesion
4.5 Steps involved for classification of given Lesion by determining Discriminant score (N) - Validation process
5.1 (MDB063) First stage DWT decomposition of Pre-processed image with db2 family
5.2 Adaptively chosen Threshold for 27 malignant cases
5.3 Adaptively chosen Threshold for 38 benign cases
5.4 Adaptively chosen Threshold for 35 Normal cases
5.5 (MDB135) Original mammogram with its skin line
5.6 (MDB135) Derivative plot of histogram of LL1 channel of given mammogram - Threshold 1 = 206
5.7 (MDB135) Histogram of given mammogram - Threshold 2 = 190
5.8 (MDB135) Region selections
5.9 (MDB226) Original mammogram with its skin line
5.10 (MDB226) Derivative plot of histogram of LL1 channel of given mammogram - Threshold 1 = 224
5.11 (MDB226) Histogram of given mammogram - Threshold 2 = 189
5.12 (MDB226) Region selections
5.13 (MDB115) Original mammogram with its skin line
5.14 (MDB115) Derivative plot of histogram of LL1 channel of given mammogram - Threshold 1 = 231
5.15 (MDB115) Histogram of given mammogram - Threshold 2 = 204
5.16 (MDB115) Region selections
5.17 Lesion part detected from training dataset of 20 benign mammograms
5.18 Lesion part detected from training dataset of 20 malignant mammograms
5.19 Lesion and its one stage level-1 DWT decomposition for mdb002 using db4
5.20 Lesion and its one stage level-1 DWT decomposition for mdb028 using db4
5.21 CDF histogram plot for a) Benign & b) Malignant groups using feature set-1
5.22 Classification result for validation dataset (unknown mammograms) (N=1)
5.23 CDF histogram plot for a) Benign & b) Malignant groups using feature set-2
5.24 Classification result for validation dataset (unknown mammograms) (N=2)
5.25 CDF histogram plot for a) Benign & b) Malignant groups using feature set-3
5.26 Classification result for validation dataset (unknown mammograms) (N=3)
5.27 CDF histogram plot for a) Benign & b) Malignant groups using feature set-4
5.28 Classification result for validation dataset (unknown mammograms) (N=4)
5.29 CDF histogram plot for a) Benign & b) Malignant groups using feature set-5
5.30 Classification result for validation dataset (unknown mammograms) (N=5)
6.1 GUI interface for CADx (Trained)
7.1 Comparison chart for classification of Lesion using DA on Feature set 1 to 5

LIST OF TABLES

Table No Title

4.1 13 shape based features
5.1 mdb002 extracted features (Benign group)
5.2 mdb028 extracted features (Malignant group)
5.3 Unstandardized coefficients for Lesion
5.4 Classification Results (N=1)
5.5 Unstandardized coefficients for Lesion_Ca
5.6 Classification Results (N=2)
5.7 Unstandardized coefficients for Lesion_Chd
5.8 Classification Results (N=3)
5.9 Unstandardized coefficients for Lesion_Cvd
5.10 Classification Results (N=4)
5.11 Unstandardized coefficients for Lesion_Cdd
5.12 Classification Results (N=5)
5.13 Summary of all results for classification


ABBREVIATIONS

CAD Computer-Aided Detection

CADx Computer-Aided Diagnosis

CC Craniocaudal

CDF Canonical Discriminant Function

DA Discriminant Analysis

DS Discriminant score

DWT Discrete Wavelet Transform

FNR False Negative Rate

FPR False Positive Rate

MIAS Mammographic Image Analysis Society

MLO Medio Lateral Oblique

PDF Probability Density Function

ROI Region of Interest

TNR True Negative Rate

TPR True Positive Rate


CHAPTER-1
INTRODUCTION

Breast cancer is the second leading cause of cancer death in women today (after
lung cancer). An estimated 40,230 breast cancer deaths are expected in 2010. According to
the National Cancer Institute, one out of eight women will develop breast cancer during her
lifetime.
Breast cancer stages range from stage 0 (a very early form of cancer) to stage IV
(advanced, metastatic breast cancer). Early stage breast cancers are associated with higher
survival rates than late stage cancers.

The key to surviving breast cancer is early detection and treatment. According to
ACS, when breast cancer is confined to the breast, the five-year survival rate is almost 100%.
Breast cancer screening has been shown to reduce breast cancer mortality. The high survival
rates associated with early detection of breast cancer can be attributed to the use of
mammography screening as well as a high level of awareness of the disease symptoms in the
population.

Beginning in their early 20s, women should be told about the benefits and
limitations of breast self-examination (BSE). For women in their 20s and 30s, it is
recommended that clinical breast examination (CBE) be part of a periodic health
examination, preferably at least every three years. Asymptomatic women aged 40 and over
should continue to receive a clinical breast examination as part of a periodic health
examination, preferably annually and prior to mammography, and should begin annual
mammography at age 40.

1.1 Motivation

Mammography is a uniquely important type of medical imaging used to screen for
breast cancer. All women at risk go through mammography screening procedures for early
detection and diagnosis of tumours. Special x-ray machines developed exclusively for breast
imaging are used to produce mammography films. These machines use very low doses of
radiation and produce high-quality x-rays. A typical mammogram is an intensity x-ray image
with gray levels showing levels of contrast inside the breast which characterize normal tissue
and different calcifications and masses.
Each breast is imaged separately in craniocaudal (CC) view and mediolateral-oblique (MLO)
view, shown in Fig. 1.3(a) and Fig. 1.3(b), respectively.

Fig. 1.1 Structure of Breast

Fig. 1.2 CC view & MLO view

Fig. 1.3 Two basic views of mammographic image: (a) CC view, (b) MLO view.

Computer-aided detection (CAD) and computer-aided diagnosis (CADx) systems can
improve the results of mammography screening programs and decrease the number of false
positive cases. CAD systems use computerized algorithms for identifying suspicious ROIs. The
motivation behind CAD systems is to reduce both the False Positive Rate (FPR) and the False
Negative Rate (FNR). When used as intended, CAD would be expected to increase the
number of mammograms interpreted as positive, to the extent that it points out
abnormalities previously overlooked by the radiologist. On the other hand, the cost of
missed or undetected abnormalities (FNs) is very high.

1.2 Objectives and Approach

The ultimate aim of the CADx system is to help the radiologist in making
recommendations for patient management.
CAD systems consist primarily of the following processing stages: Pre-processing,
Segmentation, Feature extraction and Classification. Pre-processing is performed
to suppress noise, to enhance the mammogram and to remove the background region
in the MLO view of the mammogram. Segmentation, i.e. Lesion detection, is performed
using an adaptive threshold technique. From the detected ROIs of the respective mammograms,
shape features are then extracted and passed on for classification of the abnormality.
The training database consists of 40 mammograms with 20 Benign and 20 Malignant cases.
Five feature sets were extracted from the ROIs and provided as input to the classification
stage using DA, and the classification results were compared. Once the Canonical
Discriminant functions for all five feature sets are known, the analysis is performed on a
database of 65 unknown mammograms as a validation process, and the algorithm for CADx is
finalized based on the most significant feature set.

1.3 Study Outline

This thesis is organized as follows. Chapter 2 reviews the literature and
background of breast cancer and CAD systems in mammography. The materials and
methods used in this study are discussed in Chapter 3 and Chapter 4: Chapter 3 deals with
lesion detection, and Chapter 4 with feature extraction and classification of the
detected lesion. Chapter 5 provides the results and discussion, Chapter 6 presents the
CADx user interface, and Chapter 7 concludes the thesis with future directions.

CHAPTER-2
LITERATURE REVIEW
In this chapter, the important literature on CADx systems and their algorithms in
mammography is reviewed, along with the literature required for the proposed CADx
system.

William R. Klecka (1980), in “Discriminant Analysis”, presents a lucid and simple
introduction to several related statistical procedures known as discriminant analysis (DA).
The text introduces the canonical discriminant function (CDF) of the variables in
discriminant analysis. Professor Klecka derives the canonical discriminant function
coefficients, provides a spatial interpretation of them, and discusses the interpretation
of CDFs. He presents a clear discussion of unstandardized and standardized coefficients.

The SPSS ver. 14 algorithms manual titled “Discriminant” explains all the steps involved in
classification based on CDF coefficients.

Ingrid Daubechies (1987) invented the first smooth orthogonal wavelets with compact support,
now known as the db-N family. In her text “Ten Lectures on Wavelets” she explains the theory
behind wavelets and gives a good tour of the field.

Olivier Rioul (1993) described Multiresolution analysis and synthesis for discrete time
signals, in “A Discrete-Time Multiresolution Theory”. Concepts of scale and resolution are
first reviewed in discrete time. The resulting framework allows one to treat the discrete
wavelet transform, octave-band perfect reconstruction filter banks, and pyramid transforms
from a unified standpoint.

Xiao-Ping Zhang and Mita D. Desai (2001) suggested a general systematic method for
the detection and segmentation of bright targets, in “Segmentation of Bright Targets Using
Wavelets and Adaptive Thresholding”. A method is developed which adaptively chooses
thresholds to segment targets from background, by using a multiscale analysis of the image
probability density function (PDF). A performance analysis based on a Gaussian distribution
model is used to show that the obtained adaptive threshold is often close to the Bayes
threshold. The method has proven robust even when the image distribution is unknown.
Examples are presented to demonstrate the efficiency of the technique on a variety of
targets.

H.D. Cheng et al. (2003) surveyed the most important parts of CADx algorithms in “Computer-
aided detection and classification of microcalcifications in mammograms: a survey”. In
that paper they summarized and compared the methods used in the various stages of
computer-aided detection (CAD) systems. In particular, the enhancement and segmentation
algorithms, mammographic features, classifiers and their performances are studied and
compared. Remaining challenges and future research directions are also discussed.

Gonzalez R. et al. (2004) gave a detailed discussion of shape and margin features in
chapter 11 of the text “Digital Image Processing Using MATLAB”.

Alfonso Rojas Dominguez & Asoke K. Nandi (2008) presented a method for automatic
detection of mammographic masses, in “Detection of masses in mammograms via
statistically based enhancement, multilevel-thresholding segmentation, and region
selection”. As part of this method, an enhancement algorithm that improves image contrast
based on local statistical measures of the mammograms is proposed. After enhancement,
regions are segmented via thresholding at multiple levels, and a set of features is computed
from each of the segmented regions. For feature extraction they used shape and margin based
properties.

Jelena Bozek et al. (2009) surveyed image processing algorithms in “A Survey of Image
Processing Algorithms in Digital Mammography”. This chapter gives a survey of image processing algorithms that
have been developed for detection of masses and calcifications. An overview of algorithms
in each step (segmentation step, feature extraction step, feature selection step,
classification step) of the mass detection algorithms is given. Wavelet detection methods
and other recently proposed methods for calcification detection are presented. An overview
of contrast enhancement and noise equalization methods is given as well as an overview of
calcification classification algorithms.

B. Surendiran et al. (2009) performed Discriminant Analysis for classifying the masses
present in mammograms, in “Classifying Digital Mammogram Masses using Univariate
ANOVA Discriminant Analysis”. This approach combines 19 shape properties of the mass
regions and classifies the masses as benign or malignant using Univariate ANOVA. The DDSM
database, along with its ground truth details, is used for the experiments. According to this
work, malignant and benign masses are abnormal (tumour) cells present in the breast;
malignant masses are cancerous tumours, while benign masses are non-cancerous.

Kai Hu et al. (2010) proposed a novel algorithm for lesion detection in “Detection of
Suspicious Lesions by Adaptive Thresholding Based on Multiresolution Analysis in
Mammograms”. They combine two thresholding segmentations, a coarse segmentation and a
fine segmentation, to segment suspicious lesions in multiscale images. The coarse
segmentation is first used to get a rough representation of the localization of suspicious
lesions, and the fine segmentation is then used to refine the rough representation and
generate more precise segmentation results. This algorithm avoids the deficiencies of the
histogram-based and the window-based thresholding algorithms and improves the
segmentation accuracy effectively.

CHAPTER-3
LESION DETECTION
A CAD system consists of a few typical steps, depicted in Fig. 3.1. Screen film
mammographic images need to be digitized prior to image processing. This is one of the
advantages of digital mammography, where the image can be processed directly.

Pre-processing → Segmentation → Feature extraction → Feature selection → Classification

Fig. 3.1 Main steps involved in the computer aided detection.


The first step in image processing is the pre-processing step. It has to be done on
digitized images to reduce the noise and improve the quality of the image. Most digital
mammographic images are high quality images. Another part of the pre-processing step is
removing the background area and, if the image is an MLO view, removing the pectoral
muscle from the breast area.
The segmentation step aims to find suspicious regions of interest (ROIs) containing
abnormalities. In the feature extraction step, the features are calculated from the
characteristics of the region of interest. A critical issue in algorithm design is the feature
selection step, where the best set of features is selected for eliminating false positives and
for classifying lesion types. Feature selection is defined as selecting a smaller feature subset
that leads to the largest value of some classifier performance function.

Pre-processing algorithms → Adaptive Threshold segmentation algorithm

Fig. 3.2 The proposed method for Lesion Detection

In this chapter we discuss all the algorithms used for Lesion Detection.
After the segmented region is obtained, the user needs to decide whether to treat the
mammogram as normal or to send it for further classification. The key point is that, for a
normal mammogram, the output of Lesion detection is either a completely black image (all
zeros) or an image containing only noise or fragments of the background region.

3.1 Pre-Processing

To remove noise → Skin line detection → To remove background → To remove rib portion

Fig. 3.3 Steps involved in Pre-processing

3.1.1 Median filter

The median filter is a non-linear filter that is efficient at removing salt-and-pepper noise.
It tends to preserve the sharpness of image edges while removing noise. It is found that
noise is removed more effectively as the size of the window increases, but this ability to
suppress noise comes at the expense of blurring of edges.

The median filter replaces the central value of an M-by-N neighbourhood with the median
value of that neighbourhood.
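
The neighbourhood median described above can be sketched in a few lines of plain Python. This is only an illustration, not the thesis implementation: the function name `median_filter`, the square window, and the border handling (the window is simply clipped at the image edge) are assumptions.

```python
def median_filter(image, size=3):
    """Replace each pixel by the median of its size-by-size neighbourhood.

    Assumption: at the borders only the neighbours that fall inside the
    image are used; the thesis does not state its border handling.
    """
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = sorted(
                image[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            )
            # Middle element (upper median when the window has an even size).
            out[r][c] = window[len(window) // 2]
    return out


# Toy image: uniform gray level 10 with one "salt" (255) and one "pepper" (0) pixel.
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 0, 10],
    [10, 10, 10, 10],
]
filtered = median_filter(noisy)
```

In this toy case both impulse pixels are replaced by the surrounding gray level, which is the behaviour that makes the median filter suitable for salt-and-pepper noise.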

Fig. 3.4 images with salt and pepper noise and after passing through Median filter

3.1.2 Skin Line detection

Fig. 3.5 original image and BW1,BW2 (Binary versions with different Threshold)

Algorithm:

If the input image is I, let the pixel value be I(Xi,Yi) = Pi, and let

BW1(Xi,Yi) = BP1i and BW2(Xi,Yi) = BP2i

Then,

BP1i = 1 for Pi > 3

BP1i = 0 for Pi ≤ 3

BP2i = 1 for Pi > 12

BP2i = 0 for Pi ≤ 12

Skin line = BW1 - BW2
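
The two-threshold difference can be illustrated as follows. The thresholds 3 and 12 are taken from the algorithm above; the function names `binarize` and `skin_line` and the toy image are assumptions made for this sketch.

```python
def binarize(image, threshold):
    """Binary image: 1 where the pixel value exceeds the threshold, else 0."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


def skin_line(image, low=3, high=12):
    """Skin line = BW1 - BW2, i.e. pixels with low < value <= high."""
    bw1 = binarize(image, low)
    bw2 = binarize(image, high)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(bw1, bw2)]


# Toy rows: dark background on the left, brighter breast tissue on the right.
image = [
    [0, 2, 5, 40, 90],
    [0, 1, 8, 60, 120],
    [0, 0, 4, 11, 80],
]
line = skin_line(image)
```

Only the pixels whose gray level lies in the narrow band between the two thresholds survive the subtraction, which is why the result traces the faint boundary between background and breast.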

Fig 3.6 Skin line of given mammogram

3.1.3 MLO Type: Left/Right?

Now, after detecting the skin line, it is necessary to detect the type of MLO view, i.e.
whether it is left-sided (LMLO) or right-sided (RMLO).

Step 1:

The input image undergoes both the RMLO and the LMLO test, giving two images,

I1 = ‘RMLO’

I2 = ‘LMLO’

RMLO Test: Algorithm:

Step 1: Start with row 1.

Step 2: Scan from the left-most column (column 1) towards the right side.

Step 3: If the pixel is black, replace it with white, move to the next pixel and repeat
Step 3; else, when the pixel is white, go to Step 4.

Step 4: Move to the next pixel. If it is white, repeat Step 4; else go to Step 5.

Step 5: Replace the current pixel with black and move to the next pixel. If the last column
is exceeded, go to Step 6; else repeat Step 5.

Step 6: Repeat Steps 2 to 5 for the next row until all rows have been processed.

The image formed is “RMLO”.

Fig.3.7 Skin Line and RMLO images

LMLO Test: Algorithm:

Step 1: Start with row 1.

Step 2: Scan from the right-most column towards the left side (here the next pixel means
the one to the left).

Step 3: If the pixel is black, replace it with white, move to the next pixel and repeat
Step 3; else, when the pixel is white, go to Step 4.

Step 4: Move to the next pixel. If it is white, repeat Step 4; else go to Step 5.

Step 5: Replace the current pixel with black and move to the next pixel. If the last column
in the scan direction is exceeded, go to Step 6; else repeat Step 5.

Step 6: Repeat Steps 2 to 5 for the next row until all rows have been processed.

The image formed is “LMLO”.

Fig.3.8 LMLO and skin line images

Step 2:

Mean_right = mean(RMLO) % right side view

Mean_Left = mean(LMLO) % left side view

If Mean_Left > Mean_right, then the mammogram is LMLO (Mask1 = LMLO), View = Left

Else the given mammogram is RMLO (Mask1 = RMLO), View = Right
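
The two scanning tests and the mean comparison can be sketched together. The sketch assumes that the input is the binary skin-line image (1 = white, 0 = black); the function names `fill_from_left`, `fill_from_right` and `detect_view` are hypothetical and chosen only for this illustration.

```python
def fill_from_left(skin):
    """RMLO test: white up to and including the skin line, black after it."""
    out = []
    for row in skin:
        new_row = list(row)
        c, n = 0, len(row)
        # Step 3: leading black pixels become white until the skin line is met.
        while c < n and row[c] == 0:
            new_row[c] = 1
            c += 1
        # Step 4: walk across the white skin-line pixels.
        while c < n and row[c] == 1:
            c += 1
        # Step 5: every remaining pixel in the row becomes black.
        while c < n:
            new_row[c] = 0
            c += 1
        out.append(new_row)
    return out


def fill_from_right(skin):
    """LMLO test: the same scan, started from the right-most column."""
    mirrored = [row[::-1] for row in skin]
    return [row[::-1] for row in fill_from_left(mirrored)]


def mean(image):
    values = [p for row in image for p in row]
    return sum(values) / len(values)


def detect_view(skin):
    """Step 2: choose the view, and the mask, with the larger mean."""
    rmlo, lmlo = fill_from_left(skin), fill_from_right(skin)
    if mean(lmlo) > mean(rmlo):
        return "Left", lmlo
    return "Right", rmlo


# Toy skin line lying close to the left edge of the image.
skin = [
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
]
view, mask1 = detect_view(skin)
```

For this toy input the LMLO image has the larger mean, so the decision rule returns View = Left and uses the LMLO image as Mask1.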

3.1.4 To remove background portion

To remove the background portion, apply Mask1, obtained after determining the type of MLO
view. Element-wise multiplication of Mask1 with the original mammogram gives the
background-free mammogram.
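
As a small illustration of the masking step, the element-wise product can be written as below. The name `apply_mask` and the toy values are assumptions; the mask is taken to be a binary image of the same size as the mammogram.

```python
def apply_mask(image, mask):
    """Element-wise product: pixels where the mask is 0 are set to 0."""
    return [
        [p * m for p, m in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


mammogram = [
    [7, 30, 90],
    [5, 12, 80],
]
mask1 = [
    [0, 1, 1],
    [0, 0, 1],
]
image2 = apply_mask(mammogram, mask1)
```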

Fig 3.9 Masking to remove background-Mask1 & after masking mammogram (Image2)

3.1.5 To remove rib portion

Fig 3.10: After Local threshold for rib bone removing (Image 3) & after removing rib (Image4)

Algorithm:

Step 1

Apply a local threshold to Image2, i.e. convert it to binary with Threshold = 173 (Fig
3.10: Image3). Do not consider the first 200 rows, so that the label is removed.

Step 2

As the View is known to be Left or Right, scan from the right or left direction for the
respective case.

Travel up to the first non-zero pixel; we will call it the POLE.

Step 3

Now travel up to the first zero pixel and replace all travelled pixels with zero.

Step 4

Perform Steps 1 to 3 for all rows, with the following rules:

a) For every row, the POLE should not exceed the number of pixels travelled in the
previous row.

b) If Rule (a) is violated 5 consecutive times, then, keeping a 45° slope in
mind, decrease the POLE position for the next row by 1 and replace all pixels with zero up
to the calculated POLE.

After performing these steps you will get the image shown in Fig. 3.10 (Image4).

Step 5

RIB Part = Image3 - Image4

Step 6

Pre-processed image = Image2 - RIB Part

Fig. 3.11 Rib portion (RIB Part) , after removing Rib (Pre-processed mammogram) and Given
mammogram

3.2 Lesion Detection


Fig. 3.12 Preprocessed image and Segmented Portion (Lesion)

3.2.1 Segmentation
The aim of segmentation is to extract ROIs containing all masses and to locate the
suspicious mass candidates within the ROI. Segmentation of the suspicious regions in a
mammographic image is designed to have a very high sensitivity, and a large number of false
positives is acceptable since they are expected to be removed in a later stage of the
algorithm. Researchers have used several segmentation techniques and their combinations:
1. Thresholding Techniques
2. Region-Based Techniques
3. Edge Detection Techniques
4. Hybrid Techniques
Thresholding techniques include Global Thresholding, Local Thresholding, Local adaptive
thresholding and adaptive Thresholding based on Multiresolution analysis. Local thresholding
is slightly better than global thresholding, adaptive thresholding is better than both, and
the adaptive threshold based on Multiresolution analysis is superior to the other methods.
According to Zhang and Desai (2001), after the mammograms are wavelet
transformed, the gray-level distributions of the target and the background regions of the
images approach Gaussian distributions.

1. Input the digitized image
2. Pre-processing
3. Perform wavelet transform of image
4. Select proper scaling channel
5. Select adaptive thresholds by looking for local minima of the wavelet transformed images at different channels
6. Using adaptive threshold get segmented image

Fig. 3.13 Block diagram for adaptive segmentation method adapted by Zhang
The segmentation of possible targets can be modelled by the following classification
problem. For an ideal image I(m,n), there are pixels belonging to two classes: 1) the
background Cb and 2) the target Ct.
Here,
pb(x): PDF of class Cb
P(Cb): a priori probability of class Cb in image I
pt(x): PDF of class Ct
P(Ct): a priori probability of class Ct in image I

Fig. 3.14 Dashed line indicates the PDF pI(x). Two solid lines indicate P(Cb)pb(x) and
P(Ct)pt(x), respectively. The Bayes threshold λ1 and the proposed threshold λ2 are indicated.

Writing fb(x) = P(Cb)pb(x) and ft(x) = P(Ct)pt(x), and assuming that fb(x) and ft(x) have
one point of intersection, as illustrated in Fig. 3.14, the Bayes classifier is equivalent
to the following threshold detection criterion:

$$ I(m,n) < \lambda \;\Rightarrow\; I(m,n) \in C_b \quad \text{(pixel belongs to the background class)} $$

$$ I(m,n) > \lambda \;\Rightarrow\; I(m,n) \in C_t \quad \text{(pixel belongs to the target class)} $$

where the threshold λ satisfies

$$ f_b(\lambda) = f_t(\lambda) $$

The segmented image at scale j using the Bayes classifier can then be expressed as

$$ I_{seg,j}(m,n) = \begin{cases} 1, & I_j(m,n) > \lambda \\ 0, & I_j(m,n) < \lambda \end{cases} $$

where I_j(m,n) denotes the image at scale j.
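
To make the Bayes threshold concrete, the following sketch finds the gray level at which the two weighted class densities P(Cb)pb(x) and P(Ct)pt(x) intersect. It assumes Gaussian class PDFs, as in the analysis of Zhang and Desai, and locates the intersection between the two class means by bisection; the function names and the numerical values are illustrative assumptions, not values from the thesis.

```python
import math


def gaussian(x, mu, sigma):
    """Gaussian probability density with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def bayes_threshold(p_b, mu_b, sigma_b, p_t, mu_t, sigma_t, tol=1e-9):
    """Gray level between mu_b and mu_t where P(Cb)pb(x) = P(Ct)pt(x)."""

    def diff(x):
        return p_b * gaussian(x, mu_b, sigma_b) - p_t * gaussian(x, mu_t, sigma_t)

    lo, hi = mu_b, mu_t
    # Assumption: the background dominates at mu_b and the target at mu_t,
    # so there is a sign change (one intersection) between the two means.
    if not (diff(lo) > 0 > diff(hi)):
        raise ValueError("no single intersection between the class means")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if diff(mid) > 0:
            lo = mid   # background still dominates: move right
        else:
            hi = mid   # target dominates: move left
    return (lo + hi) / 2


# Equal priors and equal spreads: the threshold lies midway between the means.
lam = bayes_threshold(0.5, 100.0, 20.0, 0.5, 200.0, 20.0)
```

With equal priors and equal standard deviations the intersection falls halfway between the class means; unequal priors shift it towards the mean of the less probable class.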
Wavelet transforms are used in the new method and the Bayes classifier is
employed for the segmentation problem. An approach for choosing the threshold adaptively
by looking for the global local minima of the PDFs of wavelet transformed images is
proposed. Based on the assumption of Gaussian distributions, the adaptive threshold by the
new method is compared with the Bayes threshold. It is shown that in general practical
cases, the performance of the proposed threshold is often very close to the Bayes threshold,
which is the optimal threshold from the statistical point of view.

Fig. 3.15 Bayes threshold λ1 and the proposed candidate threshold λ3 are indicated.

According to Kai Hu et al. (2010), Zhang and Desai had proved that, when the overlap
between pb(x) and pt(x) is not significant, λ2 is often close to λ1. Hence, it is
reasonable to carry out segmentation according to λ2. However, when pb(x) and pt(x) are
not ideal and the overlap between them is large, the algorithm of Zhang and Desai no
longer works.
For example, in the case shown in the top image of Fig. 3.15, we cannot determine λ2
by selecting the local minima, because λ2 is not a minimum on the right of μ1, the global
maximum of pI(x). In this case, we select the threshold from the derivative of the PDF
curve of pI(x). The top image of Fig. 3.15 shows the PDF curve of pI(x), and the bottom
image of Fig. 3.15 shows the absolute value of the derivative of pI(x), i.e., |p’I(x)|;
λ3 is a local minimum of |p’I(x)|. Therefore, we can obtain the segmentation result
effectively in this case by using the threshold λ3.

3.2.2 Implemented algorithm

1. Pre-processed Mammogram
2. Level 1, DB2 family LL channel
3. Histogram
4. Smoothing (moving average)
5. Second derivative
6. Threshold = first zero crossing from right side
7. Global Thresholding using given Threshold
8. Segmented Portion

Fig. 3.16 Implemented algorithm for Segmentation Part
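
The histogram-based part of the flow in Fig. 3.16 can be sketched as below. The wavelet step is omitted: the input `ll_pixels` is assumed to be the already computed LL channel, flattened to integer gray levels in 0..255. The window length of the moving average, the use of a central second difference, and all function names are assumptions, since the thesis does not specify them.

```python
def histogram(pixels, levels=256):
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    return counts


def moving_average(values, window=5):
    """Smooth with a centred moving average (window clipped at the ends)."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def second_derivative(values):
    """Central second difference; the two end points are left at zero."""
    d2 = [0.0] * len(values)
    for i in range(1, len(values) - 1):
        d2[i] = values[i - 1] - 2 * values[i] + values[i + 1]
    return d2


def first_zero_crossing_from_right(values):
    """Index of the first sign change met when scanning from the right."""
    for i in range(len(values) - 1, 0, -1):
        if values[i] * values[i - 1] < 0:
            return i
    return None


def adaptive_threshold(pixels, window=5):
    smooth = moving_average(histogram(pixels), window)
    return first_zero_crossing_from_right(second_derivative(smooth))


def segment(image, threshold):
    """Global thresholding with the selected threshold."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


# Toy data: a large dark population and a small bright one.
ll_pixels = [50] * 100 + [200] * 10
threshold = adaptive_threshold(ll_pixels)
patch = segment([[50, 200], [210, 50]], threshold)
```

How well the selected crossing separates lesion from background depends on the shape of the histogram and on the smoothing window, so the values above only demonstrate the mechanics of the procedure.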

3.3 Region selection

For the purpose of image extraction, the region with the largest area is selected from
the segmented (detected) portion of the mammogram. The binary version of the image is
scanned based on connectivity, the number of connected components is determined, and a
LABEL is assigned to each component.

A matrix containing the label at each respective component position is then formed.
The pixels of the label with the maximum number of pixels are replaced by ones and all
other labelled pixels by zeros; using the resulting matrix as a mask, we can fix the
Lesion part.
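
A minimal sketch of this region selection is given below. It labels connected components with a breadth-first search and keeps the largest one; the use of 8-connectivity and the function name `largest_component` are assumptions, since the thesis does not state the connectivity used.

```python
from collections import deque


def largest_component(binary):
    """Return a mask that keeps only the largest connected component."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]   # label matrix, 0 = unlabelled
    sizes = {}                                   # label -> number of pixels
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not labels[r][c]:
                next_label += 1
                labels[r][c] = next_label
                queue = deque([(r, c)])
                size = 0
                while queue:
                    y, x = queue.popleft()
                    size += 1
                    # Visit the 8 neighbours (assumed connectivity).
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and binary[ny][nx] and not labels[ny][nx]):
                                labels[ny][nx] = next_label
                                queue.append((ny, nx))
                sizes[next_label] = size
    if not sizes:
        return [[0] * cols for _ in range(rows)]
    best = max(sizes, key=sizes.get)
    return [[1 if lab == best else 0 for lab in row] for row in labels]


segmented = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
]
lesion_mask = largest_component(segmented)
```

The returned mask can then be multiplied element-wise with the mammogram to isolate the Lesion part; an all-zero input yields an all-zero mask, which corresponds to the normal-mammogram case.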
Fig.3.17 Segmented Portion and Lesion Part (Selected Region)

Feature extraction and classification are then carried out on the selected Region, or
Lesion. At this point the radiologist needs to decide whether to send it for feature
extraction and classification to find out the type of abnormality, i.e. Benign or
Malignant. The next chapter deals with the remaining part of the proposed CADx algorithm.

CHAPTER-4
FEATURE EXTRACTION & CLASSIFICATION
After the Lesion part has been selected as explained in the last chapter, this chapter
explains the feature extraction and classification algorithms. In this work, five different
schemes (platforms) are given for classification of the Lesion, and their classification
rates are compared both for the known database (training, using 40 sample mammograms) and
the unknown database (validation, using 65 sample mammograms). This chapter introduces
those platforms; the results and comparison are discussed later in the thesis.

[Flowchart: lesion -> shape based feature extraction algorithm -> classification based on discriminant analysis -> benign or malignant]

Fig 4.1 Abstract Flow of Proposed method

Fig 4.1 shows the algorithm followed by all five classification
schemes. In the first platform, shape based feature extraction and classification using DA are
performed on the lesion itself; in the other four platforms they are performed on each of the
four level-1 DWT channels of the lesion. The resulting features are called
feature set (N), where N ranges from 1 to 5.

4.1 Feature Extraction

In this step, the following 13 features are extracted from the given input image containing the ROI.

This set of 13 features is feature set (N), where N is decided by the input image
applied.

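[Block diagram: the lesion image is filtered by h and g, each followed by downsampling by 2; the output channels shown include Lesion_CA and Lesion_CHD]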
Fig 4.2 one stage, 2d- DWT Decomposition with db4 family

Where,

N=1 if input image =Lesion

N=2 if input image =Lesion_CA

N=3 if input image =Lesion_CHD

N=4 if input image =Lesion_CVD

N=5 if input image =Lesion_CDD

Feature set 1: feature extraction on Lesion

Feature set 2: feature extraction on Lesion_CA

Feature set 3: feature extraction on Lesion_CHD

Feature set 4: feature extraction on Lesion_CVD

Feature set 5: feature extraction on Lesion_CDD

Table 4.1: 13 shape based features


Feature extraction:
1. Area (A) = total number of pixels in the ROI
2. Perimeter (P) = total number of pixels on the border of the ROI
3. Max Radius (Rmax) = max(Distance(centroid, border of ROI))
4. Min Radius (Rmin) = min(Distance(centroid, border of ROI))
5. Convex Area (a) = total number of pixels in the 'Convex Image'
6. Euler Number (Eno) = number of objects in the region minus the number of
holes in those objects
7. Eccentricity (Ect) (b) = (distance between the foci) / (major axis length) of the ellipse
8. Elongatedness (En) = Area / (2*Rmax)^2
9. Solidity = Area / Convex Area
10. Circularity1 (C1) (c) = (Area / (pi*Rmax^2))^(1/2)
11. Dispersion (Dp) = Rmax / Area
12. Standard Deviation of Edge (Esd) = standard deviation of the pixel values on the border
13. Shape Index (SI) = Perimeter / (2*Rmax)
a. Convex Image: binary (logical) image of the convex hull, with all pixels within the
hull filled in (i.e., set to on).
b. Eccentricity (Ect): the value lies between 0 and 1. (0 and 1 are degenerate cases: an
ellipse whose eccentricity is 0 is actually a circle, while an ellipse whose eccentricity
is 1 is a line segment.)
c. Circularity1 measures shape, reflecting the element's similarity to a circle,
with a maximum value approximating 1.0 for a circle.
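
Five of the features in Table 4.1 (En, Solidity, C1, Dp and SI) are simple functions of four primary measurements. The Python sketch below shows these formulas; it is illustrative only, and the function and key names are assumptions of the example rather than part of the implementation described here.

```python
import math

def derived_shape_features(area, perimeter, r_max, convex_area):
    """Features 8 to 11 and 13 of Table 4.1, computed from the primary ones."""
    return {
        "en": area / (2 * r_max) ** 2,                    # elongatedness
        "solidity": area / convex_area,
        "c1": math.sqrt(area / (math.pi * r_max ** 2)),   # circularity1
        "dp": r_max / area,                               # dispersion
        "si": perimeter / (2 * r_max),                    # shape index
    }

# Sanity check on an ideal disk of radius 10 (continuous geometry, not pixels):
disk = derived_shape_features(area=math.pi * 10 ** 2,
                              perimeter=2 * math.pi * 10,
                              r_max=10,
                              convex_area=math.pi * 10 ** 2)
```

For the ideal disk, C1 and Solidity both come out as 1, which matches note (c): circularity approaches 1.0 as the region approaches a circle.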

4.2 Feature classification

Feature classification is done using Discriminant Analysis (DA). In this method a
canonical discriminant function (CDF) is determined for the extracted features.

After extracting feature set (N) for all training cases, i.e. mammograms
with known abnormality (the ground truth), the unstandardized
coefficients (N) and the group centroids (N) are determined. Then, using any platform, in other
words any classification scheme (N: 1 to 5), the discriminant score (N) can be determined for an
unknown mammogram. Classification is done with a threshold rule obtained from the
group centroids (N).

4.2.1 Steps for Canonical Discriminant Analysis:

For the n1 benign and n1 malignant cases in the known database (training cases),
extracting P features from each case gives two feature matrices, A1 and A2,

where

Dimension [A1] = (n1 x P), for the n1 benign cases

Dimension [A2] = (n1 x P), for the n1 malignant cases

C = [A1; A2] (A1 stacked on top of A2)

Dimension [C] = (2*n1 x P), for the total group

M = mean of C, column wise (feature wise), (P x 1)

M1 = mean of A1, column wise (feature wise), (P x 1)

M2 = mean of A2, column wise (feature wise), (P x 1)

W = {Covariance(A1) + Covariance(A2)} x (n1 - 1), the within-groups sums of squares and cross-products matrix

T = Covariance(C) x (2*n1 - 1), the total sums of squares and cross-products matrix

B = T - W

Let W = L U, with U = L^T (Cholesky decomposition)

Then form A = L^-1 B U^-1

If X is the eigenvector corresponding to the prominent (largest) eigenvalue of A,

then V = U^-1 X

Unstandardized coefficients D = V x sqrt(2*n1 - 1), (P x 1)

Constant D0 = -1 x (D^T M)

Group 1 centroid (benign group) CBG = D0 + (D^T M1)

Group 2 centroid (malignant group) CMG = D0 + (D^T M2)

Threshold Thd = average of the group centroids

The canonical discriminant function can now be written as

f = D0 + X D

where X is the (1 x P) feature vector of the given mammogram.
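
The sketch below illustrates these steps in plain Python for the two-group case. Note that it swaps in a shortcut: with only two groups, B has rank one, so the eigenvector route above gives a direction proportional to W^-1 (M1 - M2), and the sketch solves that linear system directly instead of performing the Cholesky and eigenvalue steps. The sign is fixed so that the benign centroid is positive, and the scale factor sqrt(2*n1 - 1) is copied from the step above. All function names are assumptions of the example; it is not the MATLAB or SPSS implementation used in this work.

```python
import math

def column_means(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sscp(rows, mean):
    """Sums of squares and cross products about `mean` (P x P)."""
    p = len(mean)
    s = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [x - m for x, m in zip(row, mean)]
        for i in range(p):
            for j in range(p):
                s[i][j] += d[i] * d[j]
    return s

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(m[r][k]))
        m[k], m[pivot] = m[pivot], m[k]
        for r in range(k + 1, n):
            f = m[r][k] / m[k][k]
            for c in range(k, n + 1):
                m[r][c] -= f * m[k][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        x[k] = (m[k][n] - sum(m[k][c] * x[c] for c in range(k + 1, n))) / m[k][k]
    return x

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fit_cdf(a1, a2):
    """a1: benign rows, a2: malignant rows (equal group sizes n1, P features).
    Returns (D, D0, CBG, CMG, Thd)."""
    n1 = len(a1)
    m1, m2, m = column_means(a1), column_means(a2), column_means(a1 + a2)
    w1, w2 = sscp(a1, m1), sscp(a2, m2)
    w = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(w1, w2)]  # within-groups W
    v = solve(w, [x - y for x, y in zip(m1, m2)])          # direction W^-1 (M1 - M2)
    norm = math.sqrt(dot(v, [dot(row, v) for row in w]))   # normalise so v' W v = 1
    d = [x / norm * math.sqrt(2 * n1 - 1) for x in v]      # scale factor as in the text
    d0 = -dot(d, m)
    cbg, cmg = d0 + dot(d, m1), d0 + dot(d, m2)
    return d, d0, cbg, cmg, (cbg + cmg) / 2
```

With equal group sizes the overall mean M lies midway between M1 and M2, so the two centroids are symmetric about zero and the threshold is zero up to rounding.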

4.2.3 Classification based on CDF

Substituting the input feature vector Xinput into the obtained CDF gives finput; this value is
the discriminant score (DS) of the given input feature vector.

The lesion is assigned to the group whose centroid lies on the same side of the threshold as DS:

If DS > Thd

a) and Thd < CBG, it is classified in the benign group

b) and Thd < CMG, it is classified in the malignant group

If DS < Thd

a) and Thd < CBG, it is classified in the malignant group

b) and Thd < CMG, it is classified in the benign group
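
The threshold rule can be sketched in Python as follows. This is an illustration only; the function names are assumptions, and the text does not say how a score exactly equal to the threshold is handled (in this sketch it falls on the "below" side).

```python
def discriminant_score(features, coefficients, constant):
    """f = D0 + X D for one (1 x P) feature vector."""
    return constant + sum(x * d for x, d in zip(features, coefficients))

def classify(score, benign_centroid, malignant_centroid):
    """Assign the group whose centroid lies on the same side of Thd as the score."""
    threshold = (benign_centroid + malignant_centroid) / 2
    benign_above = benign_centroid > threshold
    if (score > threshold) == benign_above:
        return "Benign"
    return "Malignant"
```

For example, with centroids of +1 (benign) and -1 (malignant) the threshold is 0, so `classify(0.7, 1.0, -1.0)` returns "Benign" and `classify(-1.59, 1.0, -1.0)` returns "Malignant".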

[Flowchart: known database (20 benign and 20 malignant cases) -> DWT channel of the lesion, e.g. Lesion_CA or Lesion_CHD -> extraction of 13 shape based features (feature set 2, feature set 3) -> classification based on CDF]


Fig 4.3 Steps to determine CDF for all four feature sets extracted from respective channels
in stage 1 of DWT decomposition of Lesion as shown in Fig. 4.2 ( N is 2:5 )

[Flowchart: training database (20 benign and 20 malignant cases) -> Lesion -> extraction of 13 shape based features (feature set 1) -> classification based on CDF -> unstandardized coefficients (1)]
Fig 4.4 Steps to determine CDF for feature set 1 extracted from Lesion (N=1)

[Flowchart: validation database -> Lesion -> feature set N extracted (N: 1 to 5) -> classification based on CDF using unstandardized coefficients (N) -> discriminant score (N) and threshold (N) -> benign or malignant]

Fig 4.5 Steps involved for classification of given Lesion by determining Discriminant
score (N) - Validation process.

Figures 4.3 and 4.4 describe the algorithm adopted during the training period to
obtain the CDF and threshold. Figure 4.5 shows the methodology applied to classify a
mammogram once the training period is over, i.e. the testing phase. Chapters 3 and 4
together thus describe the proposed CADx system for lesion detection and classification
in mammograms based on adaptive threshold and DA.

The next chapter presents the results and discussion for the implementation of the
proposed CADx system.
CHAPTER-5
RESULTS AND DISCUSSION

As the proposed method was explained and implemented in two parts, this chapter is also
divided into two parts: the first deals with the results of lesion detection and the second with
the results obtained during classification.

The mammogram images used in this experiment were taken from the mini
mammography database of MIAS (http://peipa.essex.ac.uk/ipa/pix/mias/). All images are
held as 8-bit gray level images with 256 different gray levels (0-255), stored in
portable gray map (pgm) format with a size of 1024 x 1024 pixels.
5.1 Lesion detection

For Part I, 100 mammograms were selected from the database: 35 normal, 38 benign and
27 malignant cases.
All mammograms are of the MLO view, so pre-processing was necessary to
remove background portions such as the label, pectoral muscle or rib portion.

[Figure panels: LL1, LH1, HL1, HH1]

Fig 5.1: (MDB063) First stage DWT decomposition of pre-processed image with db2 family

Fig 5.2: Adaptively chosen threshold for 27 malignant cases

Fig 5.3: Adaptively chosen threshold for 38 benign cases

Fig 5.4: Adaptively chosen threshold for 35 normal cases

The global adaptive threshold segmentation described in the previous chapter is applied to the
given pre-processed mammogram to obtain Threshold 1. Figure 5.1 shows the enhanced version
of the first stage DWT decomposition of the given pre-processed mammogram.
Figures 5.2 to 5.4 show the thresholds obtained for the trial mammograms. They indicate that the
variance about the average value increases, and the smallest threshold decreases, as we
move from malignant to benign and from benign to normal cases.

In the next part, three mammograms of type normal, benign and malignant, namely MDB135,
MDB226 and MDB115, are used as examples of lesion detection.

To find Threshold 1, the LL1 image is enhanced in such a way that its minimum pixel value is
zero and its maximum is 255 (Fig. 5.1). To calculate Threshold 2, i.e. to map that threshold
back to the spatial domain of the given mammogram, the result is scaled according to the
maximum pixel value present in the original image:

Threshold 2=Threshold 1*(maximum pixel value in pre-processed mammogram)/255
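
As a small worked example of this scaling, the Python sketch below reproduces the Threshold 2 values reported for the three example cases in Sections 5.1.1 to 5.1.3. The function name is an assumption of the example, and rounding to the nearest integer is inferred from the reported values.

```python
def to_spatial_threshold(threshold_1, max_pixel_value, full_scale=255):
    """Map a threshold found on the 0..255 enhanced LL1 image back to the
    gray-level range of the pre-processed mammogram."""
    return round(threshold_1 * max_pixel_value / full_scale)

mdb135 = to_spatial_threshold(206, 235)  # normal case, maximum pixel value 235
mdb226 = to_spatial_threshold(224, 215)  # benign case, maximum pixel value 215
mdb115 = to_spatial_threshold(231, 225)  # malignant case, maximum pixel value 225
```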

5.1.1 Normal Case: MDB 135 (LMLO view)

Fig 5.5: (MDB135) original mammogram with its skin line


Fig. 5.6 (MDB135) Derivative plot of histogram of LL1 channel of given mammogram, Threshold 1=206

Threshold 2= Threshold 1*235/255= 206*235/255 = 190

Fig. 5.7 (MDB135) Histogram of given mammogram-Threshold 2=190

Fig 5.8 (MDB 135) Region selection

5.1.2 Benign case: MDB 226


Fig 5.9: (MDB226) original mammogram with its skin line

Fig. 5.10 (MDB226) Derivative plot of histogram of LL1 channel of given mammogram, Threshold 1=224

Threshold 2= Threshold 1*215/255= 224*215/255 = 189


Fig. 5.11 (MDB226) Histogram of given mammogram-Threshold 2= 189

Fig 5.12 (MDB 226) Region selection


5.1.3 Malignant case: MDB 115

Fig 5.13: (MDB115) original mammogram with its skin line

Fig. 5.14 (MDB115) Derivative plot of histogram of LL1 channel of given mammogram, Threshold 1=231

Threshold 2= Threshold 1*225/255= 231*225/255 = 204

Fig. 5.15 (MDB115) Histogram of given mammogram-Threshold 2=204

Fig. 5.16(MDB115) Region selection


5.2 Feature extraction and classification

Following the notation of the previous chapter, here P = 13 features and n1 = 20 cases. To
perform DA, i.e. to evaluate the CDF, for all five feature sets, a training dataset of 40
mammograms is used, from which the lesion part is detected first. The detected lesion parts of
these 20 benign and 20 malignant cases are shown in Figures 5.17 and 5.18 respectively.
For validation, classification is based on the discriminant score obtained and the
threshold of the platform (N = 1 to 5) chosen for feature
extraction and classification. The validation database consists of the 38 benign and 27
malignant cases from the MIAS database that were previously used for the lesion detection
algorithm. For one training case from each group, the feature vectors corresponding to
each feature set are listed in Table 5.1 and Table 5.2.

Sections 5.2.1 to 5.2.5 summarize the results obtained in the feature classification
part for the respective platforms (N = 1 to 5). To evaluate feature set N, 13 features
are extracted from the corresponding input image.

As an example, consider Figure 5.20, which shows MDB028, a mammogram of a malignant case:

Lesion is used as input for N=1;

Lesion_ca is used as input for N=2;

Lesion_chd is used as input for N=3;

Lesion_cvd is used as input for N=4;

Lesion_cdd is used as input for N=5;

This gives the feature vectors listed column wise in Table 5.2.

After selecting N, i.e. the platform, the matrices A1 and A2 are formed by the procedure
explained in the previous chapter. Performing the steps described there yields the
unstandardized coefficients and the constant, from which the group centroids and the
canonical discriminant function are obtained.
Fig. 5.17 Lesion part detected from training dataset of 20 Benign mammograms

Fig 5.18 Lesion part detected from training dataset of 20 malignant mammograms

Fig 5.19 Lesion and its one stage level-1 DWT decomposition for mdb002 using db4.

Table 5.1: mdb002 extracted features (Benign group)

Feature      Feature set 1  Feature set 2  Feature set 3  Feature set 4  Feature set 5
Area         5599           2098           2098           2098           2098
Perimeter    719.74         279.14         279.4          279.14         279.14
rmin         13.11          6.19           3.81           1.23           15.08
rmax         105.37         51.37          20.22          2.02           50.44
convexarea   9026           2778           2788           2778           2788
eno          -10            0              0              0              0
ect          0.97           0.96           0.96           0.96           0.96
en           0.13           0.2            1.28           128.5          0.21
solidity     0.62           0.76           0.76           0.76           0.76
c1           0.4            0.5            1.28           12.79          0.51
dp           0.02           0.02           0.01           0              0.02
esd          111.96         159.17         35.07          41.41          26.58
si           3.42           2.72           6.9            69.08          2.77

[Figure panels: Lesion_Ca, Lesion_Chd, Lesion_Cvd, Lesion_Cdd]

Fig 5.20 Lesion and its one stage level-1 DWT decomposition for mdb028 using db4.

Table 5.2: mdb028 extracted features (Malignant group)

Feature Feature set 1 Feature set 2 Feature set 3 Feature set 4 Feature set 5
Area 6153 1893 1893 1983 1893
Perimeter 336.78 171.88 171.8 171.88 171.88
Rmin 34.81 10.28 0.93 0.92 18.31
Rmax 53.97 34.3 3.31 12.27 26.75
Convex area 6590 1988 1988 1988 1988
Eno 1 1 1 1 1
Ect 0.54 0.53 0.53 0.53 0.53
En 0.53 0.4 43.22 3.14 0.66
Solidity 0.93 0.95 0.95 0.95 0.95
C1 0.82 0.72 7.42 2 0.92
Dp 0.01 0.02 0 0.01 0.01
Esd 97 21.2 29.88 31.35 135.24
Si 3.12 2.51 29.57 7.01 3.21
5.2.1 Feature Classification –Feature set 1

Table 5.3: Unstandardized coefficients for Lesion

Feature      Unstandardized coefficient
Area           0.000136053
Perimeter      0.002093029
rmin           0.032719388
rmax          -0.006014262
convexarea    -0.000136239
eno            0.048984735
ect            0.229593731
en             1.948941623
solidity       6.236294906
c1           -18.67941482
dp            50.23437946
esd           -0.074649649
si             0.574652977
constant      10.25357206
Threshold      0

Benign Group Centroid = 1 and Malignant Group Centroid = -1

The discriminant score is calculated using the unstandardized coefficients and the extracted
feature set. For MDB028, feature set 1 (Table 5.2, first column):

DS = 6153*0.000136 + 336.78*0.002093 + 34.81*0.03272 - 53.97*0.006014
- 6590*0.000136 + 1*0.048985 + 0.54*0.2296 + 0.53*1.9489 + 0.93*6.2363
- 0.82*18.6794 + 0.01*50.2344 - 97*0.07465 + 3.12*0.5747 + 10.2536

DS=-1.58861251052099

Since DS < Threshold and the malignant group centroid is also below the threshold,
the given mammogram is classified in the malignant group, which agrees with the
ground truth.
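
The same calculation can be reproduced with a short Python sketch using the coefficients of Table 5.3 and the MDB028 values of Table 5.2. The dictionary keys and variable names are assumptions of the example. Because Table 5.2 prints the features rounded to two decimals, the score obtained this way (roughly -1.5) is close to, but not identical with, the value reported above.

```python
# Unstandardized coefficients and constant copied from Table 5.3 (feature set 1).
COEFFICIENTS = {
    "area": 0.000136053, "perimeter": 0.002093029, "rmin": 0.032719388,
    "rmax": -0.006014262, "convexarea": -0.000136239, "eno": 0.048984735,
    "ect": 0.229593731, "en": 1.948941623, "solidity": 6.236294906,
    "c1": -18.67941482, "dp": 50.23437946, "esd": -0.074649649,
    "si": 0.574652977,
}
CONSTANT = 10.25357206

# Feature set 1 of MDB028, copied from Table 5.2 (values rounded as printed).
MDB028_SET_1 = {
    "area": 6153, "perimeter": 336.78, "rmin": 34.81, "rmax": 53.97,
    "convexarea": 6590, "eno": 1, "ect": 0.54, "en": 0.53, "solidity": 0.93,
    "c1": 0.82, "dp": 0.01, "esd": 97, "si": 3.12,
}

score = CONSTANT + sum(COEFFICIENTS[name] * MDB028_SET_1[name]
                       for name in COEFFICIENTS)
THRESHOLD = 0.0  # Table 5.3; the benign centroid (+1) lies above it
group = "Benign" if score > THRESHOLD else "Malignant"
print(round(score, 2), group)
```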

Fig 5.21: CDF histogram plot for a) Benign & b) Malignant groups using feature set-1

TABLE 5.4: Classification Results (b,c) (N=1)

                                 Predicted Group Membership
                          type   .00      1.00     Total
Original            Count .00    16       4        20
                          1.00   4        16       20
                    %     .00    80.0     20.0     100.0
                          1.00   20.0     80.0     100.0
Cross-validated(a)  Count .00    14       6        20
                          1.00   6        14       20
                    %     .00    70.0     30.0     100.0
                          1.00   30.0     70.0     100.0

a. Cross validation is done only for those cases in the analysis. In cross validation, each case is
classified by the functions derived from all cases other than that case.
b. 80.0% of original grouped cases correctly classified.
c. 70.0% of cross-validated grouped cases correctly classified.

Classification rate (validation rate) = 63.08%
Fig 5.22: Classification result for validation dataset (unknown mammograms) (N=1)

Fig. 5.22 shows the classification result for the validation dataset of 65
mammograms, with 38 benign and 27 malignant cases. The plot shows the distribution of the
following function:

F=R-GT

where R is the result obtained after classification and GT is the ground truth of the abnormality.

For both R and GT, 0 means benign and 1 means malignant.

The middle graph indicates TP+TN (true results, F=0).

The left graph indicates FN (false negative, F=-1), i.e. malignant misinterpreted as benign.

The right graph indicates FP (false positive, F=1), i.e. benign misinterpreted as malignant.

Classification rate = 100*(TP+TN)/ (TP+TN+FP+FN) %
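
The bookkeeping behind this plot can be sketched in Python as follows. The function name and the short example lists are hypothetical and serve only to illustrate the formula; they are not data from the validation set.

```python
def classification_rate(results, ground_truth):
    """Return (rate in %, FP count, FN count) using F = R - GT,
    where 0 means benign and 1 means malignant."""
    f = [r - gt for r, gt in zip(results, ground_truth)]
    correct = f.count(0)          # TP + TN
    false_negative = f.count(-1)  # malignant classified as benign
    false_positive = f.count(1)   # benign classified as malignant
    rate = 100 * correct / len(f)
    return rate, false_positive, false_negative

# Hypothetical five-case example: one FP and one FN.
rate, fp, fn = classification_rate([0, 1, 1, 0, 0], [0, 1, 0, 1, 0])
```

In this notation, 41 correct decisions out of 65 validation cases correspond to a rate of 100*41/65, i.e. about 63.08%.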

5.2.2 Feature Classification –Feature set 2

Table 5.5: Unstandardized coefficients for Lesion_Ca

Unstandardized
coefficients
Area -0.001296512
Perimeter -0.007166919
Rmin 0.05612585
Rmax -0.052294284
convexarea 0.001359857
Eno 0.126160304
Ect -3.019447119
en -2.244389496
solidity 14.81575929
c1 -5.089357262
dp -148.6243504
esd 0.056719493
si 1.327386923
constant -13.20729376
Threshold 0

Benign Group Centroid = 1.08 and Malignant Group Centroid = -1.08

Table 5.6: Classification Results (N=2)

                                 Predicted Group Membership
                          type   .00      1.00     Total
Original            Count .00    18       2        20
                          1.00   3        17       20
                    %     .00    90.0     10.0     100.0
                          1.00   15.0     85.0     100.0
Cross-validated(a)  Count .00    14       6        20
                          1.00   5        15       20
                    %     .00    70.0     30.0     100.0
                          1.00   25.0     75.0     100.0

87.5% of original grouped cases correctly classified.
72.5% of cross-validated grouped cases correctly classified.

Fig 5.23: CDF histogram plot for a) Benign & b) Malignant groups using feature set-2
Classification rate (validation rate) = 33.84%

Fig 5.24: Classification result for validation dataset (unknown mammograms) (N=2)

5.2.3 Feature Classification –Feature set 3

Table 5.7: Unstandardized coefficients for Lesion_Chd

Feature      Unstandardized coefficient
Area           0.000513
Perimeter     -0.00761
rmin           0.075119
rmax          -0.01532
convexarea    -0.00028
eno            0.023116
ect            2.107331
en             0.009295
solidity     -12.1234
c1            -0.82893
dp           143.4587
esd            0.029166
si             0.129811
constant       8.274857
Threshold      0

Benign Group Centroid = 1.8 and Malignant Group Centroid = -1.8

Table 5.8 Classification Results (N=3)

                                 Predicted Group Membership
                          type   .00      1.00     Total
Original            Count .00    19       1        20
                          1.00   0        20       20
                    %     .00    95.0     5.0      100.0
                          1.00   .0       100.0    100.0
Cross-validated(a)  Count .00    16       4        20
                          1.00   2        18       20
                    %     .00    80.0     20.0     100.0
                          1.00   10.0     90.0     100.0

97.5% of original grouped cases correctly classified.
85.0% of cross-validated grouped cases correctly classified.

Fig 5.25: CDF histogram plot for a) Benign & b) Malignant groups using feature set-3
Classification rate (validation rate) = 67.69%

Fig 5.26: Classification result for validation dataset (unknown mammograms) (N=3)

5.2.4 Feature Classification –Feature set 4

Table 5.9: Unstandardized coefficients for Lesion_Cvd

Feature      Unstandardized coefficient
Area          -0.00019
Perimeter     -0.01045
rmin           0.094519
rmax          -0.02209
convexarea     0.000349
eno           -0.06987
ect            4.259276
en             0.007625
solidity      -3.66475
c1            -0.79561
dp           -85.9958
esd           -0.02715
si             0.14592
constant       3.963368
Threshold      0

Benign Group Centroid = 0.96 and Malignant Group Centroid = -0.96

Table 5.10 Classification Results (N=4)

                                 Predicted Group Membership
                          type   .00      1.00     Total
Original            Count .00    19       1        20
                          1.00   6        14       20
                    %     .00    95.0     5.0      100.0
                          1.00   30.0     70.0     100.0
Cross-validated     Count .00    12       8        20
                          1.00   8        12       20
                    %     .00    60.0     40.0     100.0
                          1.00   40.0     60.0     100.0

82.5% of original grouped cases correctly classified.
60.0% of cross-validated grouped cases correctly classified.

Fig 5.27: CDF histogram plot for a) Benign & b) Malignant groups using feature set-4
Classification rate (validation rate) = 69.23%

Fig 5.28: Classification result for validation dataset (unknown mammograms) (N=4)

5.2.5 Feature Classification –Feature set 5


Table 5.11: Unstandardized coefficients for Lesion_Cdd

Feature      Unstandardized coefficient
Area          -0.000324559
Perimeter     -0.017567012
rmin           0.21035515
rmax          -0.098717383
convexarea     0.000721538
eno           -0.674295328
ect            4.619567598
en            -0.008232511
solidity     -18.20872834
c1             0.649064184
dp            64.58390762
esd           -0.065053371
si            -0.087046045
constant      16.34716167
Threshold     -1.78E-15

Benign Group Centroid = 1.44 and Malignant Group Centroid = -1.44

Table 5.12 Classification Results (N=5)

                                 Predicted Group Membership
                          type   .00      1.00     Total
Original            Count .00    19       1        20
                          1.00   2        18       20
                    %     .00    95.0     5.0      100.0
                          1.00   10.0     90.0     100.0
Cross-validated(a)  Count .00    17       3        20
                          1.00   5        15       20
                    %     .00    85.0     15.0     100.0
                          1.00   25.0     75.0     100.0

92.5% of original grouped cases correctly classified.
80.0% of cross-validated grouped cases correctly classified.

Fig 5.29: CDF histogram plot for a) Benign & b) Malignant groups using feature set-5
Classification rate (validation rate) = 75.38%

Fig 5.30: Classification result for validation dataset (unknown mammograms) (N=5)

Table 5.13: summaries of all results for classification

Feature set   Original grouped          Cross-validated grouped case   Validation
              classification rate (%)   classification rate (%)        classification rate (%)
1             80                        70                             63.08
2             87.5                      72.5                           33.84
3             97.5                      85                             67.69
4             82.5                      60                             69.23
5             92.5                      80                             75.38
CHAPTER 6

CADx USER INTERFACE

After the training part is complete, the CDFs and their respective thresholds can be used to
classify the lesion part of a given mammogram as either benign or malignant.

The CADx system allows the radiologist to select any one of the classification schemes.

A final decision can be reached by giving different weightage to the different schemes.

All front end and back end algorithms are implemented in the MATLAB environment.
Figure 6.1 shows the GUI of the developed CADx system.

MATLAB ver. 7 is used, together with its image processing, wavelet and signal processing
toolboxes.

In addition, the SPSS package ver. 14 is used for discriminant analysis; its results
match those of the MATLAB implementation, but the SPSS software was found to be faster
and more user friendly.
[GUI screenshot: controls "Select mammogram Lesion" and "Select classification Scheme"; output "Result: Type of abnormality"]
Fig. 6.1 GUI interface for CADx (Trained).

CHAPTER-7
CONCLUSION AND FUTURE WORK

The proposed CADx system is implemented with the help of MATLAB toolboxes. Figure 7.1
compares the classification rates obtained by using DA on the respective feature sets.
Feature sets 3 and 5, i.e. the 13 shape based features extracted
from the horizontal and diagonal detail channels obtained after one stage DWT
decomposition of the lesion using the db4 wavelet family, give better
results than feature extraction on the lesion itself (feature set 1).
[Bar chart: original grouped classification rate, cross-validated grouped case classification rate and validation classification rate for feature sets 1 to 5; vertical axis from 0 to 120]

Fig. 7.1 Comparison chart for classification of Lesion using DA on Feature set 1 to 5

As further work, different mother wavelets or wavelet families can be explored
to improve the classification rate. Also, instead of a statistical method of
classification, an ANN model could be used.

REFERENCES

1. William R. Klecka (1980); Discriminant Analysis, Sage University Paper.

2. SPSS ver. 14 manual on algorithms, titled "Discriminant".

3. Ingrid Daubechies (1992); Ten Lectures on Wavelets, Society for Industrial and
Applied Mathematics.

4. Olivier Rioul (1993); A Discrete-Time Multiresolution Theory, IEEE Trans. on Signal
Processing, vol. 41, no. 8, pp. 2591-2605.

5. Xiao-Ping Zhang and Mita D. Desai (2001); Segmentation of Bright Targets Using
Wavelets and Adaptive Thresholding, IEEE Transactions on Image Processing, vol.
10, no. 7, July 2001.

6. H.D. Cheng, Xiaopeng Cai, Xiaowei Chen, Liming Hu, Xueling Lou (2003);
Computer-aided detection and classification of Micro calcifications in mammograms:
a survey, Pattern Recognition 36, pp. 2967-2991.

7. Gonzalez R. C., Woods R. E., Eddins S. L. (2004); Digital Image Processing Using
MATLAB.

8. Alfonso Rojas Dominguez and Asoke K. Nandi (2008); Detection of masses in
mammograms via statistically based enhancement, multilevel-thresholding
segmentation, and region selection, Computerized Medical Imaging and Graphics 32,
pp. 304-315.

9. Jelena Bozek, Mario Mustra, Kresimir Delac and Mislav Grgic (2009); A Survey of
Image Processing Algorithms in Digital Mammography, Recent Advan. in Mult. Sig.
Process. and Communication, SCI 231, pp. 631-657.

10. B. Surendiran et al. (2009); Classifying Digital Mammogram Masses using Univariate
ANOVA Discriminant Analysis, Int. Conf. on Advances in Recent Technologies in
Communication and Computing.

11. Kai Hu et.al (2010); Detection of Suspicious Lesions by Adaptive Thresholding Based
on Multiresolution Analysis in Mammograms, IEEE Trans. On Instrument and