
Advanced statistical techniques applied to comprehensive FTIR spectra on human colonic tissues


A. Zwielly and S. Mordechai
Department of Physics and the Cancer Research Center, Ben-Gurion University (BGU), Beer-Sheva 84105,
Israel
I. Sinielnikov
Department of Pathology, Soroka University Medical Center (SUMC), Beer-Sheva 84105, Israel
A. Salman
Department of Physics, Sami Shamoon College (SCE), Beer-Sheva 84100, Israel
E. Bogomolny
Department of Physics and the Cancer Research Center, Ben-Gurion University (BGU), Beer-Sheva 84105,
Israel
S. Argov
Department of Pathology, Soroka University Medical Center (SUMC), Beer-Sheva 84105, Israel
Received 16 June 2009; revised 27 December 2009; accepted for publication 30 December 2009;
published 9 February 2010
Purpose: Colon cancer is a major public health problem due to its high disease rate and death toll
worldwide. The use of FTIR microscopy in the field of cancer diagnosis has become attractive over
the past 20 years. In the present study, the authors investigated the potential of FTIR microscopy to
define spectral changes among normal, polyp, and cancer human colonic biopsied tissues.
Methods: A large database of FTIR microscopic spectra was compiled from 230 human colonic
biopsies. The database was divided into five subgroups: Normal, cancerous tissues, and three stages
of benign colonic polyps, namely, mild, moderate, and severe polyps, which are precursors of
carcinoma. All biopsied tissue sections were classified concurrently by an expert pathologist. The
authors applied principal component analysis (PCA) to reduce the dimension of the
original data to 13 principal components.
Results: While PCA shows only partial success in distinguishing among cancer, polyp, and
normal tissues, multivariate analysis (e.g., LDA) shows a promising distinction even within the
polyp subgroups.
Conclusions: Good classification accuracy among normal, polyp, and cancer groups was achieved
with a success rate of approximately 85%. These results strongly support the potential of developing
FTIR microscopy as a simple, reagent-free tool for early detection of colon cancer and, in
particular, for discriminating among the benign premalignant colonic polyps having increasing
degrees of dysplasia severity (mild, moderate, and severe). © 2010 American Association of
Physicists in Medicine. [DOI: 10.1118/1.3298013]

Key words: FTIR microscopy, polyps, colon cancer, PCA, LDA

I. INTRODUCTION

Colon cancer is a major public health problem due to its widespread occurrence and death toll
worldwide. According to the estimation of the National Cancer Institute, approximately 108 070
colon and 40 740 rectal cases were reported in 2008, of which 49 960 caused mortality. Of the
estimated 5.2 × 10^6 mortalities from cancer per year throughout the world, 655 000 cases are
caused by colorectal malignancies. It is the second leading cause of cancer-related death in the
Western world.1 Despite the improvement in diagnostic techniques, more than 90% of colon cancer
cases have either advanced or metastasized by the time they are diagnosed. Hence, there is an
urgent need to develop novel digital diagnostic methods to detect the malignancy in the earliest
stage possible. Many colorectal cancers are thought to arise from adenomatous polyps in the colon.
These mushroomlike growths are usually benign, but some may develop into cancer over time. The
ability to classify these polyps in time could provide warning for their development into cancer.
Even within diagnosed cancer cases, the ability to classify between early and severe cases is
highly important and could influence the treatment strategy.

The use of FTIR microscopy in the field of cancer diagnosis has shown encouraging trends over the
past 20 years.2 The wavelength of infrared radiation which is absorbed depends on the nature of
the covalent bond (i.e., atoms involved and the type of bond) and the strength of any
intermolecular interactions (van der Waals interactions and H bonding).3 Various biomolecular
components of the cell give a characteristic IR spectrum, from which structural and functional
aspects4 of the cell can be inferred. The differences in the absorbance spectra in the mid-IR
region between normal

1047 Med. Phys. 37 3, March 2010 0094-2405/2010/373/1047/9/$30.00 2010 Am. Assoc. Phys. Med. 1047
1048 Zwielly et al.: Statistical techniques applied to FTIR colonic spectra 1048

and abnormal tissues have been shown to be possible criteria for detection and characterization
of various types of cancer such as breast,5 leukemia,6 cervical,7 skin,8 brain,9 prostate,10
and also neck and head tumors.11
After approximately 20 years of using IR and FTIR spectroscopy for diagnostic purposes, this field
of research is now challenged with new frontiers. In the past few years hardware innovations have
accelerated. These include mobile facilities such as portable attenuated total reflectance devices
as well as optical fiber sensors adapted to the same basic principles as the FTIR spectrometer.12,13
These improvements call for new statistical and mathematical algorithms adequate to the potential
of modern instrumentation. Well-established as well as new analyses are constantly being improved
and adapted. Developing system approaches that incorporate the different stages of the spectral
analysis is essential for quick and reliable automatic classification between the various groups.
One promising technique is the artificial neural network, which has previously been applied
successfully to colon cancer.14,15 Advanced statistical analysis also shows good results in
melanoma studies, where malignant neoplasms of epidermal melanocytes were successfully
differentiated from nevus based on the Gaussian distribution of several unique spectral parameters
(biomarkers) utilized to classify the two cases.16

Colon cancer, when detected in the early stages, is one of the most curable cancers. Treatment is
mainly surgical, in which the cancerous section of the bowel is removed. Surgery is followed by
chemotherapy and radiotherapy.17

As in the case of melanoma, where full recovery can be achieved if the tumor is removed before
metastases evolve, colonic polyps are an indicator for early dysplastic stages.

Thus, grading of premalignant colonic polyps, digitally and systematically, could lead to economic
and practical relief for many patients as well as medical staff. Currently, ascribing a grade to
premalignant polyps (i.e., mild, moderate, and severe) is still controversial even for expert
pathologists.

II. MATERIALS AND METHODS

II.A. Malignant tissues characteristics

Figure 1 presents typical histological images of the five groups of human formalin-fixed tissues
included in the present study: (a) Normal colon histology showing flat mucosal surface and
abundant vertically oriented crypts lined by columnar epithelium. The mucosal crypts are lined up
in parallel. (b) and (c) Mild and moderate-low grade polyps, respectively. Low grade dysplastic
crypts are characterized by partial loss of cell polarity and reduced goblet cells. Dysplastic
nuclei are stratified and polarized to the lower half of the epithelial layer. (d) Severe high
grade polyp. High grade dysplasia is characterized by marked nuclear stratification and high
nuclear/cytoplasmic ratio. Dysplastic nuclei are stratified through the epithelial layer up to the
luminal surface. (e) Carcinoma: adenocarcinoma of colon showing malignant glands, with variability
in the size and configuration of the glands. The epithelial cells are large with high
nuclear/cytoplasmic ratio. The nuclei of the malignant cells stratify through the epithelial layer
up to the luminal surface and show a number of mitotic figures and individual necrotic cells.
Glands are embedded in desmoplastic stroma. In summary, Fig. 1 displays the gradual changes in the
tissue morphology encountered during the multistep process in which normal colonic tissues
progress toward malignancy.

FIG. 1. Histological images of formalin-fixed human colonic tissues stained with
hematoxylin-eosin. (a) Normal, (b) mild, (c) moderate, and (d) severe benign polyps, (e)
carcinoma. Bars represent a length of 1500 μm.

II.B. Sample preparation

The method described by Argov et al.18 was followed for sample preparation. Formalin-fixed,
paraffin-embedded tissues from adenocarcinoma patients were retrieved from the histopathology
files of Soroka University Medical Center, Beer-Sheva, Israel. The tissue samples used in this
study were selected with the consent of the patients and under the institution's Helsinki
committee approval to include normal sites, three grades of benign polyps (mild, moderate, and
severe), and malignant sites. Two consecutive paraffin sections were cut from each biopsy; one was
placed on a zinc selenide slide and the other on a glass slide. This procedure was performed
carefully to assure that the two tissue sections were identical. Thickness of all tissue samples
was 10 μm. The first slide was deparaffinized using xylol and alcohol and was used for FTIR
measurements. The second slide was stained with hematoxylin and eosin for histology review by an
expert pathologist.

The measurement sites were chosen carefully by an expert pathologist to include the proper
epithelial cells on the tissue cross section. For example, when we measured normal tissues, we
chose areas with normal cells only and minimized extracellular contents such as mucin. The same
procedure was followed with the abnormal tissues.

Our database is composed of 78 patients. From the biopsies, we extracted the following: 103 normal
tissues, 29 mild polyps, 41 moderate polyps, 31 severe polyps, and 26 carcinoma tissue sections.
The total number of individual microscopic spectra analyzed was approximately 800.
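As a quick arithmetic check, the five group sizes listed above add up to the 230 biopsies quoted in the abstract. The short Python sketch below tallies them and prints each group's share of the database. The dictionary keys and variable names are illustrative only; the shares are shown because group-size proportions are the kind of quantity a classifier can use as class priors, and nothing here reproduces the authors' own code.

```python
# Group sizes as listed in the sample-preparation paragraph.
group_sizes = {
    "normal": 103,
    "mild polyp": 29,
    "moderate polyp": 41,
    "severe polyp": 31,
    "carcinoma": 26,
}

# Total number of biopsied tissue sections in the database.
total = sum(group_sizes.values())

# Share of each group; usable as class priors if priors are taken
# proportional to group size (an assumption of this sketch).
proportions = {name: n / total for name, n in group_sizes.items()}

for name, share in proportions.items():
    print(f"{name:15s} {group_sizes[name]:4d} {share:6.1%}")
print(f"{'total':15s} {total:4d}")
```

The imbalance is visible at a glance: the normal group is nearly four times larger than the carcinoma group, which matters for any classifier whose priors follow the group sizes.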

Medical Physics, Vol. 37, No. 3, March 2010



FIG. 2. Schematic diagram of the spectral analysis strategy based on MATLAB utilized in the
present work. The preprocessing block starts with the raw spectrum. The spectrum was cut into
three regions, rubber band baseline corrected, normalized, and constant baseline corrected. PCA
was applied on the preprocessed spectra followed by LDA as represented by the right branch. The
left branch represents the DCF analysis constructed of six selective biomarkers chosen based on
the spectra t-test results.

II.C. Fourier transform infrared microscopy measurements

Microscopic FTIR measurements in transmission mode were performed using the IRscope II FTIR
microscope with a sensitive liquid nitrogen-cooled mercury cadmium telluride detector, coupled to
the FTIR spectrometer (Bruker Equinox model 55, OPUS Software). The measured sites were circular
with a diameter of 100 μm. For each biopsy, at least three measurements at different locations
were acquired and the average spectra were analyzed. The spectrum at each site was the average of
128 coadded scans to increase the signal to noise ratio.

II.D. Spectral analysis

Figure 2 summarizes the procedure used to process the measured data. All analysis was done by our
in-house codes developed using MATLAB software.19 Preprocessing of the data includes division of
the spectrum into three regions, baseline correction using the rubber band algorithm, vector
normalization, and a second baseline correction handling constant shifts. Reduction in the data
was done by principal component analysis (PCA).20 IR spectra of eukaryotic cells are defined by
roughly 500 variables (wavenumbers). To reduce this number, PCA was performed. Basically, PCA is a
mathematical algorithm that reduces the dimension of the problem: instead of using many variables,
the variability in the data is described by only a few PCs. The reduction is achieved by finding
the correlation between the variables. A covariance matrix is calculated, and the eigenstates
(PCs) and the eigenvalues (proportional to the variability included in each PC) are extracted. The
first linear combination is called the first principal component (PC1) and contains, in our case,
approximately 55% of the variance. The second principal component (PC2) is a linear combination of
wavenumbers that accounts for most of the residual variance and is perpendicular to the first one.
The subsequent principal components obey the same rules. This method allows the reduction in our
spectra to 13 variables (the first 13 principal components) that account for almost 100% of the
variance.21 Following the PCA, a linear discriminant analysis (LDA) was performed.22-24 LDA is a
classification technique that employs Mahalanobis distance to determine the class of an unknown
sample. In order to classify between the different groups a classification criterion is
determined,

$$ f_i = \mu_i C^{-1} x_k^{T} - \tfrac{1}{2}\,\mu_i C^{-1} \mu_i^{T} + \ln(p_i), \qquad (1) $$

where $\mu_i$ is the data mean of the designated class $i$ and $C$ is the covariance matrix. Each
element in $C$ is given by

$$ C^{n \times n} = (C_{i,j}), \quad C_{i,j} = \mathrm{cov}(\mathrm{dim}_i, \mathrm{dim}_j). \qquad (2) $$

$p_i$ is the prior probability of a measurement belonging to group $i$. In our case, we assumed
that $p_i$ is proportional to the number of samples in each group. The second term
$\mu_i C^{-1} \mu_i^{T}$ is the Mahalanobis distance, which is a measure of dissimilarity between
several groups. The class to which a measurement belongs was determined by its largest $f_i$
value.

Training and test sets were selected randomly from the database. 50% of each set was employed for
training and the remainder for the test. In addition, the validation experiment was repeated 100
times, with the same input features but with different randomly selected training and test sets,
and the results were averaged.

II.E. Discriminant classification function (DCF) and the t-test

DCF is a statistical tool that improves classification between gradually evolving subgroups by
using several spectral variants simultaneously. DCF generates a classification score for each
group that is a linear combination of a previously derived array of biomarkers with weight
coefficients given by the following equation:


$$ S = c + w_1 x_1 + w_2 x_2 + \dots + w_i x_i + \dots, \qquad (3) $$

where $S$ denotes the resultant classification score, $c$ is a constant, and $w_i$ is the weight
coefficient given by

$$ w_i = \alpha\,\frac{t\_value_i}{x_i}, \qquad (4) $$

where $x_i$ is the biomarker value, $\alpha$ is a constant, and $t\_value_i$ was defined as the
paired t-value of each biomarker among normal and abnormal tissues.

The constants $c$ and $\alpha$ were chosen in such a way as to nullify the average classification
score of the normal group and give a score of 100 for the cancerous human colonic tissues.

The t-test values were considered significant at P < 0.05.

III. RESULTS

III.A. FTIR microscopy spectra of tissues

Figure 3 shows the average spectra in the region 2800-3000 cm^-1. This wavenumber region was cut
from the entire spectrum, normalized, and baseline corrected. The results (Fig. 3) exhibit four
prominent absorbance bands: near 2848 cm^-1, due to the symmetric stretching of the methylene
chains in membrane lipids; at 2872 cm^-1, arising from the symmetric CH3 (methyl) stretching; at
2918 cm^-1, due to the antisymmetric CH2 stretch; and at 2958 cm^-1, due to antisymmetric
stretching of the methyl groups of both lipids and proteins. The average absorption intensities of
the different tissues are distinctive at the 2848 and 2958 cm^-1 bands. The average values of
these bands indicate a gradual intensity change, where the normal group has the lowest intensity
and the cancer has the highest intensity in the 2958 cm^-1 band and vice versa for the 2848 cm^-1
band. Thus the best discriminating values were obtained by deriving the intensity ratio of these
two vibrational modes (i.e., A2848/A2958 or νs CH2/νas CH3). This dimensionless ratio eliminates a
possible artifact, which may arise due to the baseline contribution underneath each band. Table I
summarizes the average values of the prominent bands and their standard errors, as well as the
t-test values between the five tissue stages. The t-value for this ratio between the two extreme
groups of normal and carcinoma is more than 24. Therefore, this ratio may be considered a
satisfactory biomarker for the classification between these two extreme cases. The t-test values
in Table I reveal that this ratio is also significant for the polyp groups.

FIG. 3. FTIR spectra at the 2800-3000 cm^-1 region.

The variation in the phosphate level, measured by integrating the absorbance of the symmetric
1000-1150 cm^-1 band for the different cases, is presented in Fig. 4. On average, the phosphate
levels for polyp and malignant tissue samples were lower than for the normal group. However, the
average absorbances of polyp and malignant samples were almost equal. The asymmetric
1170-1310 cm^-1 band shows the same trend (not shown in the figure). Two distinctive regions are
shown in this figure: the 1200-1800 cm^-1 spectral region and the 1000-1200 cm^-1 spectral region.
Both regions, separately, were vector normalized. The region 1500-1800 cm^-1 is dominated almost
solely by the conformation-sensitive amide I and amide II bands, which are the most dominant bands
in the spectra of nearly all complex biological systems.25 The intensity differences between
normal, polyp, and cancerous tissues in the amide II band were not significant for all cases.
Amide I is among the bands which slightly shift between the various groups. In particular, the
normal group was lower and wider with respect to the other groups. Since amide I arises from the
C=O hydrogen bonded stretching vibrations, these changes may arise due to biochemical alterations
(conformation and composition) in protein and/or nucleic acids, respectively.

Another important biomarker can be obtained from the shoulder near 1740 cm^-1, resulting primarily
from C=O stretching vibrations of the ester functional groups in phospholipids.26 The lipids in
the membrane are composed mainly of phospholipids that determine membrane structure, stability,
fluidity, and membrane enzymatic activity. Figure 4 shows gradual intensity changes in the
1740 cm^-1 band with an irregularity for the cancer, where its value is above the moderate and
severe polyps. Significantly higher intensity is noticed for the normal group, as can be seen in
Table I. The weaker amino acid side-chain bands of peptides and proteins at 1456 and 1401 cm^-1
are associated with the asymmetric and symmetric CH3 bending vibrations.27 The absorption peak at
1243 cm^-1 is due to the PO2^- ionized asymmetric stretching.28 The absorption due to normal
tissue was larger than for polyps and cancerous types in this entire region for the averaged
spectra. In the case of the 1401 cm^-1 band, a significant shift can be noticed for the normal
tissue.

The 1000-1140 cm^-1 region in Fig. 4 contains many overlapping vibrational modes associated with
absorbance of macromolecules such as proteins, nucleic acids, carbohydrates, and phospholipids.
Substantial differences appeared between the normal tissue spectra, the polyps, and carcinoma,
while only mild differences are apparent in the transition between polyps and carcinoma. Changes
in this spectral range between the five groups exist almost in all


TABLE I. Average and STD of selected biomarkers are represented in the top section. The bottom
section contains t-test values for all six biomarkers and tissue combination pairs. Each cell
contains six values corresponding to the six biomarkers in the upper part. The nonsignificant
values are marked as NS.

Average values and std deviations

Biomarker            Normal       Mild polyp   Moderate polyp   Severe polyp   Cancer
A2848/A2958          2.46±0.38    1.87±0.39    1.60±0.26        1.20±0.21      0.73±0.12
1740 cm^-1 (x100)    1.22±0.32    0.64±0.17    0.47±0.14        0.43±0.13      0.50±0.17
A1083/A1056          0.84±0.04    0.99±0.05    1.06±0.07        1.13±0.06      1.07±0.07
1025 cm^-1           0.20±0.01    0.16±0.02    0.13±0.01        0.12±0.01      0.12±0.02
1045 cm^-1           0.34±0.01    0.28±0.01    0.25±0.01        0.25±0.01      0.25±0.02
A1121/A1015          1.88±0.31    2.48±0.52    3.17±0.65        3.55±1.04      3.48±1.38

The t-values for the above selected six biomarkers

Group            Biomarker      Normal   Mild polyp   Moderate polyp   Severe polyp
Mild polyp       A2848/A2958    5.9
                 1740 cm^-1     7.6
                 A1083/A1056    12.7
                 1025 cm^-1     12.8
                 1045 cm^-1     13.8
                 A1121/A1015    6.0
Moderate polyp   A2848/A2958    12.7     3.1
                 1740 cm^-1     14.2     4.0
                 A1083/A1056    19.5     4.3
                 1025 cm^-1     27.1     7.2
                 1045 cm^-1     30.1     8.6
                 A1121/A1015    12.9     4.1
Severe polyp     A2848/A2958    12.1     5.8          5.2
                 1740 cm^-1     8.9      3.8          0.9 (NS)
                 A1083/A1056    19.7     6.9          3.2
                 1025 cm^-1     21.0     7.1          2.1 (NS)
                 1045 cm^-1     20.8     7.0          0.9 (NS)
                 A1121/A1015    10.1     3.9          1.6 (NS)
Cancer           A2848/A2958    24.9     15.2         17.1             9.3
                 1740 cm^-1     11.5     2.8          0.8 (NS)         1.4 (NS)
                 A1083/A1056    18.4     4.5          0.8 (NS)         2.3 (NS)
                 1025 cm^-1     22.5     6.6          1.7 (NS)         0.4 (NS)
                 1045 cm^-1     22.8     6.6          1.2 (NS)         0.2 (NS)
                 A1121/A1015    8.1      3.1          1.3 (NS)         0.2 (NS)
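The calibration of the DCF score described in Sec. II.E (average normal score of 0, average cancer score of 100) can be illustrated with the group averages and normal-versus-cancer t-values from Table I. The Python sketch below is a hedged illustration, not the authors' MATLAB code: the function names are hypothetical, only four of the six biomarkers are used, and the unscaled weight of each biomarker is assumed to be its t-value divided by the normal-group average, which is only one possible reading of the weight definition. How the sign of each weight is handled is not specified in the text, so the t-values are used as printed.

```python
import numpy as np

def calibrate_dcf(unscaled_weights, mean_normal, mean_cancer):
    """Return (c, weights) so that S(mean_normal) = 0 and S(mean_cancer) = 100.

    The score is S = c + sum_i w_i * x_i with w_i = alpha * u_i, where u_i are
    the unscaled weights supplied by the caller.
    """
    u = np.asarray(unscaled_weights, dtype=float)
    xn = np.asarray(mean_normal, dtype=float)
    xc = np.asarray(mean_cancer, dtype=float)
    # Two linear conditions fix the two constants alpha and c.
    alpha = 100.0 / np.dot(u, xc - xn)
    c = -alpha * np.dot(u, xn)
    return c, alpha * u

def dcf_score(biomarkers, c, weights):
    """Classification score S = c + sum_i w_i * x_i for one biomarker array."""
    return c + np.dot(np.asarray(biomarkers, dtype=float), weights)

# Averages from Table I for A2848/A2958, 1740 cm^-1 (x100), A1083/A1056, A1121/A1015.
mean_normal = np.array([2.46, 1.22, 0.84, 1.88])
mean_cancer = np.array([0.73, 0.50, 1.07, 3.48])
# Normal-versus-cancer t-values from Table I for the same four biomarkers.
t_values = np.array([24.9, 11.5, 18.4, 8.1])

# ASSUMPTION of this sketch: unscaled weight = t-value / normal-group average.
unscaled = t_values / mean_normal
c, weights = calibrate_dcf(unscaled, mean_normal, mean_cancer)

score_normal = dcf_score(mean_normal, c, weights)
score_cancer = dcf_score(mean_cancer, c, weights)
print(f"normal: {score_normal:.1f}, cancer: {score_cancer:.1f}")
```

Because the two calibration conditions are linear in the constants, they can be solved in closed form; scores for the polyp groups then follow from `dcf_score` applied to their own biomarker arrays.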

wavenumbers. The bands at 1083 and 1056 cm^-1 correspond to absorbance of the νs PO2^- of
phosphodiesters of nucleic acids28 and the O-H stretching coupled with C-O bending of C-OH groups
of carbohydrates, respectively.29 These two biomarkers show the same absorbance intensity but in
reverse order, hence the A1083/A1056 ratio was considered significant (Table I). In IR spectra,
the bands at 1025 and 1045 cm^-1 correspond to the vibrational modes of -CH2OH groups and the C-O
stretching coupled with C-O bending of the C-OH groups of carbohydrates (includes glucose,
glycogen, etc.). Higher intensity is noticed for the normal and mild polyp compared to moderate,
severe polyps, and the cancer groups (Table I).

Previous works have shown that the band at 1121 cm^-1 arises from RNA absorbance, whereas the
1015 cm^-1 shoulder is due to DNA.30,31 It was found that the best discriminating values were
obtained by deriving the intensity ratio of these two vibrational modes (i.e., A1121/A1015), as
can be seen in Table I.

FIG. 4. Important biomarkers are marked in the FTIR spectra at the 1000-1800 cm^-1 fingerprint
region. While good classification for normal tissues is apparent, only small changes are noticed
in this region among the other groups. The shaded region represents the asymmetric phosphate
biomarker.

III.B. Grading the samples using DCF

Although the normal, benign polyps, and malignant tissues constitute three separate main groups,
an interesting analysis would be to examine a possible digital grading of the tissues based on a
chosen set of biomarkers. Based on these biomarkers, an acuteness ladder could be formed and the
groups can be classified. Each case was characterized using an array of biomarkers, which were
arranged as follows:

$$ \begin{pmatrix} A_{2848}/A_{2958} \\ A_{1740} \\ A_{1083}/A_{1056} \\ A_{1121}/A_{1015} \end{pmatrix}. \qquad (5) $$

To further examine the gradual spectral changes encountered in the above tissue samples, we
utilized the discriminant classification function. This statistical tool improves discrimination
among normal, polyp stages, and malignancy by providing an adequate quantitative follow-up of
transformations versus group type. DCF generates a classification score for each group or
premalignant stage using a linear combination of a previously derived array of biomarkers with
weight coefficients.23 Figure 5 shows the scores of each group based on Eq. (3). It can be noted
that the mild and moderate polyps have similar scores, which means that only small detectable
spectral changes occur between mild and moderate polyps, as would be expected. Generally, the
score values of the tissues starting with the normal group gradually approach the spectral values
of the malignant group, as shown in Fig. 5. It is also noticed from Fig. 5 that the diversity
among polyps was larger than for the malignant and the controls. This is mainly notable between
the mild and moderate polyps, where they appear to overlap.

FIG. 5. DCF of normal, polyps (mild, moderate, and severe) and cancer tissues. Each class is
represented by an array of average values of four biomarkers.

III.C. LDA classification

PCA is a mathematical algorithm that reduces the large dimension of the measured spectrum, i.e.,
instead of using many variables, the variability in the data is described by only a few PCs. The
reduction is achieved by finding the correlation between the variables. Figure 6 shows the scores
of PC1 versus PC2 for all the measurements. It can be seen that all normal data points are
completely separated from carcinoma, while some overlap appears between the polyps and the
carcinoma and between the polyps and normal tissue. PC1 contributes almost solely to the
separation between normal and carcinoma groups, while PC2 contributes mainly to the separation
between the polyps and the other two groups. This partial separation obtained by PCA is not
satisfactory and further procedures should be carried out in order to distinguish between all five
groups. Thus, LDA was applied to discern between the five groups. LDA is a statistical
multivariate supervised method. It searches for the variables containing the largest interclass
and the smallest intraclass variances, and constructs a linear combination of the variables to
discriminate between classes. The rule is to construct a training set of samples, which is further
tested using the test set. The large number of valid variables in the infrared spectra is an
obstacle for this approach, which needs more observations than variables. Using PCA (an
unsupervised method) prior to the LDA analysis was helpful in reaching this goal of variable
reduction. The results from the LDA iteration are summarized in Table II, which shows the
percentage of success of each data set within all the possible


FIG. 6. PCA model employed on the database reducing the 512 valid measured variables in the spectrum to 13 PCs, which describe 98.4% of the data
variance. Full separation was achieved between the normal group (plus symbols) and the malignant group (squares), while partial overlaps exist between the
three polyp groups (circles). The solid black circles represent the corresponding groups' centroids.
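The PCA-then-LDA procedure with repeated random 50/50 splits can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' in-house MATLAB code: the spectra are synthetic, all function names are hypothetical, vector normalization is taken to mean scaling each spectrum to unit Euclidean norm, the covariance is pooled over classes, and PCA is fitted on the training half only (the text does not say whether PCA was fitted on the full database). The discriminant follows the linear form $f_i = \mu_i C^{-1} x^{T} - \tfrac{1}{2}\mu_i C^{-1}\mu_i^{T} + \ln p_i$ with priors proportional to group size.

```python
import numpy as np

def vector_normalize(spectra):
    """Scale each spectrum (row) to unit Euclidean norm."""
    return spectra / np.linalg.norm(spectra, axis=1, keepdims=True)

def pca_fit(X, n_components):
    """Return the column means and the leading principal axes of X."""
    mean = X.mean(axis=0)
    # Rows of vt are principal axes ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def lda_fit(scores, labels):
    """Class means, inverse pooled covariance, and log priors."""
    classes = np.unique(labels)
    means = np.array([scores[labels == c].mean(axis=0) for c in classes])
    centered = scores - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / (len(scores) - len(classes))
    log_priors = np.log(np.array([(labels == c).mean() for c in classes]))
    return classes, means, np.linalg.inv(cov), log_priors

def lda_predict(scores, model):
    """Assign each sample to the class with the largest discriminant f_i."""
    classes, means, cov_inv, log_priors = model
    linear = scores @ cov_inv @ means.T
    offset = 0.5 * np.einsum("ij,jk,ik->i", means, cov_inv, means)
    return classes[np.argmax(linear - offset + log_priors, axis=1)]

def repeated_holdout(spectra, labels, n_components=13, n_repeats=100, seed=0):
    """Mean test accuracy over repeated random 50/50 splits within each class."""
    rng = np.random.default_rng(seed)
    X = vector_normalize(spectra)
    accuracies = []
    for _ in range(n_repeats):
        train = np.zeros(len(X), dtype=bool)
        for c in np.unique(labels):
            idx = rng.permutation(np.flatnonzero(labels == c))
            train[idx[: len(idx) // 2]] = True
        mean, axes = pca_fit(X[train], n_components)
        model = lda_fit((X[train] - mean) @ axes.T, labels[train])
        predicted = lda_predict((X[~train] - mean) @ axes.T, model)
        accuracies.append(np.mean(predicted == labels[~train]))
    return float(np.mean(accuracies))

def make_synthetic(n_per_class=40, n_vars=512, seed=1):
    """Three synthetic classes, each with one Gaussian band at a different position."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, n_vars)
    spectra, labels = [], []
    for k in range(3):
        band = np.exp(-((grid - (0.3 + 0.2 * k)) ** 2) / 0.002)
        noise = 0.05 * rng.standard_normal((n_per_class, n_vars))
        spectra.append(1.0 + band + noise)
        labels.append(np.full(n_per_class, k))
    return np.vstack(spectra), np.concatenate(labels)

spectra, labels = make_synthetic()
accuracy = repeated_holdout(spectra, labels)
print(f"mean test accuracy over 100 random splits: {accuracy:.3f}")
```

Reducing to 13 scores before LDA keeps the pooled covariance matrix invertible, since each training half then has far more observations than variables; with the raw 512 variables it would be singular.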

groups included in the study. We performed the LDA analysis with two strategies: first, where all
five groups were included [Table II(a)], and second, with only three groups, namely, normal,
polyp, and cancer, where the polyp group consists of all polyp subgroups (mild, moderate, and
severe) [Table II(b)]. The results when all five groups are included show relatively lower success
rates, indicating that many cases were classified within the neighboring groups. This picture
dramatically improved when only three groups were assumed, where most of the group members were
classified correctly. In both cases, none of the normal was assigned as cancer and vice versa. The
largest misidentification occurred between normal and mild polyp (24%), which is frequently a
difficult problem even for an expert pathologist, since mild polyps are intrinsically very close
in their cell morphology to normal tissues. The misidentification rate drops sharply for the more
distant grades: normal is misidentified as moderate or severe polyp in only 2% and 1% of cases,
respectively. When all three polyp stages are treated as a single class, the percentage of normal
misidentified as polyp is reduced to 14% [Table II(b)].

It is encouraging that in both strategies the false negative and the false positive rates between
normal and cancer both remain at 0%, as can be seen from Tables II(a) and II(b).

IV. DISCUSSION

Previous studies have provided evidence that infrared spectroscopy is a useful and powerful tool
to classify tissues and cells. The aim of this work was to test the potential of FTIR spectroscopy
on colon cancer patients where five different tissue groups (stages) can be identified. We used a
statistical approach to analyze the large database of IR spectra that was measured. The complete
database was analyzed in order to examine the classification potential of this optical methodology
in tandem with advanced statistical techniques. The gradual changes shown by the DCF score
(Fig. 5) present a digital illustration of how benign polyps evolve toward carcinoma. This trend
was further studied using specific biomarkers which clearly verify this gradual transition. The
main biomarker that dictates the DCF fitting and fully shows the gradual behavior is the CH2/CH3
ratio (A2848/A2958). This is due to its extraordinarily high t-value (Table I) and the highly
ordered gradual absorbance intensity of the normal, polyps, and carcinoma groups (Fig. 3). These
results suggest that the lipid/protein ratio gradually increases with the severity of the disease.
Since proteins contain, on the average, an equal amount of methyl and methylene groups, a protein
change alone should modify the CH3 stretching as well as the CH2 stretching to the same extent.
The precise origin of the increase in the methyl/methylene ratio remains to be determined; it may
arise from an increase in lipid content, but can also be associated with the modification of the
membrane composition during cancer.

In our previous studies, changes in the lipid region were described differently.32 We believe that
the main reason for this is the normalization procedure in each case study. Our previous study in
this subject used min/max normalization of the entire measured spectra with respect to the
intensive amide I band. This approach cannot detect subtle changes in the lipid region. In order
to reduce the contribution of close bands and to focus on the relevant regions, we now use a
different approach by dividing the entire spectrum into three regions and vector normalizing each
segment separately, as explained in the experimental section (Fig. 2). Another benefit over our
previous normalization technique is the removal of the uninformative region of 1800-2800 cm^-1,
which is highly dependent on the CO2 surrounding of the measured


tissues. This is especially important when applying mathematically based distinction algorithms such as PCA, where it could lead to artifacts in the classification. The two different normalization approaches reveal that different spectral preprocessing techniques may alter the biochemical interpretation. This bisect technique has been widely used,²⁰,³³,³⁴ with sections of the spectrum cut, baseline corrected, and normalized independently of the entire spectral region.

In complex systems such as tissues, the main absorptions arise from N-H, C=O, C-H, and P=O bonds of the proteins, lipids, and nucleic acids present in the cells. Wavenumbers below 1800 cm⁻¹ constitute prominent regions that contain all the above vibrational modes. This region shows remarkable differences among normal tissues, cancer, and all three polyp groups. This is especially evident in the 1045 cm⁻¹ region, where a decreased absorption is noted for all groups compared to the normal. The normal cases clearly stand above all pathological tissues (polyp and cancer). This can be explained by the substantial glycogen reduction (consumption) in the polyps and the cancer tissues. Another markedly different biomarker is the 1072 cm⁻¹ band, which corresponds to carbohydrates. This biomarker decreased in the normal group.

The antisymmetric phosphate levels (1170–1310 cm⁻¹) reveal the metabolic turnover, as they comprise energy producers such as ATP and GTP and other biomolecular components, which include phospholipids, nucleic acids (DNA and RNA), and phosphorylated proteins. The difference in total phosphate level among normal, the three groups of polyps, and cancer was clear in the normal case. This enhancement may arise from the fact that the phosphate level is the sum of a larger number of biomolecules containing phosphate groups.

The shoulder at 1740 cm⁻¹ can be assigned to the ester C=O stretching of phospholipids, which is not present in DNA and proteins. The decreased intensities at 1740 cm⁻¹ are also consistent with the methyl/methylene ratio, except for the cancer group, which shows a higher value than the moderate and severe polyps (Table I). The 1740 cm⁻¹ band also contains residues of other vibrational modes³⁵ besides phospholipids, which may account for this inconsistency.

Our LDA results (Table II) indicate that a high discrimination percentage is achieved when dealing only with the three main groups [Table II(b)]: Normal, polyps, and cancer tissues. A significant but relatively low percentage is obtained when considering all five groups: Normal tissue, mild, moderate, and severe polyps, and cancer [Table II(a)]. False identification can result from several causes: Built-in spatial³⁶ and spectral resolution limitations of FTIR in distinguishing between proximate tissues; cytological identification failure, especially within the polyp subgroups; averaging over different measured sites inside a specific sample, which can lead to genuine uncertainty in the sample type since some regions show borderline behavior; and degraded FTIR measurements due to unstable surroundings and equipment. All of the above except the first confounding factor can be eliminated by consistently acquiring a larger database.

TABLE II. True/false identification percentage of each tissue type based on the averaged LDA iterations: (a) assuming five groups (normal, mild, moderate, and severe polyps, and cancer); (b) assuming only three groups, where all polyp subgroups are treated as a single polyp group.

(a)
                            Identified as
Type         Normal   Mild polyp   Moderate polyp   Severe polyp   Carcinoma
Normal         74         24             2               1             0
Mild            6         63            14              12             5
Moderate        1         22            60              15             2
Severe          2          7             7              66            18
Carcinoma       0          1             2              17            80

(b)
                   Identified as
Type         Normal   Polyps   Carcinoma
Normal         86       14         0
Polyps          6       85         9
Carcinoma       0       15        85

The t-test values (Table I) reveal that, besides the methyl/methylene ratio, the chosen biomarkers have almost no significant differences between the carcinoma, severe polyp, and moderate polyp groups. Although the LDA algorithm takes into account all wavenumbers, this result is also true for the severe polyp case [Table II(a)]. This leads to mixing between the severe polyp and carcinoma groups. In contrast, the DCF classification (Fig. 5) shows remarkably different scores between carcinoma and severe polyps. DCF weights are influenced mainly by the t-test values. Thus, the methyl/methylene ratio, which has the largest t-value, dictates the DCF behavior. When comparing the LDA results just for the normal and malignant samples, excellent separation is achieved in the t-test analysis (Table I) as well as in the DCF scores (Fig. 5).

In summary, we conclude that infrared spectroscopy is a useful tool to identify different types of colonic tissues. The DCF scores formed acuteness ladders, which further improve the ability to grade the samples in the correct order, namely, normal, polyps, and cancer, based on the previously selected array of biomarkers. We also demonstrated that PCA combined with LDA is a powerful tool for investigating the global biochemical modifications responsible for tissue classification. However, we do not claim to replace the pathologist. Spectroscopic methods may provide a second opinion, especially in difficult cases where ambiguous assignments are given by histopathology.

We have shown that different normalization techniques can change the biochemical interpretation, although not the total changes among spectra.

Early cancer detection is vital in all cancer types but this is especially true in colon cancer, where removing the premalignant tissue can save lives. Due to the still mysterious nature of how polyps progress spontaneously toward carcinoma, further studies which examine in more detail the potential of IR spectroscopy may shed more light on these changes.

ACKNOWLEDGMENTS

This work was supported in part by the Israel Science Foundation (ISF Grant No. 788/01), and the Cancer Research Foundation in Memory of Professor Tabb at the Soroka University Medical Center.

a) Author to whom correspondence should be addressed. Electronic mail: shaulm@bgu.ac.il; Telephone: 972-8-646 1749; Fax: 972-8-647 2924.

¹ "Cancer," World Health Organization (February 2006). Retrieved on 24 May 2007, http://www.cancerinfodirect.com/colon-cancer.
² R. K. Sahu and S. Mordechai, "Fourier transform infrared spectroscopy in cancer detection," Future Oncol. 1(5), 635–647 (2005).
³ G. Herzberg, Molecular Spectra and Molecular Structure. II. Infrared and Raman Spectra of Polyatomic Molecules (Van Nostrand Reinhold, New York, 1945).
⁴ P. Lasch and J. Kneipp, Biomedical Vibrational Spectroscopy (Wiley, Hoboken, 2008).
⁵ T. Gao, J. Feng, and Y. Ci, "Human breast carcinomal tissues display distinctive FTIR spectra: Implication for the histological characterization of carcinomas," Anal. Cell. Pathol. 18, 87–93 (1999).
⁶ R. Sahu, U. Zelig, M. Huleihel, N. Brosh, M. Talyshinsky, M. Ben-Harosh, S. Mordechai, and J. Kapelushnik, "Continuous monitoring of WBC biochemistry in an adult leukemia patient using advanced FTIR-spectroscopy," Leuk. Res. 30, 687–693 (2006).
⁷ A. Podshyvalov, R. K. Sahu, S. Mark, K. Kantarovich, H. Guterman, J. Goldstein, R. Jagannathan, S. Argov, and S. Mordechai, "Distinction of cervical cancer biopsies by use of infrared microspectroscopy and probabilistic neural networks," Appl. Opt. 44(18), 3725–3734 (2005).
⁸ A. Tfayli, O. Piot, A. Durlach, A. Bernard, and M. Manfait, "Discriminating nevus and melanoma on paraffin-embedded skin biopsies using FTIR microspectroscopy," Biochim. Biophys. Acta 1724(3), 262–269 (2005).
⁹ C. Krafft, L. Shapoval, S. B. Sobottka, K. D. Geiger, G. Schackert, and R. Salzer, "Identification of primary tumors of brain metastases by SIMCA classification of IR spectroscopic images," Biochim. Biophys. Acta 1758(7), 883–891 (2006).
¹⁰ E. Gazi, M. Baker, J. Dwyer, N. P. Lockyer, P. Gardner, J. H. Shanks, R. S. Reeve, C. A. Hart, N. W. Clarke, and M. D. Brown, "A correlation of FTIR spectra derived from prostate cancer biopsies with gleason grade and tumour stage," Eur. Urol. 50(4), 750–761 (2006).
¹¹ P. Bruni, C. Conti, E. Giorgini, M. Pisani, C. Rubini, and G. Tosi, "Histological and microscopy FT-IR imaging study on the proliferative activity and angiogenesis in head and neck tumours," Faraday Discuss. 126, 19–26 (2004).
¹² D. R. Shankaran and N. Miura, "Trends in interfacial design for surface plasmon resonance based immunoassays," J. Phys. D: Appl. Phys. 40, 7187–7200 (2007).
¹³ S. H. Tseng, A. Grant, and A. J. Durkin, "In vivo determination of skin near-infrared optical properties using diffuse optical spectroscopy," J. Biomed. Opt. 13(1), 014016 (2008).
¹⁴ P. Lasch, M. Diem, W. Hansch, and D. Naumann, "Artificial neural networks as supervised techniques for FT-IR microspectroscopic imaging," J. Chemom. 20, 209–220 (2006).
¹⁵ P. Lasch, J. Schmitt, and D. Naumann, "Colorectal adenocarcinoma diagnosis by FT-IR microspectrometry," Biomed. Spectroscopy 3918, 45–56 (2000).
¹⁶ Z. Hammody, S. Argov, R. K. Sahu, E. Cagnano, R. Moreh, and S. Mordechai, "Distinction of malignant melanoma and epidermis using IR micro-spectroscopy and statistical methods," Analyst (Cambridge, U.K.) 133(3), 372–378 (2008).
¹⁷ M. S. Cappell, "From colonic polyps to colon cancer: Pathophysiology, clinical presentation, screening and colonoscopic therapy," Minerva Gastroenterol. Dietol. 53(4), 351–373 (2007).
¹⁸ S. Argov, J. Ramesh, A. Salman, I. Sinelnikov, J. Goldstein, H. Guterman, and S. Mordechai, "Diagnostic potential of Fourier-transform infrared microspectroscopy and advanced computational methods in colon cancer patients," J. Biomed. Opt. 7, 248–254 (2002).
¹⁹ MATLAB, Version 7.0 (R14), The MathWorks Inc. (Natick, MA, 2007).
²⁰ A. Zwielly, J. Gopas, G. Brkic, and S. Mordechai, "Detection of a drug-resistant human melanoma cell line using FTIR Spectroscopy," Analyst (Cambridge, U.K.) 134, 294–300 (2009).
²¹ M. Diem, P. Griffith, and J. Chalmers, Vibrational Spectroscopy for Medical Diagnosis (Wiley, New York, 2008).
²² R. A. Fisher, "The use of multiple measures in taxonomic problems," Ann. Eugen. 7, 179–188 (1936).
²³ C. Huberty, Applied Discriminant Analysis (Wiley, New York, 1994).
²⁴ K. Fukunaga, Introduction to Statistical Pattern Recognition (Academic, San Diego, 1990).
²⁵ H. Gremlich and B. Yang, Infrared and Raman Spectroscopy of Biological Materials (Dekker, New York, 2001), pp. 421–475.
²⁶ M. Diem, S. Boydston-White, and L. Chiriboga, "Infrared spectroscopy of cells and tissues: Shining light onto a novel subject," Appl. Spectrosc. 53, 148–161 (1999).
²⁷ H. H. Mantsch and D. Chapman, Infrared Spectroscopy of Biomolecules (Wiley, New York, 1996).
²⁸ J. Liquier and E. Taillandier, in Infrared Spectroscopy of Biomolecules, edited by H. H. Mantsch and D. Chapman (Wiley, New York, 1996), pp. 131–158.
²⁹ S. Wartewig, IR and Raman Spectroscopy (Wiley, New York, 2003), pp. 75–124.
³⁰ D. Naumann, "FT-infrared and FT-Raman spectroscopy in biomedical research," Appl. Spectrosc. Rev. 36, 239–298 (2001).
³¹ F. S. Parker, Application of Infrared Spectroscopy in Biochemistry, Biology and Medicine (Plenum, New York, 1971).
³² A. Salman, S. Argov, J. Ramesh, J. Goldstein, S. Igor, H. Guterman, S. Mordechai, "FTIR microscopic characterization of normal and malignant human colonic tissues," Cell. Mol. Biol. (Paris) 47(22), 159–166 (2001).
³³ P. G. Andrus, "Cancer monitoring by FTIR spectroscopy," Technol. Cancer Res. Treat. 5(2), 157–167 (2006).
³⁴ V. R. Kondepati, M. Keese, H. M. Heise, and J. Backhaus, "Detection of structural disorders in pancreatic tumour DNA with Fourier-transform infrared spectroscopy," Vib. Spectrosc. 40, 33–39 (2006).
³⁵ Z. Movasaghi, S. Rehman, and I. ur Rehman, "Fourier transform infrared (FTIR) spectroscopy of biological tissues," Appl. Spectrosc. Rev. 43, 134–179 (2008).
³⁶ C. Krafft, D. Codrich, G. Pelizzo, V. Sergo, "Raman and FTIR microscopic imaging of colon tissue: A comparative study," J. Biophotonics 1(2), 154–169 (2008).
