
LOCAL INTENSITY MODEL: AN OUTLIER DETECTION FRAMEWORK WITH
APPLICATIONS TO WHITE MATTER HYPERINTENSITY SEGMENTATION

Parnesh Raniga a, Pierre Schmitt a,b, Pierrick Bourgeat a, Jurgen Fripp a,
Victor L. Villemagne c,d, Christopher C. Rowe d, Olivier Salvado a

a CSIRO Preventative Health National Research Flagship ICTC, The Australian e-Health Research Centre-BioMedIA, Royal
Brisbane and Women's Hospital, Herston, QLD, Australia.
b Ecole Nationale Supérieure de Télécommunications, Paris, France
c The Mental Health Research Institute, University of Melbourne, Parkville, VIC, Australia
d Department of Nuclear Medicine and Centre for PET, and Department of Medicine, University of Melbourne, Austin
Hospital, Melbourne, VIC, Australia

ABSTRACT
Automatic segmentation of white matter hyperintensities
(WMH) from T2-weighted and FLAIR MRI is a common
task in the analysis of many different diseases. A method
to segment WMH is proposed whereby a local intensity
model (LIM) of normal tissue is generated. WMH are
detected as outliers from this model. The LIM enables
accurate modeling of intensity variations, thus reducing
false positives. Moreover, only scans with normal tissue
are required to create the model. Twelve normal scans were
used to generate the LIM and validation was conducted on a
set of 46 scans. Similarity indices between the proposed
approach and manual segmentations were 0.59 ± 0.15,
0.65 ± 0.08 and 0.77 ± 0.08 for subjects with small,
moderate and large volumes of lesions respectively. The
proposed approach performed better than support vector
machines on the same dataset and compared favorably to
approaches in the literature.
Index Terms: White matter hyperintensities,
Alzheimer's disease, local intensity model, segmentation,
outlier detection
1. INTRODUCTION
Detection of deviations from the norm is one of the most
important applications of medical image analysis, as such
deviations generally represent pathologies of interest. White
matter hyperintensities (WMH) are one such class of
pathologies that are present in several neurological
conditions including multiple sclerosis and Alzheimer's
disease (AD). WMH can be distinguished from normal
appearing white matter (WM) by their brighter
appearance on T2-weighted and fluid attenuated inversion
recovery (FLAIR) MRI.
Machine learning [1], [2] and pattern recognition [3]
methods have been proposed for the segmentation of WMH.
Machine learning techniques such as support vector
machines (SVM) require a large dataset of manually
segmented scans to be able to distinguish WMH from normal
WM. Pattern recognition methods such as fuzzy c-means
(FCM) are sensitive to the relative cluster sizes [4]. As
the volume of lesions varies greatly, the FCM algorithm can
be run on a slice by slice basis to limit this sensitivity.
This may result in inconsistencies between slices.

978-1-4244-4128-0/11/$25.00 ©2011 IEEE

Furthermore, although the FLAIR sequence has been
shown to be the most sensitive at detecting lesions [5], the
brighter appearance of the temporal and entorhinal cortices
[6] can result in a large number of false positives. To
reduce the number of false positives, multiple MR sequences
are typically utilized.
In this paper we propose the segmentation of WMH
using an outlier detection framework with a local intensity
model (LIM). The method builds a LIM, a voxel by voxel
model of the normal appearance of FLAIR scans. A new
scan that is to be segmented is compared to this model and
all voxels that are deemed to be outliers are segmented as
WMH. This has the advantage that only scans without
pathology are required. By accounting for the brighter
intensity of the temporal and entorhinal cortices, the method
is able to reduce false positives.
We built a LIM using 12 cases and tested it on a dataset
of 46 scans. Furthermore we compared the proposed method
to segmentation using support vector machines [2].
2. MATERIALS AND METHODS
2.1. Data
MRI data from fifty-eight participants from the Australian
Imaging Biomarkers and Lifestyle (AIBL) study [7] were
used in this study. Participants included healthy elderly
subjects (n=44) as well as subjects with mild cognitive
impairment (n=10) and Alzheimer's disease (n=4).
There were 26 males (mean age: 73.1 ± 6.2) and 32 females
(mean age: 75.1 ± 7.8) in the cohort.

ISBI 2011

T1-weighted magnetization prepared rapid gradient
echo (MPRAGE) MRI and fluid attenuated inversion
recovery (FLAIR) scans were acquired for all the subjects.
The scans were conducted on a three Tesla (3T) Siemens
Magnetom Trio scanner (Siemens, Germany). For the
T1-weighted images, the image size was 160 x 240 x 256
voxels with a voxel spacing of 1.2 x 1 x 1 mm in the sagittal,
coronal and axial directions respectively (TR = 2300 ms, TE
= 2.98 ms, flip angle = 9°). For the FLAIR scans the image
size was 176 x 240 x 256 voxels with a voxel spacing of
0.90 x 0.97 x 0.97 mm (TR = 6000 ms, TE = 421 ms, flip angle
= 120°, TI = 2100 ms).
2.2. Method
We assume that a database of MRI scans is available with
enough healthy subjects, having no or very few lesions. For
this study we built the LIM using scans from 12 healthy
subjects with minimal WMH based on visual inspection.
The model was then used to test the algorithm on the
remaining 46 subjects from our dataset. The steps of the
algorithm were as follows:
1) Pre-processing: Both T1-weighted and FLAIR scans
were corrected for bias field effects [8]. FLAIR scans were
smoothed using anisotropic diffusion [9] and co-registered
to their corresponding T1-weighted scans using a rigid
transformation [10].
T1-weighted scans were segmented into gray matter
(GM), white matter (WM) and cerebrospinal fluid (CSF)
using an expectation maximization approach with prior
probabilities [11]. Intensity normalization of the images to a
randomly selected template from the training set was
performed by aligning the peaks of the CSF and WM
distributions in the individual images to those of the
template. CSF and WM masks from the T1-weighted
segmentation were propagated to the FLAIR space to
generate the intensity distributions of WM and CSF.
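
The peak alignment described above can be illustrated with a minimal sketch. It assumes a linear (two-point) intensity mapping, which the text does not specify, and takes each peak to be the mode of a simple histogram. The function names `histogram_peak`, `peak_alignment` and `normalize` are hypothetical and written in plain Python for readability.

```python
from collections import Counter


def histogram_peak(values, bin_width=1.0):
    """Return the centre of the most populated histogram bin (the mode)."""
    counts = Counter(int(v // bin_width) for v in values)
    best_bin = max(counts, key=counts.get)
    return (best_bin + 0.5) * bin_width


def peak_alignment(csf_peak, wm_peak, ref_csf_peak, ref_wm_peak):
    """Return (scale, offset) of the linear map that sends the image's
    CSF and WM peaks onto the template's CSF and WM peaks.
    Assumption: a linear map; the paper only says the peaks are aligned."""
    scale = (ref_wm_peak - ref_csf_peak) / (wm_peak - csf_peak)
    offset = ref_csf_peak - scale * csf_peak
    return scale, offset


def normalize(values, scale, offset):
    """Apply the linear intensity map to every value."""
    return [scale * v + offset for v in values]


# Toy example: image peaks at CSF=20, WM=100; template peaks at CSF=10, WM=50.
scale, offset = peak_alignment(20.0, 100.0, 10.0, 50.0)
normalized = normalize([20.0, 100.0], scale, offset)
```

In practice the peaks would be estimated from the FLAIR intensities inside the propagated CSF and WM masks.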
2) Co-registration to database specific elderly atlas: A
population specific atlas was generated using the
T1-weighted scans with an approach similar to Rohlfing et
al. [12]. The atlas was generated from 100 scans from the
AIBL study [7]. An intermediate average affine atlas was
generated by co-registering all T1-weighted scans to a
representative scan. All scans were then non-linearly
registered to this intermediate atlas using a mutual
information based free form deformation (FFD) algorithm
[13]. An average atlas was then generated and this was
repeated for five iterations. All the T1-weighted and FLAIR
scans were then non-linearly registered to the elderly atlas.
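
The iterative structure of the atlas construction can be sketched as follows. This is only a control-flow illustration: `affine_register`, `nonrigid_register` and `average` are hypothetical callables standing in for the registration and averaging tools, and it is an assumption that the five iterations all refer to the non-linear register-then-average loop.

```python
def build_atlas(scans, reference, affine_register, nonrigid_register,
                average, n_iterations=5):
    """Iteratively refine an average atlas.

    affine_register(moving, fixed) and nonrigid_register(moving, fixed)
    are hypothetical callables returning the moving image resampled into
    the space of the fixed image; average(images) returns their mean.
    """
    # Intermediate atlas: affine alignment to a representative scan.
    aligned = [affine_register(scan, reference) for scan in scans]
    atlas = average(aligned)
    # Refinement: non-linear registration to the current atlas, then re-average.
    for _ in range(n_iterations):
        warped = [nonrigid_register(scan, atlas) for scan in scans]
        atlas = average(warped)
    return atlas
```

Any real implementation would replace the callables with an actual registration toolkit; none is implied here.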
3) Local intensity model: A model of the normal
distribution of tissue intensities at each voxel of the atlas
was generated from our training set. This was done by
generating a histogram of normal tissue intensities within a
3x3x3 neighbourhood window around each voxel of each
scan. Dilated manual segmentations of WMH were used to
exclude WMH and reduce partial volume effects in the
generation of the LIM. Histograms were generated with 128
bins as this was found to give the best compromise between
memory usage and model accuracy. Histograms were also
normalized so that their bins summed to unity, i.e., a
probability distribution was generated at each voxel.
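
A minimal sketch of this histogram model is given below. It assumes the training scans are already in atlas space and intensity normalized, stored as nested lists indexed `[z][y][x]`, with non-negative intensities below a hypothetical `max_intensity`. The function name `build_lim` and the dictionary layout are illustrative choices, not the authors' implementation.

```python
def build_lim(scans, exclusion_masks, n_bins=128, max_intensity=256.0):
    """Build one normalized histogram per atlas voxel.

    scans: list of 3D nested lists [z][y][x] of intensities.
    exclusion_masks: same shapes; True marks voxels to leave out
    (e.g. dilated manual WMH segmentations).
    Returns a dict mapping (z, y, x) to a list of n_bins probabilities.
    """
    nz, ny, nx = len(scans[0]), len(scans[0][0]), len(scans[0][0][0])
    bin_width = max_intensity / n_bins
    offsets = (-1, 0, 1)
    lim = {}
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                hist = [0.0] * n_bins
                for scan, mask in zip(scans, exclusion_masks):
                    # Pool the 3x3x3 neighbourhood of this voxel in each scan.
                    for dz in offsets:
                        for dy in offsets:
                            for dx in offsets:
                                zz, yy, xx = z + dz, y + dy, x + dx
                                if not (0 <= zz < nz and 0 <= yy < ny and 0 <= xx < nx):
                                    continue
                                if mask[zz][yy][xx]:
                                    continue
                                b = min(int(scan[zz][yy][xx] / bin_width), n_bins - 1)
                                hist[b] += 1.0
                total = sum(hist)
                if total > 0:
                    hist = [h / total for h in hist]  # bins sum to one
                lim[(z, y, x)] = hist
    return lim
```

With 12 training scans each voxel pools at most 12 x 27 = 324 samples, which is one reason the number of bins matters for model accuracy.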
4) Outlier detection: FLAIR scans were co-registered to
T1-weighted scans, which in turn were non-linearly
co-registered to the elderly atlas as explained above.
Correspondence established using the co-registrations was
used to compute the location of the relevant histograms in
the LIM. Detection of voxels as outliers from the LIM was
done as a two-step process.
i) WM voxels were marked as outliers if their intensity
was greater than Tch percent of voxels at the particular
location. Only bright voxels (intensity above the WM mean
+ 3 standard deviations) were considered.
ii) A degree of abnormality was computed for each
voxel detected as an outlier above by computing the
number of bins from the threshold Tch to the voxel
intensity. To generate a binary segmentation, a
threshold Td was applied to the degree of abnormality.
WM voxels were detected by using an average WM
segmentation in atlas space that excluded the brain stem and
cerebellar white matter. As post-processing, clusters of 10
voxels or fewer (computed using connected component
analysis) were removed.
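
The two-step detection and the cluster filtering above can be sketched as follows. Several details are assumptions, not statements from the paper: Tch is read as a cumulative-probability level of the local histogram, a voxel is labelled WMH when its degree of abnormality is at least Td, and clusters use 6-connectivity. All function names are hypothetical.

```python
from collections import deque


def percentile_bin(hist, tch):
    """Index of the first bin where the cumulative probability reaches tch (0-1)."""
    cumulative = 0.0
    for i, p in enumerate(hist):
        cumulative += p
        if cumulative >= tch:
            return i
    return len(hist) - 1


def degree_of_abnormality(intensity, hist, tch, bin_width, bright_cutoff):
    """Number of bins between the Tch threshold bin and the voxel's bin.

    Returns 0 for voxels that are not brighter than bright_cutoff
    (e.g. WM mean + 3 standard deviations) or not above the threshold bin.
    """
    if intensity <= bright_cutoff:
        return 0
    voxel_bin = min(int(intensity / bin_width), len(hist) - 1)
    return max(voxel_bin - percentile_bin(hist, tch), 0)


def remove_small_clusters(voxels, min_size=11):
    """Keep only 6-connected clusters with at least min_size voxels.

    voxels: set of (z, y, x) tuples. min_size=11 drops clusters of 10 or fewer.
    """
    steps = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))
    remaining = set(voxels)
    kept = set()
    while remaining:
        seed = remaining.pop()
        cluster = {seed}
        queue = deque([seed])
        while queue:  # breadth-first flood fill of one cluster
            z, y, x = queue.popleft()
            for dz, dy, dx in steps:
                neighbour = (z + dz, y + dy, x + dx)
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    cluster.add(neighbour)
                    queue.append(neighbour)
        if len(cluster) >= min_size:
            kept |= cluster
    return kept
```

A voxel would then be labelled WMH when `degree_of_abnormality(...) >= td`, with the surviving voxel set passed through `remove_small_clusters`.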
2.3. Validation
The proposed method was validated against manual
segmentations. Manual segmentations of all the scans were
performed by P.R using MRIcro software. The test group
was split according to WMH volume (WMHV) as computed
from manual segmentations.
i) Large lesion volumes (LLV, n=18) (WMHV > 10 ml).
ii) Moderate lesion volumes (MLV, n=18) (WMHV ≤ 10 ml and
WMHV > 3 ml).
iii) Small lesion volumes (SLV, n=10) (WMHV ≤ 3 ml).
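Comparisons were conducted by computing the similarity
index (SI) as well as the overlap fraction (OF), extra fraction
(EF) and missed fraction (MF) [14]. The SI is equivalent to
the Dice coefficient [15]. These indices were computed
using:

\[
SI = \frac{2\,TP}{2\,TP + FP + FN}, \quad
OF = \frac{TP}{TP + FN}, \quad
EF = \frac{FP}{TP + FN}, \quad
MF = \frac{FN}{TP + FN}
\tag{1}
\]
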
where TP is the number of true positive voxels, TN is the
number of true negative voxels, FP is the number of false
positive voxels and FN is the number of false negative
voxels.
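
As an illustration, the four indices can be computed from a pair of binary masks as sketched below. The sketch assumes the usual definitions, in which OF, EF and MF are normalized by the manual lesion volume TP + FN (so that OF + MF = 1 and EF may exceed 1), and it assumes the manual mask is not empty. The function name `overlap_metrics` is hypothetical.

```python
def overlap_metrics(automatic, manual):
    """Compute SI, OF, EF and MF from two flat binary masks of equal length.

    Assumes the manual mask contains at least one lesion voxel.
    """
    pairs = list(zip(automatic, manual))
    tp = sum(1 for a, m in pairs if a and m)        # detected lesion voxels
    fp = sum(1 for a, m in pairs if a and not m)    # extra voxels
    fn = sum(1 for a, m in pairs if m and not a)    # missed lesion voxels
    reference = tp + fn                             # manual lesion volume
    return {
        "SI": 2.0 * tp / (2.0 * tp + fp + fn),
        "OF": tp / reference,
        "EF": fp / reference,
        "MF": fn / reference,
    }


# Toy example with five voxels: TP = 2, FP = 1, FN = 1.
metrics = overlap_metrics([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
```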
The threshold parameters Tch and Td were estimated as
those that gave the best overall performance in terms of SI.
This was done using a grid search with a search range of 80
to 98% in steps of 2% for Tch and 1 to 10 bins for Td.
Furthermore, the correlation between lesion volumes
computed from the manual segmentations and those computed
from the automatic segmentations (using the selected values
of Tch and Td) was assessed.
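
The grid search over the two thresholds can be sketched as below. The callable `evaluate` is hypothetical: it stands for whatever routine segments the test scans with a given (Tch, Td) pair and returns the overall SI.

```python
from itertools import product


def grid_search(evaluate):
    """Return the (tch, td) pair with the highest score.

    evaluate(tch, td) is a hypothetical callable returning the overall SI
    obtained with Tch = tch percent and Td = td bins.
    """
    tch_values = range(80, 100, 2)  # 80, 82, ..., 98 percent
    td_values = range(1, 11)        # 1 to 10 bins
    return max(product(tch_values, td_values), key=lambda pair: evaluate(*pair))
```

This evaluates 10 x 10 = 100 parameter pairs, which is cheap once the degree of abnormality has been computed, since changing Td only changes a threshold.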
 

2.4. Comparison to SVM classifier
The proposed method was compared to the SVM method [2],
which was implemented using LIBSVM software [16]. The
SVM was trained using feature sets generated using the
manual segmentations. The feature set consisted of all voxel
intensities in a 3x3x3 neighbourhood as well as the x, y, z
coordinates in atlas space. The preprocessing steps used in
the proposed algorithm were also applied to the SVM
method. Furthermore, only voxels brighter than the WM
mean plus 3 standard deviations were considered.
A 5-fold cross-validation was performed on the dataset.
For each fold, 40,000 samples, consisting of equal numbers
of positive and negative samples, were used for the training.
The 5 SI for each of the cases were averaged.
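
The feature construction and balanced sampling described above can be sketched as follows. The sketch does not call LIBSVM; it only shows how a 30-element feature vector (27 neighbourhood intensities plus 3 coordinates) and a class-balanced training set might be assembled. The function names and the fixed random seed are assumptions, and `feature_vector` assumes an interior voxel.

```python
import random


def feature_vector(volume, z, y, x):
    """27 intensities of the 3x3x3 neighbourhood followed by (x, y, z).

    volume is a 3D nested list [z][y][x] in atlas space; the voxel is
    assumed not to lie on the volume border.
    """
    offsets = (-1, 0, 1)
    intensities = [volume[z + dz][y + dy][x + dx]
                   for dz in offsets for dy in offsets for dx in offsets]
    return intensities + [x, y, z]


def balanced_sample(positives, negatives, n_total=40000, seed=0):
    """Draw n_total // 2 samples from each class without replacement.

    Raises ValueError if either class has fewer than n_total // 2 samples.
    """
    rng = random.Random(seed)
    half = n_total // 2
    return rng.sample(positives, half), rng.sample(negatives, half)
```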

3. RESULTS
The grid search revealed that the best overall performance
of the classifier was achieved at a Tch of 88% and a Td of 6.
The results of the method are presented in Table 1 below.

Table 1. Results of the proposed method with Tch = 88%
and Td = 6. Values are mean (standard deviation).
Lesion Load (mL)   OF (SD)       EF (SD)       MF (SD)       SI (SD)
SLV (0-3)          0.58 (0.16)   0.60 (1.23)   0.42 (0.16)   0.59 (0.15)
MLV (3-10)         0.68 (0.13)   0.44 (0.38)   0.32 (0.13)   0.65 (0.08)
LLV (>10)          0.69 (0.11)   0.09 (0.11)   0.31 (0.11)   0.77 (0.08)

An example of the segmentation achieved with the proposed
approach is presented in Fig. 1 below. As can be seen, the
segmentations of the proposed approach are very similar to
manual segmentations.

[Figure 1: panels A, B, C and D]
Fig. 1. Example of lesion detection with the described
method on a subject with large volume of lesions (top row)
and small volume of lesions (bottom row). An axial slice of
the FLAIR scan is presented in (A), the corresponding
manual segmentation in (B), the automated binary
segmentation with a Tch of 88% and a Td of 6 (C) and
corresponding distance map (D).

The results of the SVM classifier are presented in
Table 2 below. The similarity indices of the SVM approach
were lower than those of the LIM approach. The standard
deviation was also higher. The SVM approach had a larger
OF and smaller MF but the much larger EF resulted in
worse SI.

Table 2. Results of the SVM classifier. Values are mean
(standard deviation).
Lesion Load (mL)   OF (SD)       EF (SD)       MF (SD)       SI (SD)
SLV (0-3)          0.67 (0.14)   2.45 (4.89)   0.33 (0.14)   0.49 (0.20)
MLV (3-10)         0.71 (0.12)   1.14 (1.00)   0.29 (0.12)   0.54 (0.14)
LLV (>10)          0.69 (0.13)   0.17 (0.28)   0.31 (0.13)   0.75 (0.11)

4. DISCUSSION
An automatic approach to segment WMH using just FLAIR
images was presented. Building a model of normal FLAIR
intensities allows for the accurate segmentation of WMH
while reducing false positives. The results of the proposed
method were similar to those in the literature. Anbeek et al.
[1] reported SI of 0.50 for small lesions (largest lesion
< 3 mm in diameter), 0.75 for moderate lesions (largest
lesion between 3 and 10 mm in diameter) and 0.85 for large
lesions (largest lesion > 10 mm in diameter) using KNN
classification utilizing five different MR sequences. Dyrby
et al. [17] reported SI of 0.45 ± 0.15 for WMHV < 10 ml,
0.62 ± 0.11 for WMHV 10-30 ml and 0.65 ± 0.15 for WMHV
> 30 ml using a neural network and utilizing T1W, T2W and
FLAIR sequences. However, the performance of Dyrby's
classifier deteriorated when only the FLAIR sequence was
used (0.21 ± 0.13, 0.47 ± 0.11 and 0.57 ± 0.14).
As also reported by other studies, the proposed
approach tended to under-segment lesions compared to
manual segmentations [3]. However, the approach produced
fewer false positives (lower mean EF) than the SVM
approach. This was achieved by modelling at a local level
and thus capturing local differences, as opposed to the
global modelling approach of the SVM. Therefore
the proposed algorithm is more specific but not as sensitive.
To allow more flexibility with the sensitivity and specificity,
the proposed method can be used in a semi-supervised
manner. This is the motivation behind having two
thresholds, namely Tch and Td. The Td threshold can be
chosen by an observer to best segment a particular case. In
this manner, by picking the Td that gave the best SI for each
case, mean SIs for the three groups were increased to
0.80 ± 0.06, 0.69 ± 0.06 and 0.64 ± 0.10 for subjects with
LLV, MLV and SLV respectively.
The proposed approach requires only normal anatomy
for training. This is an advantage over traditional training
based classifiers which require manual segmentation of
pathology. Although we currently use manual segmentations
to exclude WMH voxels it would be possible to generate the
LIM without the need for manual segmentations if enough
normal scans are available. One approach could be to use
boot-strapping. By applying the outlier detection on the
training set, it may be possible to remove WMH until only
normal voxels were left. Moreover, the proposed approach
is general enough to be applied to the segmentation of other
pathologies.
5. CONCLUSION
Although the FLAIR sequence has been shown to be the most
sensitive at detecting WMH, overlap in intensities between
GM regions and WMH results in false positives when
classifying WMH. The proposed method is able to reduce
these. Moreover, since an outlier based approach is used, the
LIM is built from only normal cases, thus requiring minimal
or no manual segmentations, unlike machine learning
approaches. The approach performed better than SVM on
the same dataset and comparably to methods published in
the literature which used multiple modalities. The proposed
approach is promising not only for WMH detection but also
for other medical imaging segmentation applications.
6. ACKNOWLEDGEMENTS
Data used in this article was obtained from the AIBL study
funded by the CSIRO (www.aibl.csiro.au).
7. REFERENCES
[1] P. Anbeek et al., "Probabilistic segmentation of white matter lesions in MR imaging," NeuroImage, vol. 21, no. 3, pp. 1037-1044, Mar. 2004.
[2] Z. Lao et al., "Computer-Assisted Segmentation of White Matter Lesions in 3D MR Images Using Support Vector Machine," Academic Radiology, vol. 15, no. 3, pp. 300-313, Mar. 2008.
[3] F. Admiraal-Behloul et al., "Fully automatic segmentation of white matter hyperintensities in MR images of the elderly," NeuroImage, vol. 28, no. 3, pp. 607-617, Nov. 2005.
[4] J. C. Noordam et al., "Multivariate image segmentation with cluster size insensitive Fuzzy C-means," Chemometrics and Intelligent Laboratory Systems, vol. 64, no. 1, pp. 65-78, Oct. 2002.
[5] P. Anbeek et al., "Probabilistic segmentation of brain tissue in MR imaging," NeuroImage, vol. 27, no. 4, pp. 795-804, Oct. 2005.
[6] T. Hirai et al., "Limbic Lobe of the Human Brain: Evaluation with Turbo Fluid-attenuated Inversion-Recovery MR Imaging," Radiology, vol. 215, no. 2, pp. 470-475, May 2000.
[7] K. A. Ellis et al., "The Australian Imaging, Biomarkers and Lifestyle (AIBL) study of aging: methodology and baseline characteristics of 1112 individuals recruited for a longitudinal study of Alzheimer's disease," International Psychogeriatrics / IPA, pp. 1-16, May 2009.
[8] O. Salvado et al., "Method to correct intensity inhomogeneity in MR images for atherosclerosis characterization," IEEE Transactions on Medical Imaging, vol. 25, no. 5, pp. 539-552, May 2006.
[9] P. Perona and J. Malik, "Scale-space and edge detection using anisotropic diffusion," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 7, pp. 629-639, 1990.
[10] S. Ourselin et al., "Reconstructing a 3D structure from serial histological sections," Image and Vision Computing, vol. 19, no. 1, pp. 25-31, Jan. 2001.
[11] O. Acosta et al., "Automated voxel-based 3D cortical thickness measurement in a combined Lagrangian-Eulerian PDE approach using partial volume maps," Medical Image Analysis, vol. 13, no. 5, pp. 730-743, Oct. 2009.
[12] T. Rohlfing et al., "Evaluation of atlas selection strategies for atlas-based image segmentation with application to confocal microscopy images of bee brains," NeuroImage, vol. 21, no. 4, pp. 1428-1442, Apr. 2004.
[13] D. Rueckert et al., "Nonrigid registration using free-form deformations: application to breast MR images," IEEE Transactions on Medical Imaging, vol. 18, no. 8, pp. 712-721, Aug. 1999.
[14] R. Stokking et al., "Automatic Morphology-Based Brain Segmentation (MBRASE) from MRI-T1 Data," NeuroImage, vol. 12, no. 6, pp. 726-738, Dec. 2000.
[15] L. R. Dice, "Measures of the Amount of Ecologic Association between Species," Ecology, vol. 26, no. 3, pp. 297-302, 1945.
[16] C. Chang and C. Lin, "LIBSVM: a library for support vector machines," 2001.
[17] T. B. Dyrby et al., "Segmentation of age-related white matter changes in a clinical multi-center study," NeuroImage, vol. 41, no. 2, pp. 335-345, 2008.
