
Automated classification of bone marrow cells in microscopic images for diagnosis of leukemia: A comparison of two classification schemes with respect to the segmentation quality

Sebastian Krappe1, Michaela Benz1, Thomas Wittenberg1, Torsten Haferlach2, Christian Münzenmayer1
1 Image Processing and Medical Engineering Department, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany;
2 MLL Munich Leukemia Laboratory, Munich, Germany

ABSTRACT
The morphological analysis of bone marrow smears is fundamental for the diagnosis of leukemia. Currently, the counting and classification of the different types of bone marrow cells is done manually with a bright-field microscope. This is a time-consuming, partly subjective and tedious process. Furthermore, repeated examinations of a slide yield intra- and inter-observer variances. For this reason, an automation of morphological bone marrow analysis is pursued. This analysis comprises several steps: image acquisition and smear detection, cell localization and segmentation, feature extraction and cell classification. The automated classification of bone marrow cells depends on the automated cell segmentation and on the choice of adequate features extracted from different parts of the cell. In this work we focus on the evaluation of support vector machines (SVMs) and random forests (RFs) for the differentiation of bone marrow cells into 16 different classes, including immature and abnormal cell classes. Data sets of different segmentation quality are used to test the two approaches. Automated solutions for the morphological analysis of bone marrow smears could use such a classifier to pre-classify bone marrow cells and thereby shorten the examination duration.
Keywords: automated classification, bone marrow cells, diagnosis of leukemia, support vector machines, random
forests, segmentation quality

1. INTRODUCTION
The morphological analysis of bone marrow slides is fundamental for the diagnosis of leukemia. This cytological examination serves to clarify variations in a blood smear differential. It is also used for the clarification of anemia, as a means to exclude involvement of the bone marrow by a lymphoma, and when leukemia is suspected. The morphological evaluation of bone marrow cells is the basis for a patient's diagnosis and for decision support for the subsequent treatment. For the conventional cytological analysis, the bone marrow aspirate smear is stained and examined by means of a light microscope. At first the cell density, the bone marrow fat content and qualitative changes of the cells are observed at a mid-level (e.g. 5-fold) magnification. Afterwards, cells of different types are identified and counted. This step is time-consuming, partly subjective, error-prone and tedious. Furthermore, repeated examinations of a slide may yield intra- and inter-observer variances. For that reason, an automation of the bone marrow classification is pursued. Difficulties and challenges of automated image-based analysis of bone marrow samples are the high staining variability, the varying smear quality of the samples, and especially the segmentation of cells in clusters and the challenging differentiation of immature cells. The analysis pipeline comprises several steps: image acquisition and smear detection, cell localization and segmentation, feature extraction and cell classification. The automated classification of bone marrow cells depends strongly on the automated segmentation and on the choice of adequate features, which are extracted from different parts of the cell, such as the whole cell, the cell plasma and the cell nucleus.
There exist only a few publications on the topic of automated bone marrow analysis. Wu et al. propose a multispectral
imaging approach for the analysis of blood and bone marrow images [1]. Osowski et al. present the application of a genetic algorithm and a support vector machine for the recognition of bone marrow cells [2]. Theera-Umpon et al. propose
morphological granulometric features to characterize nuclei for bone marrow cell classification [3]. Up to now neither a
prototype nor a commercial product on the market is capable of a fully automated morphological bone marrow analysis.
Medical Imaging 2015: Computer-Aided Diagnosis, edited by Lubomir M. Hadjiiski, Georgia D. Tourassi,
Proc. of SPIE Vol. 9414, 94143I 2015 SPIE CCC code: 1605-7422/15/$18 doi: 10.1117/12.2081946

Proc. of SPIE Vol. 9414 94143I-1


Downloaded From: http://proceedings.spiedigitallibrary.org/ on 08/08/2015 Terms of Use: http://spiedigitallibrary.org/ss/TermsOfUse.aspx

In this work we focus on the evaluation of two classification schemes (support vector machines and random forests) for the differentiation of bone marrow cells into 16 different classes, including immature and abnormal cell classes. Data sets of different segmentation quality are used to test these two approaches. Automated solutions for the morphological analysis of bone marrow smears could potentially apply such a classifier to pre-classify bone marrow cells and thereby shorten the examination time.

2. MATERIALS AND METHODS

Image Acquisition and Smear Detection -> Cell Localization and Segmentation -> Feature Extraction -> Classification

Cell Class               Percentage
Band Neutrophils          6.8
Segmented Neutrophils    21.2
Lymphocytes              17.1
Monocytes                 2.5
Eosinophils               3.8
Basophils                 0.2
Metamyelocytes            1.8
Myelocytes                3.9
Promyelocytes             7.5
Blasts                    8.6
Plasma Cells              7.1
Proerythroblasts          1.7
Erythroblasts             1.7
Normoblasts              15.9
Hairy cells               0.3
immature Lymphocytes      0.2

Fig. 1: Bone marrow cell recognition workflow

An overview of the workflow from the bone marrow smear on a microscopic slide to an automatic determination of the cell class distribution is depicted in Fig. 1. High resolution microscopic bone marrow images are acquired in relevant regions of the slide (Section 2.1). For a captured image the cell centers are determined and used as seed information for the segmentation of the nucleus and plasma parts of each cell (Section 2.2). After the segmentation step, each cell is characterized by a variety of features (Section 2.3) which are used to solve the 16-class bone marrow classification problem (Section 2.4). In the following sections the individual steps of the image processing pipeline are explained in detail.
2.1 Image Acquisition and Smear Detection
Bone marrow smears are digitized with an automated microscope in several steps. At first the complete slide is captured in low magnification (1-fold magnification) to obtain an overview image. For an automatic system and to minimize the scanning duration, detection of the bone marrow smear on the microscopic slide is necessary. The contour of the smear is identified by a combination of thresholding and k-means clustering methods. In order to include regions at the boundary of the smear the convex hull is used. Then the bone marrow smear region is determined and digitized in a mid-level (5-fold) magnification. Relevant regions are selected and scanned in high magnification (40-fold magnification with oil immersion) for the morphological cell analysis. All images used for the evaluation were captured with a CCD camera mounted on a bright-field microscope (Zeiss Axio Imager Z2). The dimensions of the original images are 2452 × 2056 pixels and the pixel size of the camera is 3.45 × 3.45 µm.
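The smear detection step can be illustrated with a minimal sketch. The text only states that thresholding, k-means clustering and the convex hull are combined; how they are combined is not specified. The sketch below therefore makes its own assumptions: the overview image is a small grayscale grid, a 1-D k-means with k = 2 supplies the threshold, and stained smear pixels are darker than the glass background. All function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def two_means_threshold(values: List[float], iterations: int = 20) -> float:
    """1-D k-means with k=2; returns the midpoint between the two cluster centers."""
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        dark = [v for v in values if v <= mid]
        bright = [v for v in values if v > mid]
        if not dark or not bright:
            break
        lo, hi = sum(dark) / len(dark), sum(bright) / len(bright)
    return (lo + hi) / 2.0


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; returns the hull vertices in order around the hull."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> int:
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    upper: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def smear_hull(image: List[List[float]]) -> List[Point]:
    """Assumption: smear pixels are darker than the background glass."""
    threshold = two_means_threshold([v for row in image for v in row])
    smear = [(x, y) for y, row in enumerate(image)
             for x, v in enumerate(row) if v <= threshold]
    return convex_hull(smear)


# Toy overview image: four dark smear pixels around a bright gap at (2, 2).
overview = [
    [250, 250, 250, 250, 250, 250],
    [250, 250,  90, 250, 250, 250],
    [250,  80, 250,  85, 250, 250],
    [250, 250,  95, 250, 250, 250],
    [250, 250, 250, 250, 250, 250],
]
hull = smear_hull(overview)
```

In this toy image the hull spans the four dark pixels and thereby encloses the bright pixel at (2, 2), which plain thresholding would have left out. This is the effect the convex hull is used for in the text.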
2.2 Cell Localization and Segmentation
For the segmentation of single cells a Fast-Marching approach [4] has been extended by a different determination of
potential cell centers [5]. These seed points are then used for the further cell separation. Each segmented cell is
afterwards divided into nucleus and plasma parts by applying a threshold to the color transformed image of the whole
cell.
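The split of a segmented cell into nucleus and plasma can be sketched as follows. The text does not name the color transformation or the way the threshold is chosen, so both are assumptions here: the sketch uses the saturation channel of HSV and Otsu's method, and it assumes that stained nucleus pixels are more saturated than plasma pixels. The function names are hypothetical.

```python
import colorsys
from typing import List, Tuple

RGB = Tuple[int, int, int]


def otsu_threshold(values: List[int], levels: int = 256) -> int:
    """Threshold t that maximizes the between-class variance of {v <= t} and {v > t}."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels - 1):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (total_sum - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def split_nucleus_plasma(cell_pixels: List[RGB]) -> List[bool]:
    """True marks a nucleus pixel, False a plasma pixel (assumed: nucleus is more saturated)."""
    saturation = [
        round(colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1] * 255)
        for r, g, b in cell_pixels
    ]
    t = otsu_threshold(saturation)
    return [s > t for s in saturation]


# Toy cell: two dark purple (nucleus-like) and two pale (plasma-like) pixels.
cell = [(90, 40, 140), (95, 45, 150), (200, 180, 220), (210, 190, 225)]
labels = split_nucleus_plasma(cell)
```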


2.3 Feature Extraction


The extracted regions of interest (nucleus and whole cell) of different cell types are characterized by shape, texture and color features. For the characterization of the 16 considered bone marrow cell types various shape features are used: area, Zernike moments, normalized central moments, and Hu's seven invariant moments. The texture of a relevant cell region is described by numerous texture features: first and second order statistical features, color enhanced second order statistical features, features for the characterization of the heterogeneity and granularity, statistical geometric features, gray level run length based features, granulometric features, textural features corresponding to visual properties of texture, and the fractal dimension. The color components of the cells are described by RGB histogram statistic features, central moments in the RGB and HSV color spaces, and different moments in the HSV color space.
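Two of the shape features named above can be made concrete with a short sketch. It computes normalized central moments of a binary region mask and, from them, the first two of Hu's seven invariant moments using the standard definitions. The mask representation and the function names are assumptions for illustration and do not reflect the feature implementation used in this work.

```python
from typing import List, Tuple

Mask = List[List[int]]


def normalized_central_moment(mask: Mask, p: int, q: int) -> float:
    """eta_pq = mu_pq / mu_00 ** (1 + (p + q) / 2) for a binary mask."""
    pixels = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    m00 = float(len(pixels))                      # area of the region
    cx = sum(x for x, _ in pixels) / m00          # centroid
    cy = sum(y for _, y in pixels) / m00
    mu = sum((x - cx) ** p * (y - cy) ** q for x, y in pixels)
    return mu / m00 ** (1 + (p + q) / 2)


def hu_first_two(mask: Mask) -> Tuple[float, float]:
    """First two Hu invariants: eta20 + eta02 and (eta20 - eta02)^2 + 4 * eta11^2."""
    n20 = normalized_central_moment(mask, 2, 0)
    n02 = normalized_central_moment(mask, 0, 2)
    n11 = normalized_central_moment(mask, 1, 1)
    return n20 + n02, (n20 - n02) ** 2 + 4 * n11 ** 2


# A 4 x 2 rectangle, and the same rectangle rotated by 90 degrees and shifted.
wide = [[1, 1, 1, 1],
        [1, 1, 1, 1]]
tall = [[0, 0, 0],
        [0, 1, 1],
        [0, 1, 1],
        [0, 1, 1],
        [0, 1, 1]]
hu_wide = hu_first_two(wide)
hu_tall = hu_first_two(tall)
```

Both masks yield the same two values, which illustrates why such moments are suited to describing cells that appear at arbitrary positions and orientations in the image.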
2.4 Classification
The feature selection and classification task is performed with two different classification schemes, which are evaluated and compared, namely support vector machines (SVMs) and random forests (RFs). The first scheme (SVMs) uses 16 two-class classifiers (each one-vs.-rest) to determine the class for unseen data. For each two-class classifier, class-specific features are selected by a forward selection procedure. In the next step, these features are used for the training of the individual support vector machines. A feature vector is assigned by means of the 16 classifiers to the class with the highest class probability.
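The two mechanisms of the SVM scheme, greedy forward selection and the one-vs.-rest decision, can be sketched without any SVM library. The scoring function below is a toy stand-in for a validation score of a two-class classifier; its gain values, the feature names and the stopping rule (stop when the score no longer improves) are assumptions, since the text does not specify them.

```python
from typing import Callable, Dict, List, Sequence


def forward_selection(candidates: Sequence[str],
                      score: Callable[[List[str]], float],
                      max_features: int) -> List[str]:
    """Greedily add the feature that improves the score most; stop when nothing helps."""
    selected: List[str] = []
    best = float("-inf")
    while len(selected) < max_features:
        trials = [(score(selected + [f]), f) for f in candidates if f not in selected]
        if not trials:
            break
        trial_score, feature = max(trials)
        if trial_score <= best:
            break
        selected.append(feature)
        best = trial_score
    return selected


def assign_class(probabilities: Dict[str, float]) -> str:
    """One-vs.-rest decision: the class whose classifier reports the highest probability."""
    return max(probabilities, key=probabilities.get)


# Hypothetical per-feature gains with a small penalty per selected feature.
gain = {"nucleus_area": 0.5, "hu_1": 0.3, "rgb_mean": 0.05}


def toy_score(features: List[str]) -> float:
    return sum(gain[f] for f in features) - 0.1 * len(features)


selected = forward_selection(list(gain), toy_score, max_features=3)
winner = assign_class({"Blasts": 0.62, "Lymphocytes": 0.30, "Monocytes": 0.08})
```

In the scheme described above, this selection would run once per class, so that each of the 16 classifiers is trained on its own feature subset.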
The second approach is to apply decision trees for the classification of bone marrow cells. For this evaluation random forests have been employed for the differentiation of all classes at once. With random forests the feature selection is done in the process of building the classifier. The minimum number of samples required at a node for it to be split is set to 10 and the maximum number of trees in the forest is set to 200.
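A minimal sketch of how the two stated parameters act in a random forest follows. It only shows the split guard and the usual majority vote over trees; tree training is omitted, the toy trees are hand-written placeholders, and the names are hypothetical rather than taken from the library used in this work.

```python
from collections import Counter
from typing import Callable, List, Sequence

MIN_SAMPLE_COUNT = 10   # value stated in the text
MAX_TREES = 200         # value stated in the text

Tree = Callable[[Sequence[float]], str]


def may_split(samples_in_node: int, min_sample_count: int = MIN_SAMPLE_COUNT) -> bool:
    """A node with fewer samples than the minimum stays a leaf."""
    return samples_in_node >= min_sample_count


def forest_predict(trees: List[Tree], features: Sequence[float]) -> str:
    """Each tree votes for one class; the forest returns the most frequent vote."""
    votes = Counter(tree(features) for tree in trees[:MAX_TREES])
    return votes.most_common(1)[0][0]


# Three hand-written toy "trees" standing in for trained decision trees.
toy_trees: List[Tree] = [
    lambda f: "Blasts" if f[0] > 0.5 else "Lymphocytes",
    lambda f: "Blasts" if f[1] > 0.2 else "Lymphocytes",
    lambda f: "Lymphocytes",
]
prediction = forest_predict(toy_trees, [0.9, 0.4])
```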

Fig. 2: Representative images (a) to (h) for different segmentation qualities: (a) and (b) represent good segmentations (the contour of the whole cell is largely correct, the nucleus segmentation is partly a bit too small); (c) and (d) stand for acceptable segmentations (the segmentation of the nucleus is identical with the segmentation of the whole cell (c), or the segmentation of the nucleus is correct and the segmentation of the whole cell leaks a bit (d)); (e) and (f) represent deformed segmentations (only the cell nucleus is detected (e); the segmentation of the whole cell leaks (f)); (g) and (h) stand for erroneous segmentations (the segmentation of the whole cell touches the image border or includes neighboring cells).


3. RESULTS
3.1 Cell Segmentation
In order to obtain training and test sets for the classification task, the quality of the automatic cell segmentation procedure was evaluated visually by a human expert for more than 150,000 manually classified cells. Each segmented cell was manually assigned to one of four quality levels: good, acceptable, deformed or erroneous segmentation (see Fig. 2). The percentage segmentation quality distribution per cell class and the cell count distribution of the 16 analyzed classes are depicted in Fig. 3. The ratio of well-segmented cells differs among the cell classes. The best segmentation results were achieved for basophils and immature lymphocytes, cf. Fig. 2 top row and Fig. 3 bottom left. The classes with the smallest ratio of well-segmented cells are the segmented and band neutrophils, cf. Fig. 2 bottom row and Fig. 3 top left.

[Figure 3: two horizontal bar charts over the 16 cell classes. Left panel: stacked bars from 0% to 100% with the categories good segmentation, acceptable segmentation, deformed segmentation and erroneous segmentation. Right panel: cell count per class, axis from 0 to 35000. Count labels in extracted order: 32640, 7732, 12551, 14428, 8394, 2270, 5040, 3065, 29708, 355, 30261, 3443, 1686, 2117, 266, 118.]

Fig. 3: Left: percentage segmentation quality distribution per cell class. Classes are sorted according to ascending good segmentation percentage. Right: cell count distribution for the 16 bone marrow cell classes.

3.2 Classification
For the training step of the two classifiers, a set of 10,269 automatically segmented cells of good segmentation quality was used. These cells were collected from 479 different bone marrow samples acquired from routine examinations in the Munich Leukemia Laboratory (MLL). For each cell, a set of 1,330 features of the types described in Section 2.3 was extracted from different cell parts (nucleus and whole cell) and used to build the classifier. The remaining 140,000 cells were used for the testing step and were grouped into the four quality classes. The SVM classification scheme and the random forests were applied to evaluate data sets of different segmentation quality. For 52,991 cells of good segmentation quality from the test data set, an average classification rate of 64% was achieved for the 16-class classification problem with SVMs and 69% with random forests. These test cells were collected from 850 different slides, also obtained from the MLL. The average classification rate for the 25,044 cells of acceptable segmentation quality, extracted from 839 slides, is 44% with the SVM framework and 52% with random forests. For the 29,652 deformed cell segmentations, which were collected from 825 slides, the average classification rate is 27% with SVMs and 40% with random forests. The average classification rate over the cells of these three segmentation qualities taken together was also evaluated with both classifiers for each class: with the random forest it is 57%, and with the SVM framework a true positive rate of 49% is achieved. The classification rates for the 16 different classes are visualized in Fig. 4.
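The combined rates can be cross-checked against the per-quality figures with a short calculation. The sketch assumes that each reported rate is the fraction of correctly classified cells in its subset, rounded to whole percent, and weights the rates by the number of cells. This is a consistency check on the reported numbers, not a description of how the rates were computed in this work.

```python
# (number of test cells, SVM rate, random forest rate) per segmentation quality,
# taken from the figures reported in the text.
subsets = {
    "good": (52_991, 0.64, 0.69),
    "acceptable": (25_044, 0.44, 0.52),
    "deformed": (29_652, 0.27, 0.40),
}

total_cells = sum(n for n, _, _ in subsets.values())

# Cell-weighted average of the per-quality rates.
svm_combined = sum(n * svm for n, svm, _ in subsets.values()) / total_cells
rf_combined = sum(n * rf for n, _, rf in subsets.values()) / total_cells

print(f"cells: {total_cells}")
print(f"SVM combined: {svm_combined:.1%}")
print(f"RF combined:  {rf_combined:.1%}")
```

Under these assumptions the weighted averages round to 49% for the SVM framework and 57% for random forests, in agreement with the combined rates stated above.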
[Figure 4: horizontal bar chart of the classification rate (axis from 0% to 100%) for each of the 16 cell classes: Band Neutrophils, Segmented Neutrophils, Lymphocytes, Monocytes, Eosinophils, Basophils, Metamyelocytes, Myelocytes, Promyelocytes, Blasts, Plasma Cells, Proerythroblasts, Erythroblasts, Normoblasts, Hairy cells, immature Lymphocytes. Bar series: good segmentations + SVM; good segmentations + Random Forest; acceptable segmentations + SVM; acceptable segmentations + Random Forest; deformed segmentations + SVM; deformed segmentations + Random Forest; good, acceptable and deformed segmentations + SVM; good, acceptable and deformed segmentations + Random Forest.]

Fig. 4: Average classification rates for the 16 different bone marrow cell classes in %


4. DISCUSSION
In this paper we have focused on the automatic classification of bone marrow cells in microscopic images for leukemia diagnosis. Two classification schemes were evaluated on data sets of different segmentation quality. The results show that a better cell segmentation quality leads to better classification rates in the majority of cases. Random forests can be applied successfully to the bone marrow classification task. For the tested data set, the classification rates of random forests are higher than the rates of the SVM classifier framework for each segmentation quality level.
Next research activities will include the evaluation of classification trees with the incorporation of more expert
knowledge, the quality improvement of the overall automatic segmentation and the evaluation of different bone marrow
specific features.

ACKNOWLEDGMENTS
This work was funded through the AutoMorLeu project from the German Federal Ministry of Education and Research and through
the MAVO-project MultiNaBel from the Fraunhofer-Gesellschaft.

REFERENCES
[1] Wu, Q., Zeng, L., Ke, H., Xie, W., Zheng, H., Zhang, Y., "Analysis of blood and bone marrow smears using multispectral imaging analysis techniques," Proc. SPIE 5747, 1872-1882 (2005)
[2] Osowski, S., Siroic, R., Markiewicz, T., Siwek, K., "Application of Support Vector Machine and Genetic Algorithm for Improved Blood Cell Recognition," IEEE Transactions on Instrumentation and Measurement 58(7), 2159-2168 (2009)
[3] Theera-Umpon, N., Dhompongsa, S., "Morphological granulometric features of nucleus in automatic bone marrow white blood cell classification," IEEE Trans Inf Technol Biomed 11(3), 353-359 (2007)
[4] Zerfass, T., Haßlmeyer, E., Schlarb, T., Elter, M., "Segmentation of leukocyte cells in bone marrow smears," Computer-Based Medical Systems (CBMS), 267-272 (2010)
[5] Krappe, S., Macijewski, K., Eismann, E., Ziegler, T., Wittenberg, T., Haferlach, T., Münzenmayer, C., "Lokalisierung von Knochenmarkzellen für die automatisierte, morphologische Analyse von Knochenmarkpräparaten," Bildverarbeitung für die Medizin 2014, 403-408 (2014)
