
International Journal of Emerging Trends & Technology in Computer Science (IJETTCS)

Web Site: www.ijettcs.org Email: editor@ijettcs.org, editorijettcs@gmail.com Volume 2, Issue 5, September October 2013 ISSN 2278-6856

Brain MRI Segmentation with Label Propagation


Zhang Yong (1), Luo Weishi (1), Zhang Yang (1), Zhang Gang (2), Qian Dongxiang (3), Zhang Qi (3), Huang Ying (2), Wang Haifeng (2), Huang Xiaobo (2), Hong Jiaming (4)

(1) Department of Neurosurgery, Guangdong No.2 Provincial People's Hospital, Guangzhou, 510317, China
(2) School of Automation, Guangdong University of Technology, Guangzhou, 510006, China
(3) Department of Neurosurgery, the Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, 510150, China
(4) School of Medical Information Engineering, Guangzhou University of Chinese Medicine, Guangzhou, 510006, China
Corresponding authors: Zhang Gang; Qian Dongxiang.

Abstract: Magnetic Resonance Imaging (MRI) is a widely used medical imaging method for studying and diagnosing brain diseases. Automated segmentation and analysis of MRI data is important for constructing computer-aided diagnosis (CAD) systems. We propose a novel MRI segmentation method for precisely recognizing the main parts of the brain structure and probable lesion regions. The proposed method is based on label propagation, a semi-supervised learning mechanism from the machine learning literature. Compared to previous methods, it has lower complexity and can make full use of prior domain knowledge. Evaluation on a real data set illustrates its effectiveness.

Keywords: brain MRI segmentation, label propagation, semi-supervised learning, support vector machine

1. INTRODUCTION
Medical imaging plays an important role in the diagnosis of many kinds of diseases. Magnetic Resonance Imaging (MRI) can generate detailed, contiguous images of the inner structures of the human body, and the medical information it provides is an important clue for disease diagnosis. In brain disease diagnosis, MRI reveals what is happening in a patient's brain. With the development of MRI technology, doctors can base their diagnostic decisions on MRI images and improve diagnostic accuracy. Because brain MRI images follow some evident and some latent regularities, a computer-aided diagnosis (CAD) system based on MRI is feasible, which reduces the human effort needed for recognizing, segmenting and classifying brain MRI images.

However, building an automated system for MRI segmentation poses at least the following challenges. First, most MRI images are undersampled to reduce scan time; hence they are noisy, which can greatly reduce model accuracy. Second, brain MRI images have regular inner structures, i.e. gray matter (GM), white matter (WM) and cerebrospinal fluid (CSF). The borders of these regions may be rough, so deterministic image segmentation methods may not work in this situation. Third, in brain disease diagnosis, the recognition of lesion regions attracts much attention, but lesion regions may be located in different parts of the brain and many of them do not have a clear border against normal regions. Figure 1 shows brain MRI images of three patients.

Figure 1. Three brain MRI images

In the medical engineering and machine learning literature, there have been many efforts to tackle the automated MRI segmentation problem, some of which address the aforementioned challenges. For the undersampling problem, image denoising methods have been introduced. For the recognition of regular regions, image alignment algorithms have been introduced that segment brain MRI images based on a set of preset templates. For the recognition of lesion regions, some methods based on fuzzy clustering have been proposed. However, a large number of these methods do not originate from MRI image analysis and are not naturally suited to it. Some of them directly apply computer graphics theory to perform template matching. Others adopt a supervised region generation scheme with manual seed point selection. Neither kind of method captures the essence of brain MRI. From our observation, on the one hand, a brain MRI image has the regular structures mentioned above; on the other hand, lesion regions may be located inside these structures and have obvious visual features.

Different from previous work, in this paper we propose a novel method to tackle the problem and address these challenges. We solve the target problem within a semi-supervised learning framework. Generally speaking, we regard the segmentation problem as a label propagation problem from labeled pixel points to unlabeled pixel points. Labeled pixel points are obtained by a template
matching procedure between a test MRI image and a template. Different labels are then transferred to neighboring points to obtain fuzzy regions. When a lesion region is met, a border appears that marks the lesion region. Figure 2 sketches the main steps of this work.

Figure 2. The main steps of this paper

In Figure 2, Point Extraction stands for selecting the labeled points by comparing the given MRI image with a set of templates. Label Propagation stands for the pixel-level labeling procedure. Region Generation merges pixels according to their labels. During this procedure, lesion regions can be detected.

2. RELATED WORK
In this section we review some important work closely related to this paper, i.e. studies on brain MRI segmentation and on label propagation in semi-supervised learning.

For brain MRI segmentation, Logeswari et al. [2] proposed a clustering-based MRI segmentation method using a Self-Organizing Map. They also proposed a lesion tissue structure recognition algorithm based on fuzzy C-Means. Their algorithm adopts unsupervised learning, which cannot make use of prior knowledge, and we do not expect purely unsupervised learning to yield medically acceptable results. Gahm et al. [3] proposed a metric learning algorithm that learns a weighted component-based tensor distance for DT-MRI segmentation. They applied the learned tensor distance instead of the widely used Euclidean distance, and suggested that a tensor distance expresses the similarity between DT-MRI images better than other distance functions, resulting in a segmentation model with better performance. However, the tensor distance metric learning algorithm has a high computational cost and relies heavily on the quality of the initial training set, which limits its application to a wide range of MRI analysis tasks.

For the application of label propagation in semi-supervised learning, some exciting work has been reported in the machine learning literature. Johnson and Zhang [4] proposed a method to design a kernel function for semi-supervised learning. The key idea of their method is a spectral decomposition procedure that captures the main features of the kernel matrix. Their method combines labeled and unlabeled samples into a single kernel matrix. In effect, the similarity between labeled samples is forced to transfer to unlabeled ones, so as to obtain the optimal kernel function for the data seen so far. Their work is a theoretical foundation of this paper in that both supervised and unsupervised information can be propagated between data samples. Kulis et al. [5] proposed a graph-based method for clustering. Unlike previous clustering algorithms, their method performs semi-supervised learning, using some labeled samples to improve clustering accuracy. Their strategy for dealing with labeled and unlabeled data samples directly motivated our semi-supervised pixel labeling method.

3. THE METHOD
3.1 Problem definition
Before going further, we formally define the problem to be solved in this paper. Let $X$ be a pixel matrix representing an MRI image, associated with a category matrix $Y$ of the same size. Each element of $Y$ indicates the label of the corresponding pixel of $X$. The total number of categories is preset manually. The goal is to learn category matrices $Y$ that match the segmentation of a training set under some criterion. Some work has tried to solve this as a constrained matrix optimization problem, which is NP-hard and prone to local optima. We propose a method that has a closed-form solution.

3.2 Template matching
The goal of template matching is to determine the starting point set for label propagation; it plays the same role as seed point selection. The difference is that seed point selection must be done manually for each image. We set up a template set of normal brain MRI images and label the four regions of concern. The main steps are as follows:
1. Input a set of template images $T$ and an image $I$
2. For each $t \in T$ Do
3.     Align $I$ with $t$
4.     Compute the distance between $I$ and $t$
5. End For
6. Find the maximal similarity between $I$ and the templates $t \in T$, and mark the corresponding template as $t^*$
7. Pixel-wise comparison between $I$ and $t^*$
8. Pick out the matching points and their labels and put them in $S$
In line 3, an image alignment algorithm is used to find the best corresponding area between the test image $I$ and a template $t$. We use the method proposed by Bowyer et al. [6] to achieve the image alignment. In line 6, the similarity between two images is evaluated by a standard Euclidean distance function. In line 7, the comparison is performed by a pixel-wise threshold-based XOR operation, i.e. when two pixels are close to some extent (controlled by a threshold), the label and the pixel itself are saved in $S$.

3.3 Label propagation
Label propagation is performed over $S$ to transfer the labels from labeled pixels to unlabeled pixels. To do this,
we first partition the pixel set $X$ into two subsets, one of labeled pixels and one of pixels without labels, i.e. $X = L \cup U$, where $L$ is the set of labeled samples and $U$ is the set of unlabeled samples. The task is to assign labels to the samples in $U$. We define a Gram matrix $W$ on $X$ such that:

$$W_{ij} = \exp\left(-\frac{d(x_i, x_j)^2}{\sigma^2}\right)$$ (1)

where $\sigma$ controls the width of the RBF kernel and $d(\cdot, \cdot)$ stands for a distance function on pixel space. We then use a harmonic function to propagate labels on the graph obtained by connecting each pair of adjacent pixels with an edge. This procedure resembles heat conduction. Following [7], Eq. (2) describes it:

$$E(f) = \frac{1}{2} \sum_{i,j} W_{ij} \left(f(i) - f(j)\right)^2$$ (2)

where $f$ is a real-valued function that assigns a label value to each node. We want to find the optimal $f^*$ that describes the equilibrium of the heat conduction, as in Eq. (3):

$$f^* = \arg\min_{f|_L = f_L} E(f)$$ (3)

An important property of $f^*$ is that it is harmonic, meaning that $\Delta f^* = 0$ on $U$, where $\Delta = D - W$ is the combinatorial Laplacian and $D$ is a diagonal matrix with elements $D_{ii} = \sum_j W_{ij}$. According to [8], the optimal $f^*$ can be written as:

$$f^*(j) = \frac{1}{D_{jj}} \sum_{i} W_{ij} f^*(i), \quad j \in U$$ (4)

and its value can be determined as follows:

$$W = \begin{bmatrix} W_{LL} & W_{LU} \\ W_{UL} & W_{UU} \end{bmatrix}$$ (5)

$$f_U = (D_{UU} - W_{UU})^{-1} W_{UL} f_L$$ (6)

where $f_L$ and $f_U$ stand for the values of $f$ on $L$ and $U$, and $W_{LL}$, $W_{LU}$, $W_{UL}$ and $W_{UU}$ stand for the blocks of $W$ between samples in $L$ and $U$. Eq. (6) gives a closed-form solution for $f_U$. In our framework, $f$ should be described as a distribution. Since a Gaussian distribution can be used to approximate an arbitrary distribution around its mode, we model $f$ as follows:

$$p_\beta(f) = \frac{\exp(-\beta E(f))}{Z_\beta}$$ (7)

The distribution is controlled by $\beta$ and normalized by a marginal term:

$$Z_\beta = \int_{f|_L = f_L} \exp(-\beta E(f)) \, df$$ (8)

meaning that the integration is constrained by all seen labeled pixels. Eq. (5) to (8) provide a method for semi-supervised learning in which labels are propagated from labeled pixels to unlabeled pixels.

3.4 Multi-label propagation
The previous subsection presented the method of label propagation. Note that this method is only suitable for monochromatic images. To apply it to our brain MRI image segmentation, the ability to propagate multiple labels at the same time is required.
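As an illustration of the single-label propagation in Section 3.3, the following is a minimal sketch of the closed-form harmonic solution of Eq. (6), not the authors' implementation. It assumes a dense RBF graph over a one-dimensional toy signal with the absolute intensity difference as the distance, whereas the paper connects only adjacent pixels; the function names `rbf_weights` and `harmonic_propagate` and the value of `sigma` are hypothetical.

```python
import numpy as np


def rbf_weights(values, sigma=0.2):
    """Dense RBF affinities W_ij = exp(-d(x_i, x_j)^2 / sigma^2).

    Assumption: d is the absolute intensity difference and every pair of
    pixels is connected (the paper uses edges between adjacent pixels).
    """
    values = np.asarray(values, dtype=float)
    diff = values[:, None] - values[None, :]
    w = np.exp(-(diff ** 2) / sigma ** 2)
    np.fill_diagonal(w, 0.0)  # no self-loops
    return w


def harmonic_propagate(w, labeled_idx, f_l):
    """Solve f_U = (D_UU - W_UU)^{-1} W_UL f_L and return f on all nodes."""
    n = w.shape[0]
    labeled_idx = np.asarray(labeled_idx)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)
    f_l = np.asarray(f_l, dtype=float)

    degree = np.diag(w.sum(axis=1))   # D
    laplacian = degree - w            # combinatorial Laplacian D - W
    l_uu = laplacian[np.ix_(unlabeled_idx, unlabeled_idx)]
    w_ul = w[np.ix_(unlabeled_idx, labeled_idx)]

    # Solve the linear system instead of forming the inverse explicitly.
    f_u = np.linalg.solve(l_uu, w_ul @ f_l)

    f = np.empty(n)
    f[labeled_idx] = f_l
    f[unlabeled_idx] = f_u
    return f


# Toy example: two dark and two bright unlabeled pixels between two seeds.
intensities = [0.10, 0.12, 0.15, 0.80, 0.85, 0.90]
f = harmonic_propagate(rbf_weights(intensities), [0, 5], [0.0, 1.0])
labels = (f > 0.5).astype(int)
```

Solving the system with `np.linalg.solve` avoids computing the matrix inverse; for full-size images a sparse adjacency graph and a sparse solver would be the more realistic choice.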
We follow a simple idea: propagate the different labels separately and then apply a majority voting scheme to determine the final label of each pixel. Three cases can occur at a pixel:
Case 1: Only one major label value
Case 2: Two or more major label values
Case 3: No major label value
Whether a label value is major is decided by a preset threshold, which we set to 40% in our experiment. Case 1 indicates that the model is very confident in labeling the pixel. Case 2 indicates that the pixel may be located in an overlapped region. Case 3 indicates that the pixel is far away from all labeled points; it is often met when propagating in a lesion region. Changing the threshold controls the sensitivity of the model in recognizing these cases. To finally determine a label, we use majority voting as Eq. (9) shows:

$$y(i) = \arg\max_{k \in \{1, \dots, K\}} f_k(i)$$ (9)

where $f_k$ is the function propagated for label $k$ and $K$ is the total number of labels. Our multi-label propagation method relies on a simplifying assumption: during propagation, the different labels are treated as independent and identically distributed (i.i.d.). This is reasonable in most cases and greatly reduces the computational cost. Strictly speaking, the labels mark pixels belonging to the regions of concern and may follow some unknown spatial distribution, so they are not truly i.i.d. However, we will show empirically that the assumption works in most cases and achieves acceptable accuracy in lesion region recognition.

3.5 Lesion region recognition
To effectively separate lesion regions from normal regions, we train a support vector machine (SVM) to perform this task. The reason for building this classifier is that lesion regions follow latent regularities that cannot be captured by a single threshold. We notice that a lesion region often occupies a small area, so the number of pixels belonging to it is small.
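The candidate regions passed to this classifier come from the three voting cases of Section 3.4, which can be sketched as follows. This is an illustrative reading rather than the authors' code: it assumes that a label is "major" at a pixel when its share of the summed propagation scores reaches the 40% threshold, and the function name `vote_labels` and the score layout are hypothetical.

```python
import numpy as np


def vote_labels(scores, threshold=0.4):
    """Assign a final label and a case number (1, 2 or 3) to each pixel.

    scores: array of shape (n_pixels, K) with one nonnegative propagation
    score per label, e.g. one harmonic solution per label.
    Assumption: a label is "major" when its normalized share >= threshold.
    """
    scores = np.asarray(scores, dtype=float)
    totals = scores.sum(axis=1, keepdims=True)
    # Normalize per pixel; pixels with no score at all keep zero shares.
    shares = np.divide(scores, totals, out=np.zeros_like(scores), where=totals > 0)

    n_major = (shares >= threshold).sum(axis=1)
    case = np.where(n_major == 1, 1, np.where(n_major >= 2, 2, 3))
    label = shares.argmax(axis=1)  # majority vote over the K labels
    return label, case


scores = np.array([
    [0.90, 0.05, 0.03, 0.02],  # one dominant label       -> Case 1
    [0.45, 0.45, 0.05, 0.05],  # two competing labels     -> Case 2
    [0.30, 0.30, 0.20, 0.20],  # no label reaches 40%     -> Case 3
    [0.10, 0.10, 0.70, 0.10],  # dominant third label     -> Case 1
])
label, case = vote_labels(scores)
lesion_candidates = np.isin(case, [2, 3])  # pixels forwarded to the SVM stage
```

Raising `threshold` moves more pixels into Case 3 and lowering it moves more into Case 2, which is one way to read the sensitivity control described in Section 3.4.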
A key problem is to turn each region into a real vector representation, as required by the SVM. Based on our previous work [9][10], we apply a wavelet transformation feature extraction method to extract the key features of each generated region. The main steps are as follows:
1. Divide a given image into blocks of a preset size
2. For each block, apply a wavelet transformation to get the LL, LH, HL and HH coefficients
3. According to [11], calculate a 9-dimensional real vector
A region is then expressed as a 9-dimensional real feature vector. The motivation for applying the feature extraction procedure proposed in [9] is that it extracts both texture
and inner structure features, which capture the key characteristics of brain MRI images. Once a region has been transformed into a real feature vector, a recognition model for lesion regions can be built. As mentioned in the previous subsection, only Case 2 and Case 3 regions are candidates for lesion regions. We therefore feed these regions (each represented by a 9-dimensional real vector) to an SVM classifier for training. The training dataset is manually labeled. Two SVMs are trained separately, one for each case, because the underlying features of lesion regions located in a single standard area and in an overlapped area may differ, so it is necessary to distinguish them.

4. EVALUATION
The proposed method is evaluated on real data from the IBSR project. In our experiment, we use 5 brain MRI datasets, which differ in clarity, severity and number of lesion regions. Table 1 lists the datasets used in our evaluation.

Table 1: Evaluation datasets
No.   Size   Description
D1    181    IBSR_V2.0 skull-stripped NIfTI
D2    138    20 Normals
D3    145    Male Subject, T1-Weighted Brain Scan
D4    112    Normal Subject, 'Ideal' Registered Multi-echo Brain Scan
D5    170    Tumor

Since the proposed method is semi-supervised, a labeled and an unlabeled dataset are required for model training. We divide each dataset into three parts at a ratio of 3 : 2 : 5. For our method, the first two parts are used for training (the first as labeled data and the second as unlabeled data), and the third part is used for testing. The methods used for comparison are supervised; for them we use only the first part for training and the third part for testing. With this setting, it can be clearly seen whether unlabeled data improves the performance of our model. In detail, we implement a modified version of label propagation based on [7]; the SVM classifier is implemented with LibSVM [13]; and the region feature extraction method follows the ideas of [9] and [14].

The evaluation measure is defined as the overlap ratio between the pixels within the ground truth region and the output of our model. All regions concerned in this work have clear borders. Since the output is rough, a threshold on the overlap ratio is set to 80%: when the ratio exceeds this threshold, the model output is counted as correct. Based on this measure, we obtain the model accuracy simply by counting the correctly classified regions of each MRI image.

Table 2 shows the evaluation results of the proposed method on the 5 datasets used in our experiment, under two settings. The first is a semi-supervised setting (the Semi column in Table 2), in which both labeled and unlabeled samples are fed to the label propagation model and the test dataset is unavailable during training. The second is a transductive setting, in which the unlabeled dataset is composed of the original unlabeled dataset and the test dataset, meaning that the test data is available during training. The best result in each row is marked with * in Table 2. We can see that transductive learning does not always achieve the best result even though it sees more data samples.

Table 2: Performance of the proposed method
DataSet Name   Semi     Transductive
D1             82.1%    83.4%*
D2             84.8%*   84.0%
D3             91.2%*   89.4%
D4             86.0%    88.2%*
D5             89.5%    91.2%*

Finally, we compare the proposed method with two recent methods. The first, proposed in [2], applies a SOM model to brain MRI segmentation. The second, proposed in [15], combines a 2D wavelet transformation with a neural network. Both have been reported to be effective on their own evaluation datasets. We implement these two methods and evaluate them on the IBSR benchmark dataset. Note that since these methods are supervised, only labeled samples can be fed to their models. Figure 3 illustrates the comparison results.

Figure 3. Comparison between three methods

In Figure 3, SOM and NN stand for the methods proposed in [2] and [15], and LP stands for the proposed method. From Figure 3 we can see that our method achieves the best performance among these methods.
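The overlap-based accuracy measure used throughout this section can be sketched as follows. The paper does not spell out the exact form of the overlap ratio, so this sketch assumes intersection over union between the ground truth region and the predicted region; the function names `overlap_ratio` and `region_accuracy` are hypothetical, and only the 80% threshold is taken from the text.

```python
import numpy as np


def overlap_ratio(gt_mask, pred_mask):
    """Intersection over union of two boolean region masks (assumed form)."""
    gt = np.asarray(gt_mask, dtype=bool)
    pred = np.asarray(pred_mask, dtype=bool)
    union = np.logical_or(gt, pred).sum()
    if union == 0:
        return 0.0  # region absent from both maps
    return float(np.logical_and(gt, pred).sum() / union)


def region_accuracy(gt_labels, pred_labels, region_ids, threshold=0.8):
    """Fraction of regions whose overlap ratio exceeds the threshold."""
    gt_labels = np.asarray(gt_labels)
    pred_labels = np.asarray(pred_labels)
    hits = [
        overlap_ratio(gt_labels == r, pred_labels == r) > threshold
        for r in region_ids
    ]
    return sum(hits) / len(hits)


# Toy 1D label maps with two regions (0 and 1), 20 pixels in total.
gt = np.array([0] * 10 + [1] * 10)
pred_close = np.array([0] * 11 + [1] * 9)   # boundary off by one pixel
pred_far = np.array([0] * 14 + [1] * 6)     # boundary off by four pixels

acc_close = region_accuracy(gt, pred_close, region_ids=[0, 1])
acc_far = region_accuracy(gt, pred_far, region_ids=[0, 1])
```

If the intended measure is instead the fraction of ground truth pixels covered by the output, only the denominator in `overlap_ratio` changes.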

5. CONCLUSION
We have proposed a brain MRI image segmentation method based on semi-supervised learning. By using label propagation, region information can be transferred from known pixels to unknown ones according to the similarity defined between adjacent pixels. Our method can run in a semi-supervised manner or, when test samples are available during training, in a transductive manner. Evaluation on the IBSR brain MRI dataset shows that the proposed method outperforms the two recent methods it was compared with. By introducing different learning frameworks into brain MRI segmentation, the power of machine learning can be exploited more fully and more effective methods for MRI analysis can be developed.

Acknowledgment
The authors would like to thank Xiao-bo Huang for his professional advice on this paper. This work is supported by the Science and Technology Planning Project of Haizhu District, Guangzhou (2011-YL-05), the 2012 College Student Career and Innovation Training Plan Project (1184512043), and the 2011 Higher Education Research Fund of GDUT (2013Y04).

References
[1] Image similarity and tissue overlaps as surrogates for image registration accuracy: Widely used but unreliable. IEEE Transactions on Medical Imaging, vol. 31, no. 2, pp. 153-163.
[2] T. Logeswari and M. Karnan. Hybrid Self Organizing Map for Improved Implementation of Brain MRI Segmentation. In Proceedings of International Conference on Signal Acquisition and Processing, 2010, pp. 248-252.
[3] Jin Kyu Gahm, Geoffrey L. Kung, Daniel B. Ennis. Weighted Component-based Tensor Distance Applied to Graph-based Segmentation of Cardiac DT-MRI. In Proceedings of IEEE 10th International Symposium on Biomedical Imaging, 2013, pp. 504-507.
[4] Rie Johnson, Tong Zhang. Graph-based Semi-supervised Learning and Spectral Kernel Design. 2006.
[5] Brian Kulis, Sugato Basu, Inderjit Dhillon, Raymond Mooney. Semi-supervised graph clustering: a kernel approach. Machine Learning, 2009, 74, pp. 1-22.
[6] Kevin W. Bowyer, Karen Hollingsworth, Patrick J. Flynn. Image Understanding for Iris Biometrics: A Survey. Computer Vision and Image Understanding, 2008, 110 (2), pp. 281-307.
[7] Xiaojin Zhu, Zoubin Ghahramani, John Lafferty. Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions. In Proceedings of the 12th International Conference on Machine Learning, 2003, pp. 21-29.
[8] Doyle, P. & Snell, J. Random walks and electric networks. Mathematical Assoc. of America, 1984.
[9] Zhang, G.; Shu, X.; Liang, Z.; Liang, Y.; Chen, S. & Yin, J. Multi-instance learning for skin biopsy image features recognition. 2012 IEEE International Conference on Bioinformatics and Biomedicine, IEEE Computer Society, 2012, pp. 1-6.
[10] Pan, Q.; Zhang, G.; Zhang, X.-Y.; Huang, Z.-M. & Xiong, J. Image classification by multi-instance learning with base sample selection. International Workshop on Image Processing and Optical Engineering, 2012, pp. 83-89.
[11] Gersho, A. Asymptotically optimal block quantization. IEEE Transactions on Information Theory, 1979, 25, pp. 373-380.
[12] IBSR project: http://www.nitrc.org/projects/ibsr
[13] LibSVM: www.csie.ntu.edu.tw/~cjlin/libsvm/
[14] Chen, Y. & Wang, J. Z. Image Categorization by Learning and Reasoning with Regions. J. Mach. Learn. Res., 2004, 5, pp. 913-939.
[15] S. Javeed Hussain and A. Satya Savithri. Segmentation of Brain MRI with Statistical and 2D Wavelet Features by Using Neural Networks. IEEE, 2011, pp. 154-159.

AUTHOR
Zhang Yong, M.D., is Director of the Neurosurgery Department of Guangdong No.2 Provincial People's Hospital and a leading expert in China in the field of cranial nerve diseases.

Luo Weishi is an attending doctor of neurosurgery with a Master's degree in neurosurgery, specializing in the diagnosis and treatment of cranial nerve diseases.

Zhang Yang is a neurosurgery resident, specializing in intraoperative neural electrophysiological monitoring.

Gang Zhang is a PhD candidate in the School of Information Science and Technology at Sun Yat-sen University, China. He received his MSc degree in Computer Software and Theory from Sun Yat-sen University, China, in 2005. His current research interests include data mining, machine learning, and their applications to bioinformatics and Traditional Chinese Medicine. He is now a lecturer in the School of Automation, Guangdong University of Technology.

Qian Dongxiang holds a PhD and an MD and works at the Third Affiliated Hospital of Guangzhou Medical University. As director of the Neurosurgery Department, he is active in clinical work, medical education and medical research. His research direction is injury and repair of the central nervous system. For his contributions to several academic areas he has received many honors in neuroscience academia, and he has been appointed to various professional positions, such as associate director of the Neurosurgery branch of the Guangzhou
Institute of Medicine, member of the national commission of the Society for Neuroscience of China, member of the commission of the Neurosurgeon branch of the Guangdong Physicians Society, and so on.

Zhang Qi is a master's candidate at Guangzhou Medical University. His research direction is injury and repair of the central nervous system.

Huang Ying is an MD of the Faculty of Automation, Guangdong University of Technology. Her research directions include intelligent information processing and computer vision.
