B. Feature Selection
One of the significant steps in many artificial intelligence and pattern recognition problems is feature selection [17]. Feature selection is a preprocessing method used to identify suitable features, and it plays an important role in classification. Various feature selection approaches, together with various search methods, are available to produce a reduced data set. Filter, wrapper and hybrid methods are generally regarded as the main feature selection methods, and the results obtained from these methods differ in accuracy as well as in time.

In the feature selection process, a candidate feature subset is first produced from the original dataset; the candidate subset is then assessed by means of assessment functions such as classifier error rate, dependency, information, consistency and distance [18]. Generally, a relevancy value is produced using these functions, and these values are further used as a termination condition to conclude whether the selected feature subset is optimal.
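As a rough illustration of this generate-and-evaluate loop, the sketch below performs a greedy forward (stepwise) search scored by a CFS-style merit function that rewards feature-class correlation and penalizes feature-feature correlation. This is a toy stand-in, not WEKA's implementation; the data layout and function names are invented for the example:

```python
import math

def correlation(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def merit(subset, features, labels):
    """CFS-style merit: high average feature-class correlation,
    low average feature-feature correlation."""
    k = len(subset)
    rcf = sum(abs(correlation(features[f], labels)) for f in subset) / k
    if k == 1:
        return rcf
    pairs = [(f, g) for i, f in enumerate(subset) for g in subset[i + 1:]]
    rff = sum(abs(correlation(features[f], features[g])) for f, g in pairs) / len(pairs)
    return (k * rcf) / math.sqrt(k + k * (k - 1) * rff)

def greedy_stepwise(features, labels):
    """Forward selection: repeatedly add the feature that most improves
    the merit; stop when no addition helps (the termination condition)."""
    selected, best = [], 0.0
    while True:
        candidates = [f for f in features if f not in selected]
        if not candidates:
            break
        top, f = max((merit(selected + [f], features, labels), f)
                     for f in candidates)
        if top <= best:
            break
        selected, best = selected + [f], top
    return selected

# Tiny synthetic demo: feature "f1" mirrors the class, "f2" is noise.
data = {"f1": [0, 0, 1, 1, 0, 1], "f2": [1, 0, 0, 1, 1, 0]}
labels = [0, 0, 1, 1, 0, 1]
print(greedy_stepwise(data, labels))  # -> ['f1']
```

The stopping rule mirrors the termination condition described above: the search ends as soon as the candidate subset's relevancy score stops improving.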
Feature selection methods detect dependencies between features. In this study we use the correlation feature selection - subset evaluation (CFS) method with a greedy stepwise search for the selection of important attributes. CFS is a dimension reduction method that computes the association (correlation) between features and classes and discards the features which are not suitable. Seven features (AC, DS, DP, ASTV, MSTV, ALTV and Mean) were obtained using CFS.

C. Bagging (Bootstrap Aggregation)

In supervised machine learning, ensemble methods are very popular because of their ability to accurately predict class labels. An ensemble method uses more than one classifier to achieve better overall accuracy. Classical ensemble methods, such as bagging and boosting, have good predictive capability. The bagging method was proposed by Breiman [13].

In the bagging algorithm, N different samples called bootstrap samples [19] S1, S2, ……, SN are generated from the original training data set. A classifier Cn is built against each bootstrap sample Sn. From the classifiers C1, C2, ……, CN, a final classifier CL is built whose output is the class predicted most often by its sub-classifiers. The bagging process is shown in Fig. 1.

Bagging algorithm for multiple classification [20]:

Let T be the training set
for i = 1 to N
    Create a new set Si (bootstrap sample) by randomly selecting examples from the training set T; the size of Si must be equal to the size of T.
    Learn the classifier Ci on training set Si using the chosen machine learning algorithm.
Create a final model by combining all the classifiers (C1, C2, …, CN) through majority voting.

Figure 1. Model of Bagging Process

WEKA (Waikato Environment for Knowledge Analysis) is a tool widely used for data mining tasks [21]. In this study, using the WEKA software, three different learning algorithms, random forest, REPTree and J48, are used as base classifiers in the bagging-based classifiers.

D. Random Forest

Random forest is a combination of different decision trees, used to classify data samples into classes. It is a commonly used statistical technique for classification. The worth of each distinct tree is not essential; the purpose of a random forest is to reduce the error rate of the whole forest. The error rate depends upon two factors: the correlation between trees and the strength of the individual trees.

Two important parameters are associated with random forest. The first is the total number of trees in the forest; the second is the number of predictive variables used to split the nodes of a tree. In order to minimize the overall error rate, these two parameters should be optimized. Random forest is also an efficient method for estimating the important variables in the classification. The algorithm to construct each tree in a random forest is as follows:

If the original training data set comprises S different cases, then select S samples randomly (with replacement). These S samples are the training set for constructing the tree.

For a set of T input variables, select a distinct number t such that t < T. At each node, select t variables randomly out of T; the best split on these t variables is used for node splitting. During the construction of the forest, the value of t is kept constant.

Each tree is grown to the largest possible level without pruning.

As the error rate in a random forest depends upon correlation and strength, reducing t decreases both correlation and strength.
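The bagging pseudocode above can be made concrete with a minimal Python sketch. The majority-class "stump" is a deliberately trivial base learner standing in for J48, REPTree or random forest; all names and the toy data are invented for illustration:

```python
import random
from collections import Counter

def train_stump(sample):
    """A deliberately weak base learner: always predicts the majority
    class of its training sample (stands in for a real decision tree)."""
    majority = Counter(label for _, label in sample).most_common(1)[0][0]
    return lambda x: majority

def bagging_fit(training_set, n_classifiers, learner, rng):
    """Build N classifiers C1..CN, each trained on a bootstrap sample
    (drawn with replacement, same size as the training set T)."""
    models = []
    for _ in range(n_classifiers):
        bootstrap = [rng.choice(training_set) for _ in training_set]
        models.append(learner(bootstrap))
    return models

def bagging_predict(models, x):
    """Final classifier CL: the class predicted most often by C1..CN."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Toy data: 8 "normal" and 3 "pathological" examples (feature, label).
rng = random.Random(0)
data = ([((i,), "normal") for i in range(8)]
        + [((i,), "pathological") for i in range(3)])
models = bagging_fit(data, n_classifiers=11, learner=train_stump, rng=rng)
print(bagging_predict(models, (0,)))  # majority vote over the 11 models
```

An odd number of classifiers avoids tied votes; in WEKA the same effect is obtained by wrapping a tree learner in the bagging meta-classifier.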
E. Reduced Error Pruning Tree (REPTree)

The REPTree method was proposed by Quinlan [22]. The REPTree algorithm generates a decision tree by calculating the information gain using entropy. It helps to decrease the complexity of the decision tree model through the "reduced error pruning" method and also reduces the error which arises from variance [23]. The information gain [24] is a criterion that uses entropy as a measure and selects the attributes having maximum information gain. Let T be a set of examples containing m elements belonging to class X and n elements belonging to class Y. The information required for deciding whether a random example from T belongs to X or Y is defined as

I(m, n) = -(m/(m+n)) log2(m/(m+n)) - (n/(m+n)) log2(n/(m+n))    (1)
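Equation (1) is the standard binary entropy. A few illustrative lines of Python confirm the properties that attribute selection relies on:

```python
import math

def info(m, n):
    """I(m, n) from Eq. (1): expected bits needed to decide whether a
    random example from T belongs to class X (m cases) or Y (n cases)."""
    total = m + n
    result = 0.0
    for count in (m, n):
        p = count / total
        if p > 0:                      # lim p->0 of -p*log2(p) is 0
            result -= p * math.log2(p)
    return result

# A 50/50 split is maximally uncertain: one full bit.
assert abs(info(8, 8) - 1.0) < 1e-12
# A pure node needs no further information.
assert info(10, 0) == 0.0
# Skewed splits fall strictly between 0 and 1.
assert 0.0 < info(2, 8) < 1.0
```

Information gain for a candidate attribute is then the drop in this quantity after splitting T on that attribute, and REPTree picks the attribute with the largest drop.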
III. RESULTS AND DISCUSSION

50% of deaths were the result of unawareness about abnormal FHR patterns. Now CTG data is analyzed using data mining techniques, and these techniques are a helping hand in avoiding human mistakes and taking correct decisions.

In this study, the classification results of bagging-based random forest, REPTree and J48 are compared. Tenfold cross validation is used to avoid overfitting in the classification of normal and pathological subjects. In cross validation, 9/10 of the data is used for training the algorithm and the remainder is used for testing, repeating this step 10 times. In Table II the results of the three classifiers for correct/incorrect classification using the full (21) features and the most relevant (07) features are presented.

TABLE II: COMPARISON OF DECISION TREES CLASSIFIERS USING BAGGING APPROACH FOR THE CTG DATA.

Classifiers     Classification   21 (Complete features)   07 (Selected features)
Random Forest   Accuracy         94.73%                   93.93%
                Error            5.26%                    6.07%
REPTree         Accuracy         93.98%                   93.84%
                Error            6.02%                    6.14%
J48             Accuracy         93.56%                   93.46%
                Error            6.44%                    6.54%

TABLE III: COMPARISON OF DECISION TREES CLASSIFIERS USING BAGGING APPROACH IN TERMS OF PERFORMANCE MEASURES.

CTG Data with all features (21)
Classifiers     Performance Measures   Normal   Suspicious   Pathological
Random Forest   Precision              0.957    0.883        0.946
                Recall                 0.984    0.769        0.898
                F-Measure              0.971    0.822        0.921
REPTree         Precision              0.951    0.888        0.903
                Recall                 0.977    0.753        0.903
                F-Measure              0.964    0.815        0.903
J48             Precision              0.951    0.850        0.909
                Recall                 0.972    0.749        0.909
                F-Measure              0.961    0.796        0.909

CTG Data with selected features (07)
Classifiers     Performance Measures   Normal   Suspicious   Pathological
Random Forest   Precision              0.955    0.852        0.915
                Recall                 0.973    0.763        0.920
                F-Measure              0.964    0.805        0.918
REPTree         Precision              0.951    0.842        0.913
                Recall                 0.970    0.756        0.898
                F-Measure              0.961    0.796        0.905
J48             Precision              0.954    0.865        0.899
                Recall                 0.973    0.763        0.909
                F-Measure              0.963    0.811        0.904
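The per-class measures in Table III are the usual quantities derived from a confusion matrix. The sketch below uses a made-up three-class matrix (not the paper's actual counts) to show the computation:

```python
def per_class_metrics(confusion, classes):
    """confusion[i][j] = count of examples of true class i predicted as
    class j. Returns {class: (precision, recall, f_measure)}."""
    metrics = {}
    for k, name in enumerate(classes):
        tp = confusion[k][k]
        predicted = sum(row[k] for row in confusion)   # column sum
        actual = sum(confusion[k])                     # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        f = (2 * precision * recall / (precision + recall)
             if precision + recall else 0.0)
        metrics[name] = (precision, recall, f)
    return metrics

classes = ("Normal", "Suspicious", "Pathological")
confusion = [
    [90, 8, 2],    # true Normal
    [5, 20, 3],    # true Suspicious
    [1, 2, 15],    # true Pathological
]
m = per_class_metrics(confusion, classes)
for name, (p, r, f) in m.items():
    print(f"{name}: precision={p:.3f} recall={r:.3f} f={f:.3f}")
```

Accuracy in Table II is simply the trace of this matrix divided by the total count, which is why the two tables move together.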
Higher values of these performance measures correspond to a better classification rate. It is clear from Table III that random forest using the bagging approach provides higher values of F-Measure, Recall and Precision as compared to REPTree and J48. In the case of the full feature space, the average values of all three performance measures for random forest are (0.946, 0.947, 0.946), whereas in the case of the relevant features the averages for random forest are (0.938, 0.939, 0.938). For the complete features, the average values of all three performance measures for REPTree are (0.938, 0.940, 0.938), whereas in the case of the relevant features the averages for REPTree are (0.937, 0.938, 0.937). For all features, the average values of all three performance measures for J48 are (0.934, 0.936, 0.934), whereas in the case of the relevant features the averages for J48 are (0.933, 0.935, 0.933).

Figure 4. Comparison of Decision Trees Algorithms using Bagging Approach (accuracy bar chart for Random Forest, REPTree and J48, full features vs. selected features)

IV. CONCLUSION

In this paper, the performance of three decision tree based algorithms, namely random forest, REPTree and J48, with
the bagging approach was evaluated for the classification of CTG data. For the analysis, data sets with the complete features and with the reduced features have been used. All three classifiers have shown almost similar classification accuracies on the full feature set. Random forest performed slightly better (94.7%). The correlation feature selection - subset evaluation (CFS) method was used for the selection of relevant features. The classification accuracies of random forest and the other classifiers under the proposed bagging methodology are only negligibly degraded in the reduced feature space. It may be concluded that AC, DS, DP, ASTV, MSTV, ALTV and Mean are the most relevant features in the analysis and classification of cardiotocograms. Moreover, the random forest classifier using the bagging approach with the above mentioned seven features can be used efficiently for the classification of CTG data.

The major limitation of the study is that the bagging approach in combination with decision tree algorithms was applied to a publicly available secondary database. The authenticity of the proposed technique can be verified by using primary data for the classification of healthy and pathological subjects. Moreover, classifiers other than decision tree algorithms can also be used with the bagging approach.

REFERENCES

[1] E.M. Karabulut, and T. Ibrikci, “Analysis of Cardiotocogram Data for Fetal Distress Determination by Decision Tree Based Adaptive Boosting Approach,” Journal of Computer and Communications, vol.2, pp.32-37, 2014.
[2] G. Georgoulas, J. Spilka, P. Karvelis, V. Chudacek, C. Stylios, and L. Lhotska, “A three class treatment of the FHR classification problem using latent class analysis labeling,” in Proceedings of the 36th Annual International Conference of the IEEE Engineering in Medicine and Biology, Chicago, USA, pp.46-49, August 2014.
[3] P.J. Steer, “Has Electronic Fetal Heart Rate Monitoring Made a Difference?,” Seminars in Fetal and Neonatal Medicine, vol.13, WB Saunders, pp.2-7, 2008.
[4] T. Peterek, K. Jana, D. Pavel, and G. Petr, “Classification of cardiotocography records by random forest,” 36th International Conference on Telecommunications and Signal Processing, TSP, pp.620-623, 2013.
[5] A. Zarko, D. Devane, and G.ML. Gyte, “Continuous cardiotocography (CTG) as a form of electronic fetal monitoring (EFM) for fetal assessment during labour,” Cochrane Database of Systematic Reviews, 2006.
[6] M. Huang, and Y. Hsu, “Fetal Distress Prediction Using Discriminant Analysis, Decision Tree, and Artificial Neural Network,” Journal of Biomedical Science & Engineering, vol.5, pp.526-533, 2012.
[7] C. Sundar, M. Chitradevi, and G. Geetharamani, “Classification of Cardiotocogram Data Using Neural Network Based Machine Learning Technique,” International Journal of Computer Applications, vol.47, pp.19-25, 2012.
[8] E. Yılmaz, and Ç. Kılıkçıer, “Determination of Fetal State from Cardiotocogram Using LS-SVM with Particle Swarm Optimization and Binary Decision Tree,” Computational and Mathematical Methods in Medicine, 2013.
[9] H.W. Jongsma, and J.G. Nijhuis, “Classification of fetal and neonatal heart rate patterns in relation to behavioural states,” Eur. J. Obstet. Gynecol. Reprod. Biol., vol.21, pp.293-299, 1986.
[10] G. Georgoulas, C.D. Stylios, G. Nokas, and P.P. Groumpos, “Classification of fetal heart rate during labour using hidden Markov models,” Proc. IEEE Int. Joint Conf. Neural Networks, vol.3, pp.2471-2474, 2004.
[11] J. Spilka, V. Chudacek, M. Koucky, L. Lhotska, M. Huptych, P. Janku, G. Georgoulas, and C.D. Stylios, “Using nonlinear features for fetal heart rate classification,” Biomed. Signal Process. Control, vol.7, pp.350-357, 2012.
[12] T.G. Dietterich, “Ensemble Methods in Machine Learning,” in J. Kittler and F. Roli (Eds.), First International Workshop on Multiple Classifier Systems, Lecture Notes in Computer Science, New York: Springer-Verlag, pp.1-15, 2000.
[13] L. Breiman, “Bagging predictors,” Machine Learning, vol.24, pp.123-140, 1996.
[14] A. Frank, and A. Asuncion, “UCI Machine Learning Repository,” [http://archive.ics.uci.edu/ml], Irvine, CA: University of California, School of Information and Computer Science, 2000.
[15] D.J. Newman, S. Heittech, C.L. Blake, and C.J. Merz, “UCI Repository of Machine Learning Databases,” University of California Irvine, Department of Information and Computer Science, 1998.
[16] D.A. de Campos, J. Bernardes, A. Garrido, J. Marques-de-Sá, and L. Pereira-Leite, “SisPorto 2.0: A Program for Automated Analysis of Cardiotocograms,” J. Matern. Fetal Med., vol.5, pp.311-318, 2000.
[17] J.G. Zhang, and H.W. Deng, “Gene selection for classification of microarray data based on the Bayes error,” BMC Bioinformatics, vol.8, p.370, 2007.
[18] M.A. Hall, and A.S. Lloyd, “Feature Subset Selection: A Correlation Based Filter Approach,” International Conference on Neural Information Processing and Intelligent Information Systems, pp.855-858, 1997.
[19] B. Efron, and R. Tibshirani, “An Introduction to the Bootstrap,” Chapman & Hall, 1993.
[20] L. Breiman, “Bagging predictors,” Technical Report 421, Department of Statistics, University of California at Berkeley, 1994.
[21] M.A. Hall, F. Eibe, H. Geoffrey, P. Bernhard, R. Peter, and H.W. Ian, “The WEKA Data Mining Software: An Update,” SIGKDD Explorations, vol.11, issue 1, 2009.
[22] J.R. Quinlan, “Simplifying decision trees,” International Journal of Man-Machine Studies, vol.27, pp.221-234, 1987.
[23] I.H. Witten, and E. Frank, “Data mining: practical machine learning tools and techniques,” 2nd ed., Morgan Kaufmann Series in Data Management Systems, 2005.
[24] W. Peng, C. Juhua, and Z. Haiping, “An Implementation of ID3 Decision Tree Learning Algorithm,” School of Computer Science & Engineering, University of New South Wales, Sydney, Australia.
[25] L. Rokach, and M. Oded, “Top-Down Induction of Decision Trees Classifiers - A Survey,” IEEE Transactions on Systems, Man and Cybernetics, Part C: Applications and Reviews, vol.35, 2005.
[26] E. Alpaydın, “Introduction to Machine Learning,” The MIT Press, 2004.
[27] J.R. Quinlan, “Induction of Decision Trees,” Machine Learning, vol.1, pp.81-106, 1986.
[28] J.R. Quinlan, “C4.5: Programs for Machine Learning,” Morgan Kaufmann Publishers, 1993.