
Computers in Biology and Medicine 109 (2019) 226–234

Contents lists available at ScienceDirect

Computers in Biology and Medicine


journal homepage: www.elsevier.com/locate/compbiomed

A neural network approach to classify carotid disorders from Heart Rate Variability analysis
Laura Verde∗, Giuseppe De Pietro
Institute of High Performance Computing and Networking (ICAR-CNR), Via Pietro Castellino, 111, Naples, Italy

ARTICLE INFO

Keywords:
Carotid diseases
Signal processing
HRV analysis
Correlation-based feature selection
Artificial neural networks

ABSTRACT

Background: Atherosclerosis is a progressive process responsible for most heart diseases and ischemic stroke. It constitutes, in fact, the most common cause of stroke in middle-aged people. To avoid or, at least, limit the disabling deficits that may derive from a carotid disease, a prompt and early diagnosis is necessary. The diagnostic technique used to detect a carotid disease is the echo-color Doppler. Unfortunately, this method is not free from errors, due to manufacturer mistakes or its operator dependence.
Methods: In this study, we propose an automated methodology capable of identifying the presence of a carotid disease from the Heart Rate Variability analysis of electrocardiographic signals. A Correlation-based Feature Selector for data reduction and Artificial Neural Networks are used to distinguish between pathological and healthy subjects.
Results: A series of tests was carried out to evaluate the proposed approach, using electrocardiographic signals selected from an available database, in order to analyse its classification ability in comparison with other algorithms existing in the literature. The results obtained show that the proposed approach provides values of accuracy, sensitivity, specificity, precision, F-measure and ROC area, respectively equal to 90.5%, 97.7%, 72.9%, 89.7%, 93.5% and 0.957, better than those achieved by other algorithms.
Conclusions: Considering the achieved accuracy, our methodology is more effective than any of the main algorithms existing in the literature. It is important to note that this approach is proposed as a support for the diagnosis of a carotid disorder through a non-invasive approach.
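
The percentages quoted in the Results can be cross-checked with a few lines of arithmetic. The sketch below is a minimal illustration in Python, not the authors' code. The confusion-matrix counts are not given in the text: they are an assumption, chosen because they agree with the reported accuracy, sensitivity, specificity, precision and F-measure to within about 0.1 percentage point for the group sizes stated in Section 4.1 (89 pathological and 37 healthy subjects). The function name is likewise hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return the usual binary-classification metrics as fractions in [0, 1]."""
    sensitivity = tp / (tp + fn)   # share of pathological subjects detected
    specificity = tn / (tn + fp)   # share of healthy subjects recognised
    precision = tp / (tp + fp)     # share of positive predictions that are correct
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
    }


# Hypothetical counts (assumption, not reported in the paper):
# 89 pathological subjects = 87 TP + 2 FN, 37 healthy subjects = 27 TN + 10 FP.
metrics = classification_metrics(tp=87, fn=2, tn=27, fp=10)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

Other count combinations may fit the published figures equally well, so this should be read only as a consistency check on the definitions, not as a reconstruction of the experiment.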

1. Introduction

Atherosclerosis is a degenerative disease characterized by a narrowing of the arteries due to a build-up of plaque on the artery walls. This plaque is composed of an accumulation of lipids, fibrous materials, calcium and other substances from the blood. Several arteries of the human body can be damaged by atherosclerosis, such as the carotid arteries, which are in the neck and supply blood to the brain. The formation of plaque can cause a reduction in the blood and oxygen that reach the brain tissue and cells, with serious consequences. Carotid atherosclerotic diseases constitute the most common cause of stroke in middle-aged people, in fact accounting for 20% of ischemic stroke [1]. The precise causes of stroke are still unknown. It is thought that it can be caused by a response to the damage to the endothelium of the arteries from high blood pressure, high cholesterol and cigarette smoking. Other modifiable risk factors are diabetes, obesity and physical inactivity. Heredity, gender and age constitute, instead, non-modifiable risk factors.
Arterial compliance can be measured with several techniques. Arterial blood flow, pressure and diameter changes can, in fact, be observed with invasive and sophisticated clinical measurements, such as angiography, or non-invasive ones, such as computed tomography or Doppler ultrasonography (US). The latter is the most frequently used examination to evaluate the carotid arteries [1], permitting the assessment of both the morphological characteristics of the plaque and the flow peculiarities in the carotid arteries.
The procedure to execute carotid US examinations is not standardized from laboratory to laboratory [2], a fact which can cause errors in the measurements. US is, in fact, highly operator-dependent. A lack of experience and insufficient training can cause, for example, the placement of the Doppler probe at the wrong Doppler angle [2] or an inappropriate selection of the Doppler setting parameters, such as the aperture size or beam steering, with a consequent incorrect diagnosis. Although some inaccuracies are due to manufacturer mistakes or machine faults, the most significant errors are, in fact, caused by an incorrect procedure performed by the technologist [3]. Additionally,


∗ Corresponding author.
E-mail addresses: laura.verde@icar.cnr.it (L. Verde), giuseppe.depietro@icar.cnr.it (G. De Pietro).

https://doi.org/10.1016/j.compbiomed.2019.04.036
Received 16 January 2019; Received in revised form 18 April 2019; Accepted 27 April 2019
0010-4825/ © 2019 Elsevier Ltd. All rights reserved.
L. Verde and G. De Pietro Computers in Biology and Medicine 109 (2019) 226–234

motion artefacts of the patient or fluctuations in the walls of the blood vessels can influence the Doppler signal acquisition and consequently its processing and interpretation, due to the high sensitivity of US to motion.
For this reason, we have searched for an alternative technique that can support the diagnosis of a carotid disease and help correct diagnostic errors. A notable association between the state of health of the carotid arteries and the Heart Rate Variability (HRV) has been demonstrated by several studies existing in literature [4,5]. This can constitute an alternative method of diagnosing the presence, or not, of a carotid disease. The analysis of HRV is a non-invasive approach for the evaluation of the cardiovascular activity through the estimation of several characteristic parameters capable of providing valuable insights into physiological and pathological conditions [6,7], such as arrhythmia [8] or the systolic blood pressure drop due to orthostatic hypotension [9]. Measurements of HRV are simple to perform, non-invasive and easily reproducible. Many commercial devices provide an automated measurement of the HRV, reducing the operator dependence. Currently, there are numerous accurate and simple tools for both research and clinical studies capable of analyzing electrocardiographic (ECG) signals in real-time [10,11].
Signal processing and Artificial Intelligence (AI) techniques can assist and support a medical diagnosis. Several areas of science and engineering adopt these techniques, such as medical imaging applications [12,13] or signal processing [14,15].
Based on these considerations, we propose a novel approach for the computerized recognition of a carotid disease based on HRV analysis that can assist medical specialists in the task of classifying these disorders. The classification methodology consists of the Correlation-based Feature Selector (CFS) for data reduction and Artificial Neural Networks (ANN) to distinguish between pathological and healthy subjects. We have conducted a comparative study of the main machine learning techniques capable of identifying carotid diseases using HRV parameters in Ref. [16]. The performances are evaluated in terms of accuracy, sensitivity, specificity, precision, F-measure and ROC area.
This paper is organized as follows. In Section 2, the main studies about methods able to detect carotid diseases are presented, while Section 3 introduces the proposed approach. The testing phase, focusing on the dataset used in the analysis, the several machine learning classifiers adopted to compare the performances, and the results achieved are presented in Section 4 and discussed in Section 5. Finally, our conclusions and plans for future work are presented in Section 6.

2. Related work

In relation to carotid diseases, several researchers have investigated the use of AI techniques to support specialist diagnosis. Several studies use a data mining approach to relate HRV parameters and carotid indexes. Unfortunately, in the majority of these research projects, this approach is used to estimate the presence of a cardiovascular disease. Lee et al. [17], for example, identified coronary heart disease using HRV features and the measurement of the carotid arterial wall thickness. Ninety-nine patients with abnormal coronary arteries and 94 healthy subjects selected from a private database were studied. Various experiments were conducted using linear and non-linear features of the HRV to evaluate the classifiers. At the conclusion of these experiments, the authors claimed that Support Vector Machine (SVM) and Bayesian classifiers outperformed the other classifiers. Additionally, Kim et al. [18] identified cardiovascular diseases through a data mining approach using the HRV and images of the carotid arteries. The SVM and Classification based on Multiple Association Rule (CMAR) showed the best accuracies tested on a private database composed of 214 subjects. Moreover, several studies have discussed the automated characterization of coronary artery disease (CAD) using linear and non-linear HRV features, such as [19–21].
The main studies, instead, that examine the possibility of identifying carotid diseases directly by using a data mining approach have analysed carotid artery Doppler signals or US images, using, therefore, US data that could be affected by errors. Additionally, these studies have used private and not publicly available databases, limiting the reproducibility of the tests. Latifoglu et al. [22], for example, classified the presence of pathologies using Artificial Immune Recognition Systems (AIRS). A private database consisting of 114 subjects was analysed to evaluate the performance of the proposed system. A supervised AIRS classifier was also used in the study described in Ref. [23]. In this study, the authors developed a model where the input of the classification system is the maximum envelope of the carotid artery Doppler signals. The classification accuracy was tested on a private database composed of 60 patients suffering from atherosclerosis and 54 healthy subjects.
Uguz et al. [24], instead, developed a system based on a Learning Vector Quantization Neural Network (LVQNN) to classify the Doppler signals of 191 subjects by using a 5-fold cross validation. Additionally, Digernali et al. [25], using Principal Component Analysis (PCA) and an ANN architecture, classified 57 pathological and 50 healthy subjects, selected from a private database of carotid arterial Doppler US signals. A complex-valued artificial neural network was, instead, developed by Ceylan et al. [26], which was able to classify the presence of disease in 78 subjects (38 pathological and 40 healthy). An ANN was also used by Samiappan et al. [27] to classify carotid abnormalities in 361 US carotid artery images selected from a private database. A diagnosis model of artery stenosis was built, instead, by using the SVM and transfer function (TF), important parameters for the analysis and understanding of the hemodynamics of the carotid arteries, in a study presented in Ref. [28].

Table 1
Selected studies concerning automated carotid disease detection.
Ref. | Database | Data analysed | Features | Approach | Performance
[22] | private | Carotid artery Doppler signals | Spectral attributes | AIRS | acc: 99.3%; sen: 98.2%; spec: 100%
[23] | private | Carotid artery Doppler signals | Maximum envelope of Doppler sonograms | SAMA | acc: 98.9%; sen: 99.6%; spec: 97.7%
[24] | private | Carotid artery Doppler signals | Power spectral estimates | LVQNN | acc: 97.9%
[25] | private | Carotid artery Doppler signals | Spectral attributes | PCA-ANN | acc: 97.0%; sen: 97.3%; spec: 96.6%
[26] | private | Carotid artery Doppler signals | Spectral attributes | PCA-CVANN | acc: 81%
[26] | private | Carotid artery Doppler signals | Spectral attributes | FCMPCA-CVANN | acc: 100%
[27] | private | Carotid artery ultrasound images | Image features | ANN | acc: 89.4%; sen: 88.4%; spec: 77.8%; PPV: 86.7%; NPV: 53.8%
[28] | private | Samples using the transmission line model of human artery tree | TF | SVM | acc: 76.0%
acc: accuracy; sen: sensitivity; spec: specificity; PPV: positive predictive value; NPV: negative predictive value.

Table 1 summarizes the different approaches outlined so far. Analysing these works, we can observe that there are not many studies


concerning the automated characterization of carotid diseases using HRV analysis. In most cases, carotid diseases are identified by analysing carotid artery Doppler signals, while the HRV parameters are mainly employed in the estimation of cardiovascular pathologies. Additionally, all the proposed techniques were tested using private databases, thereby limiting the test reproducibility and comparison between the obtained results. On the basis of these considerations, we now propose a technique capable of detecting a carotid disease not by evaluating Doppler signals that can be a source of error, as discussed in Section 1, but based on an evaluation of the characteristic HRV parameters, which are easy to estimate and constitute accurate indicators of several health-related issues. The classification accuracy of our approach has been tested in comparison with the main techniques existing in literature using an available database to allow the reproducibility of the tests and a valid comparison.

3. The proposed methodology

The proposed methodology, capable of distinguishing between healthy and pathological subjects suffering from a carotid disease, consists of processing ECG signals in order to extract characteristic features, the HRV parameters. These ECG signals can be acquired by using suitable wearable sensors, such as the Zephyr Bioharness BH3 [29], that can be integrated in easy m-health solutions, such as the one described in Ref. [11], or can be retrieved from relevant databases. A fundamental and essential preliminary task is the denoising of the ECG signal to reduce the unwanted noise added during the acquisition. Subsequently, the estimated HRV parameters are evaluated to select those most relevant to the predictive model by using a specific feature selection method, CFS [30]. Finally, the selected features are used as inputs to the ANN classifier, which identifies the presence, or not, of a pathology in the subjects involved in the study.
An overview of the processing steps for the carotid health state evaluation is shown in Fig. 1. More details about the proposed methodology are presented in the following subsections.

3.1. Preprocessing

The ECG signal represents the electrical activity of the heart. It provides relevant information about the functional conditions of the circulatory system and it is useful for the diagnosis and monitoring of many cardiovascular diseases. Unfortunately, ECG signals are subject to contamination by various noises. The particular sensitivity of the ECG to the noise is due to the low frequency-band of the signal (0.5–150 Hz). This band contains several sources of noise, such as, for example, the fluctuations of the human organs and the consequent interference between the ECG and the electromyogram (EMG) signal. The EMG signal identifies the presence of a neuromuscular disease through the detection of the electrical activity of the muscles. Other important sources of noise are due to the materials used in the recording of the ECG signal, causing the well-known baseline wandering and power line interferences. The former is due to the electrodes and the movements of the patient during the recording of the ECG signal. Instead, the latter noise is caused by the influence of the power line frequency of the recording machines (50 or 60 Hz).
Therefore, ECG denoising is a fundamental preliminary task before the analysis of the signal. Due to the several sources of noise, the denoising of the ECG signal constitutes a difficult problem. Several techniques have been proposed in literature, such as methods that use filter banks [31], wavelet transform, empirical mode decomposition [32] or recursive filter [33]. In our study, to reduce the power line noise, baseline wander and other noise components a band-pass filter at 50 Hz was used. Next, to estimate the HRV parameters, as specified in the following subsections, a series of filters was employed. In detail, in accordance with the Pan-Tompkins algorithm [34], first, a five-point derivative was used to detect the pulse wave. After differentiation, the signal was squared point by point to emphasize the higher frequencies. Secondly, a moving-window integration was applied to obtain waveform feature information in addition to the slope of the R wave. The length of the window is 150 ms.

3.2. Data processing

After denoising the ECG signal recordings, appropriate features are extracted. HRV parameters are used as features necessary to classify a sample as healthy or pathological. Additionally, a selection of the characteristic features is performed to optimize the classification. The extraction and selection of features are presented in the following subsections.

3.2.1. Feature extraction

ECG signals are used to extract the characteristic parameters of the HRV analysis. This constitutes one of the main methods for the quantification of the functions of the cardiovascular system, focusing on the analysis of beat-to-beat fluctuations in the heart rate [6].
Several methods can be used to evaluate variations in the heart rate. One of the most straightforward consists in the evaluation of these parameters in the time domain. In these measures, the QRS complexes are generally detected, together with the so-called normal-to-normal (NN) intervals. These are defined as the intervals between adjacent QRS complexes resulting from sinus node depolarizations. Simple statistical variables can be evaluated, such as the mean of the RR intervals, the standard deviation of the NN intervals (SDNN) or the square root of the mean squared differences of successive NN intervals (RMSSD). Additionally, geometric measures can be evaluated, such as the HRV triangular index and the triangular interpolation of the NN (TINN).
The measurements estimated in the frequency domain are fundamental. Power spectral density (PSD) analysis is commonly used to provide information about how power distributes as a function of frequency. In detail, three spectral components are distinguished in a spectrum: very low frequency (VLF), low frequency (LF) and high frequency (HF) components. Variations of the modulations of the heart period influence the central frequency of the LF and HF. The VLF band (0.0033–0.04 Hz) is, in fact, constituted by rhythms with periods between 25 and 300 s, while the LF band (0.04–0.15 Hz) comprises rhythms with periods between 7 and 25 s and is affected by breathing from 3 to 9 bpm. Breathing from 9 to 24 bpm influences the HF band (0.15–0.40 Hz). The ratio between the sympathetic nervous system (SNS) and parasympathetic nervous system (PNS) activity under controlled conditions is estimated by the ratio of LF to HF power (LF/HF ratio).
Finally, non-linear methods are used to characterize the HRV,

Fig. 1. Overview of the processing steps for carotid health state evaluation.
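
The filtering stages named in Section 3.1 (five-point derivative, point-by-point squaring, moving-window integration over 150 ms) can be illustrated with a short sketch. It is a simplified, pure-Python illustration and not the authors' Matlab implementation: the derivative coefficients follow the commonly cited Pan-Tompkins form, delayed by two samples so that it is causal, while the 200 Hz sampling rate, the function names and the toy input are assumptions made only for this example.

```python
def five_point_derivative(x):
    """Causal five-point derivative: y[n] = (x[n] + 2x[n-1] - 2x[n-3] - x[n-4]) / 8.

    Samples before the start of the signal are treated as zero and the
    1/T scale factor is omitted (unit sample spacing).
    """
    def at(i):
        return x[i] if i >= 0 else 0.0

    return [(at(n) + 2 * at(n - 1) - 2 * at(n - 3) - at(n - 4)) / 8.0
            for n in range(len(x))]


def square(x):
    """Point-by-point squaring of the differentiated signal."""
    return [v * v for v in x]


def moving_window_integration(x, fs, window_s=0.150):
    """Running mean over a window of `window_s` seconds (150 ms by default)."""
    n_win = max(1, round(window_s * fs))
    out, running = [], 0.0
    for n, v in enumerate(x):
        running += v
        if n >= n_win:
            running -= x[n - n_win]  # drop the sample that left the window
        out.append(running / n_win)
    return out


def qrs_enhancement(ecg, fs):
    """Chain the three stages on an ECG that is assumed to be already filtered."""
    return moving_window_integration(square(five_point_derivative(ecg)), fs)


# Toy input (assumption): a flat line with one narrow triangular spike at 200 Hz.
fs = 200
ecg = [0.0] * 100
ecg[48:53] = [0.2, 0.6, 1.0, 0.6, 0.2]
envelope = qrs_enhancement(ecg, fs)
print(max(envelope) > 0.0)
```

A complete detector would add the band-pass stage and adaptive thresholds on the integrated signal to locate the R peaks; those parts are left out here because the text does not describe them in detail.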


Table 2
Selected HRV features.
HRV Features | Units | Description
Time Domain
Mean RR | ms | Mean of RR intervals
SDRR | ms | Standard deviation of the RR intervals
SDNN | ms | Standard deviation of NN intervals
SDSD | ms | Standard deviation of the successive difference RR intervals
RMSSD | ms | Square root of the mean of the squares of the successive differences between adjacent NNs
Frequency Domain
HF Components | % | High Frequency power, from 0.15 Hz to 0.4 Hz
LF Components | % | Low Frequency power, from 0.04 Hz to 0.15 Hz
VLF Components | % | Very Low Frequency power, from 0 Hz to 0.04 Hz
LF/HF Components | – | Ratio between LF and HF band powers
Geometrical measures
TINN | ms | Baseline width of the RR interval histogram
RR triangular index | – | The integral of the RR interval histogram divided by the height of the histogram
Nonlinear measures
Stress Index | – | Square root of Baevsky's stress index
SD1 | ms | Standard deviation of the distance of RR(i) from the line y = x in the Poincaré plot
SD2 | ms | Standard deviation of the distance of RR(i) from the line y = -x + 2RR in the Poincaré plot
SD2/SD1 | – | Ratio between SD2 and SD1
ApEn | – | Approximate entropy
SampEn | – | Sample entropy

Table 3
Descriptive statistics of the considered features.
Feature | Mean | Standard Deviation | 1st Quartile | 2nd Quartile | 3rd Quartile | MS
Mean RR | 848.7 | 134.1 | 772.3 | 846.0 | 929.5 | 0.204
SDRR | 30.5 | 64.8 | 13.3 | 18.5 | 27.6 | 0.098
SDNN | 137.6 | 15.9 | 61.1 | 112.2 | 154.8 | 0.033
SDSD | 29.6 | 79.4 | 10.3 | 14.3 | 22.7 | 0.007
RMSSD | 67.4 | 19.6 | 69.9 | 131.8 | 194.5 | 0.013
HF Components | 53.2 | 18.4 | 40.1 | 52.0 | 66.3 | 0.130
LF Components | 38.3 | 14.9 | 28.1 | 40.5 | 48.1 | 0.130
LF/HF Components | 1.5 | 5.4 | 0.5 | 0.8 | 1.2 | 0.175
TINN | 855.2 | 10.1 | 516.8 | 636.0 | 792.0 | 0.012
RR triangular index | 4.0 | 3.4 | 2.8 | 3.5 | 4.4 | 0.022
Stress Index | 11.1 | 59.3 | 4.4 | 5.5 | 7.2 | 0.059
SD1 | 118.6 | 13.1 | 51.1 | 93.3 | 137.8 | 0.013
SD2 | 151.6 | 18.3 | 70.7 | 118.7 | 178.1 | 0.133
SD2/SD1 | 11.8 | 11.9 | 1.1 | 1.4 | 1.7 | 0.139
ApEn | 0.6 | 0.2 | 0.4 | 0.6 | 0.6 | 0.008
SampEn | 0.2 | 0.1 | 0.1 | 0.2 | 0.3 | 0.057
Age | 72 | 7 | 66 | 70 | 77 | 0.170
Gender (a) (male/female) | 80/46 | | | | | 0.090
Weight | 78 | 12 | 69 | 78 | 86 | 0.038
Height | 167 | 7 | 162 | 168 | 172 | 0.014
(a) For gender, we indicated the number of males (80) and females (46) involved in this study.
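
Several of the time-domain and Poincaré entries listed in Table 2 can be computed directly from a series of RR intervals. The sketch below is illustrative only and is not how the values in this study were obtained (the text reports the use of Matlab and Kubios HRV). It assumes population-style standard deviations, takes SD1 and SD2 from the widely used approximations SD1² = SDSD²/2 and SD2² = 2·SDNN² − SDSD²/2, treats every interval as a normal beat, and uses an invented RR series.

```python
import math
import statistics


def hrv_features(rr_ms):
    """Basic HRV descriptors from a list of RR intervals in milliseconds."""
    if len(rr_ms) < 3:
        raise ValueError("need at least three RR intervals")
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]  # successive differences

    mean_rr = statistics.fmean(rr_ms)
    sdnn = statistics.pstdev(rr_ms)    # all beats assumed normal (NN = RR)
    sdsd = statistics.pstdev(diffs)
    rmssd = math.sqrt(statistics.fmean([d * d for d in diffs]))

    # Poincare descriptors from the variance identities (approximation).
    sd1 = math.sqrt(0.5 * sdsd ** 2)
    sd2 = math.sqrt(max(0.0, 2.0 * sdnn ** 2 - 0.5 * sdsd ** 2))

    return {"mean_rr": mean_rr, "sdnn": sdnn, "sdsd": sdsd,
            "rmssd": rmssd, "sd1": sd1, "sd2": sd2}


# Invented RR series (assumption), in milliseconds.
rr = [800, 820, 840, 860, 880, 860, 840, 820]
for name, value in hrv_features(rr).items():
    print(f"{name}: {value:.1f} ms")
```

Tools such as Kubios HRV may use sample rather than population standard deviations and apply artefact correction first, so small numerical differences from such software are to be expected.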

through, for example, the Poincaré plot or entropy measures. In detail, the Poincaré plot is a representation of a time series in a phase space or, better, it is a graph in which each RR interval is plotted against the next RR interval. The Poincaré plot analysis is a quantitative technique able to provide visual information about the behaviour of the heart. In this analysis the Standard Deviation 1 (SD1) and Standard Deviation 2 (SD2) are evaluated. SD1 is defined as the standard deviation of the instantaneous beat-to-beat R-R variability, while SD2 is the standard deviation of the long term R-R interval variability. The entropy measures, such as Approximate Entropy (ApEn) and Sample Entropy (SampEn), are widely used to identify the irregularities of data typical of physiological mechanisms or diseases.
Nowadays, these parameters are easy to estimate. Automated measurements of the HRV are now provided by many commercial and reliable tools, such as Kubios HRV [35]. In our study we have considered appropriate HRV measurements, reported in Table 2, estimated with the Pan-Tompkins algorithm [34] by using the Matlab [36] and Kubios HRV software. Additionally, physiological information about the patient has been used, such as gender, age, weight and height, due to the relationship between these data and the characteristics of carotid artery diseases [37,38].

3.2.2. Feature selection

A feature is a quantitative property of a process and a set of features can efficiently describe the input data necessary for any machine learning classifier to perform a classification. To optimize the classifier, it is necessary to remove any irrelevant features and insignificant instances to increase the prediction accuracy, allowing the learning classifier to work faster and more effectively.
Several feature selection methods exist in literature [39]. In this study, the Correlation-based Feature Selector (CFS) [30] was adopted to select the relevant data. It belongs to the so-called filter methods, widely adopted in literature for data reduction. Favoured for their simplicity and reliability, they use variable ranking techniques to select the relevant features. In particular, the CFS method is able to evaluate the predictive capability of each attribute, highlighting the set of attributes that is most closely correlated with the class. The feature subsets are classified according to a correlation-based heuristic evaluation function, defined as:

$$ M_S = \frac{k\,\overline{r_{cf}}}{\sqrt{k + k(k-1)\,\overline{r_{ff}}}} \tag{1} $$

where $M_S$ is the heuristic “merit” of a feature subset S containing k features, $\overline{r_{cf}}$ is the mean feature-class correlation, and $\overline{r_{ff}}$ is the average feature-feature inter-correlation. The numerator of Equation (1) provides an indication of the predictive character of a set of features for the class, while the denominator specifies the redundancy of the features.
In this study, we have considered the attributes that have a CFS merit higher than 0.10, which gave the best results in terms of accuracy in distinguishing between healthy and pathological subjects.
CFS was chosen as the method to optimize and improve the performance of the classifier, due to its computational efficiency and avoidance of overfitting, as indicated in several studies [39,40], and its greater capability in selecting the most relevant data able to achieve the best classification accuracy in distinguishing between pathological and healthy subjects. A comparative study between the performances achieved using CFS and other feature selection methods has been carried out. The results obtained are reported in Section 4.
Table 3 shows the descriptive statistics of the features considered. For each factor, the mean, standard deviation and the first, second and third quartiles are presented. Additionally, the heuristic merit MS is reported.

3.3. Classification

The classification between healthy and pathological subjects was performed using an ANN. This is a parallel information processing system that simulates the human neural structure. An ANN consists of an appropriate number of elements that process the information and adjust the network to provide different responses by using inputs and desired outputs.
In our study, we implemented a multilayer feed-forward ANN using the Matlab software package [36]. This type of ANN is an accurate predictor for the classification problem, particularly used due to the execution speed of the trained network, advantageous in signal processing applications [41]. A multilayer perceptron (MLP) is constructed, composed of an input layer, an output layer and one or more hidden layers. The processing elements are connected with each other and the strengths of the connections are called weights.
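
Looking back at the feature-selection step of Section 3.2.2, the merit of Equation (1) is straightforward to evaluate once the correlations are known. The sketch below assumes the usual form of the CFS merit, with a square root in the denominator and k(k − 1) feature-pair terms. Taking absolute values of the correlations is a choice made here, and the function name and example correlations are invented for illustration rather than taken from Table 3.

```python
import math


def cfs_merit(feature_class_corr, feature_feature_corr=()):
    """Heuristic merit of a feature subset.

    feature_class_corr: one correlation per feature with the class label.
    feature_feature_corr: correlations between pairs of features in the subset.
    """
    k = len(feature_class_corr)
    if k == 0:
        raise ValueError("the subset must contain at least one feature")
    r_cf = sum(abs(r) for r in feature_class_corr) / k
    if feature_feature_corr:
        r_ff = sum(abs(r) for r in feature_feature_corr) / len(feature_feature_corr)
    else:
        r_ff = 0.0  # a single feature has no inter-correlation
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


# Invented correlations (assumption): two equally predictive features.
print(cfs_merit([0.4, 0.4], [0.0]))  # uncorrelated features: merit increases
print(cfs_merit([0.4, 0.4], [1.0]))  # fully redundant features: no gain
```

With a single feature the expression reduces to that feature's correlation with the class, which is one way to read a per-feature merit threshold such as the 0.10 used in this study.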
In detail, neurons can be modeled as non-linear functions that convert n inputs $(x_1, x_2, \ldots, x_i, \ldots, x_n)$ into an output $y$. The result $z$, expressed by Equation (2), is obtained by performing a weighted sum of the input data with a weight $(w_1, \ldots, w_n)$, associated to each input, and with a bias $w_0$, used to adjust the working point of the neuron itself, that is:

$$ z = \sum_{i=1}^{n} (w_i x_i) + w_0 \tag{2} $$

The output of the model $y$ is given by applying a specific activation function $f$, that is:

$$ y = f(z) \tag{3} $$

In this study the activation function is the sigmoid function [42]. The curve of the sigmoid function crosses 0.5 at z = 0 and generates outputs between 0 and 1.
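
Equations (2) and (3) translate directly into code. The following is a minimal sketch of a single sigmoid neuron; the weights, bias and inputs are arbitrary illustrative values, not parameters of the trained network, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def neuron_output(inputs, weights, bias):
    """Equation (2) followed by Equation (3): y = f(sum(w_i * x_i) + w_0)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


# Arbitrary example values (assumption): the weighted sum is exactly zero,
# so the output sits at the midpoint of the sigmoid.
print(neuron_output([1.0, 2.0], [0.5, -0.25], bias=0.0))
```

A layer of the network applies this computation once per neuron to the same input vector, and the outputs of one layer become the inputs of the next.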
The proposed neural network is constituted by two hidden layers with, respectively, 27 and 10 neurons. To identify the most efficient network topology, several approaches can be adopted, such as heuristic or exhaustive searches, pruning or genetic algorithms. In our study, we have conducted an exhaustive search to identify how many layers and nodes should be used in each layer to obtain the architecture capable of achieving the best classification accuracy, thereby limiting the problems of overfitting [43], and computational time. The computational simplicity and speed, and the reliability of our proposed methodology, in fact, are fundamental for its integration, as a possible future development, into a mobile health (m-health) solution for a mobile device.
The training procedure, necessary to adjust the connection weights, is achieved by using a Levenberg-Marquardt back propagation algorithm [44], a remarkably efficient tool and strongly recommended for neural network training. It provides a numerical solution to the problem of minimizing a non-linear function and is preferable to other methods existing in literature, such as the steepest descent algorithm or Gauss-Newton algorithms. The former is an inefficient algorithm due to its slow convergence, but is improved by the application of the latter. However, this improvement only occurs when the quadratic approximation of the error function is reasonable as, otherwise, the Gauss-Newton algorithm would be mostly divergent.
During the training procedure, the difference between the output of the system and the desired response is reduced by the continuous adjustment of the weights. This difference can be referred to as the error and can be measured as the Mean Squared Error (MSE). An overview of the proposed neural network architecture is shown in Fig. 2.

Fig. 2. An overview of the proposed ANN classifier.

4. Evaluation of the proposed methodology

To evaluate the performance of the proposed methodology, an experimental phase has been carried out. The accuracy, sensitivity, specificity, precision and F-measure were determined to analyse the ability of our proposed methodology to classify correctly a subject as healthy or pathological. Moreover, a receiver operating characteristic (ROC) analysis was performed. This analysis is an appropriate means to display sensitivity and specificity relationships when a predictive output for two possibilities is continuous. In detail, the performance of the model is measured by calculating the area under the ROC curve (AUC). This may have values between 0 and 1: an AUC is equal to 1 when the algorithm correctly classifies all samples, while it is equal to 0 when the algorithm incorrectly classifies all subjects.
The goodness of the proposed methodology was evaluated according to its classification accuracy, comparing this with that achieved by the most frequently used machine learning techniques. A ten-fold cross validation was performed to evaluate the capability in classifying correctly the presence or not of a carotid disease. We divided, randomly, the data into 10 subsets. Each time a subset was used as the test set, the remaining 9 subsets formed the training set. This process was repeated until each subset had been used as the test set.
In the following subsections, we present the dataset used to perform the experimental tests and the results achieved using the other considered feature selection methods and machine learning techniques.

4.1. The dataset

The HRV analysis was performed by processing suitable ECG signals selected from the “Smart Health for Assessing the Risk of Events via ECG database” [45] available on the PhysioNet website [46]. This dataset consists of 89 subjects suffering from a carotid disease (59 males and 30 females) and 37 healthy subjects (21 males and 16 females).
The presence of a carotid disease is evaluated by analysing the value of the intima-media thickness (IMT). This measures the thickness of the intima and media layers of the carotid arteries and constitutes the main marker to identify the presence of a carotid disease [47]. An image processing workstation with the COMPACS software (Rev. 10.5.8, Medimatic, Genoa, Italy) was used to estimate the IMT, while the images were acquired by using a B-mode ultrasonography device (SONOS 5500, Philips). Three trained and experienced physicians analysed all the measurements.

4.2. Classification accuracy

To define our classification methodology, we have performed an exhaustive search of the main machine learning techniques capable of achieving the best classification accuracy in distinguishing between healthy and pathological subjects [16]. We have compared the classification accuracy of several methods to identify the most reliable one, analysing several aspects, such as the feature selection method to remove unnecessary and irrelevant features and improve the performance.
Feature selection is an important step in data mining, necessary to extract the most relevant and useful information to improve the classification accuracy by removing redundant, irrelevant and noisy features. In order to choose the best approach able to identify the most significant features, we have conducted an extensive evaluation among the most commonly used feature selection methods.
Besides the CFS method, other approaches exist in literature [39]. Principal Component Analysis (PCA) [48], for example, is a

mathematical procedure capable of transforming a number of correlated variables into a smaller number of uncorrelated variables called principal components. This technique is based on the idea of finding an appropriate reference system that maximizes the variance of the variables represented along its axes. This requires a linear transformation of the original variables that projects them into a new Cartesian system in which they are ordered by decreasing variance. Therefore, the variable with the greatest variance is represented on the first axis, the one with the second greatest variance on the second axis, and so on. The reduction of complexity is obtained by retaining only the main variables, in terms of variance, among the new variables. A condition necessary for the correct use of PCA is that the main components are independent.

The Relief algorithm [49], proposed by Kira and Rendell in 1992, is instead a filter-based approach that assigns a weight to each feature according to its relevance to the class. All the weights are initially equal to zero and are then updated iteratively. The algorithm outputs a weight between −1 and +1 for each attribute. A greater positive weight indicates a greater predictive value for the attribute and, at the end of the process, the attributes with the highest weights are selected.

The results obtained using our methodology were compared with those achieved with the PCA and Relief algorithms as the feature selection methods, as shown in Table 4. This comparison indicates that the best performance was achieved by using CFS to select the significant features, which provides a further reason to choose CFS as the feature selector, in addition to its computational efficiency and ease of use, as indicated in Section 3.2.2.

Table 4
Comparison of the results obtained with other feature selectors.

Method             Accuracy (%)   Sensitivity (%)   Specificity (%)   Precision (%)   F-measure (%)   ROC Area
Our Methodology    90.5           97.7              72.9              89.7            93.5            0.957
All features       84.4           87.7              76.5              83.6            86.2            0.859
PCA [48]           81.3           88.9              62.9              84.6            86.1            0.886
Relief [49]        86.0           86.6              74.6              87.6            87.2            0.857

The accuracy obtained in classifying the presence or absence of a carotid disease is equal to 90.5%, better than that of the other methods. In fact, using our ANN architecture with the PCA and Relief algorithms as the feature selectors, the accuracies obtained were, respectively, 81.3% and 86%. Our results are also better than those achieved considering all features (84.4%). Although the specificity is lower than that obtained with all features or with the Relief algorithm (72.9% vs 76.5% and 74.6%, respectively), our approach recognizes the presence of a carotid disease when the subject does indeed suffer from this pathology, achieving a sensitivity of about 98%.

4.3. Comparison with other classifier models

In order to provide an exhaustive analysis, we compared the performance of our proposed methodology with the main machine learning techniques existing in the literature. Each technique was chosen as a representative of a class of algorithms based on similar characteristics. It is important to remark here that other classification techniques are not reported in this study due to their poor performance during our experiments. The analysis was performed using the WEKA tool [41], one of the most commonly used tools for data mining tasks.

In detail, the classifiers considered were:

• Support Vector Machine (SVM) [50]: this is one of the most commonly used machine learning techniques due to its flexibility, simplicity and resistance to overfitting. It is basically a two-class classifier but can also be used for multi-class classification. In this study, we applied the sequential minimal optimization (SMO) algorithm [51], using a polynomial kernel with a degree equal to 1;
• Decision Tree (DT) [52]: this classifies data based on their attributes, so creating a tree data structure. For our experiments, we used J48, an implementation of C4.5, one of the most popular decision trees;
• 1R classifier [53]: this is a simple classifier that is effective on the most commonly used standard datasets (OneR in WEKA). Numeric attributes are discretized and, for each attribute, a rule that predicts the class from the value of that attribute is tested. The algorithm then bases its prediction on the minimum-error, that is the most informative, attribute;
• Bayesian Classification (BC) [54]: this is based on a probabilistic approach for the classification of categorical data. We used the Naive Bayes classifier in WEKA, which assumes that the effect of an attribute on a given class is independent of the other attributes;
• Artificial Immune Recognition System (AIRS) [55]: this is a supervised learning algorithm inspired by metaphors drawn from the immune system. According to this theory, through a blind process of selection, the system is able to change itself in response to experience with the environment, to protect the organism from specific pathogenic dangers. A set of real-valued vectors is selected to classify patterns, acting as the memory cells used for the prediction, and is refined through a process of cloning and mutation of cells. For our testing, we used the AIRS2 implementation in WEKA; and
• Instance-based Learning algorithm [57]: this evaluates the k nearest neighbors of an instance to decide the class to which it belongs. An example is Locally Weighted Learning (LWL) [56], a non-parametric method. It makes its prediction by using a local function fitted to only a subset of the data, building a local model for each point of interest based on its neighboring data.

In Table 5, we report the performances estimated using the same feature selection method applied in our approach, in order to analyse the behaviour of all the machine learning classifiers considered. The results indicate that, generally, the best performance was achieved using our proposed methodology. The classification accuracy achieved, equal to 90.5%, is higher than that of the other classifiers. Although other classifiers, such as SVM and LWL, obtained good results in terms of sensitivity, the specificity of our proposed methodology (72.9%) is higher than that of these two methods (13.5% and 18.1%, respectively). In particular, considering the sensitivity and specificity together, the proposed methodology recognizes, better than the other classifiers studied, not only the state of health of pathological subjects but also that of healthy ones. This result is confirmed when the other performance measurements considered, such as the precision, F-measure and ROC area, are evaluated.

Table 5
Comparison of the results obtained with several classifiers.

Method             Accuracy (%)   Sensitivity (%)   Specificity (%)   Precision (%)   F-measure (%)   ROC Area
Our Methodology    90.5           97.7              72.9              89.7            93.5            0.957
SVM [50]           73.0           97.8              13.5              73.1            83.7            0.500
C4.5 [52]          71.4           95.5              13.5              72.6            82.5            0.427
1R [53]            68.3           93.3              18.1              70.9            80.6            0.507
BC [54]            69.8           97.8              12.7              70.7            82.1            0.528
AIRS [55]          65.1           79.8              29.7              73.2            76.3            0.548
LWL [56]           70.6           96.6              18.1              71.7            82.3            0.508

4.4. Comparison with ensemble techniques

Recently, the combination of multiple models has been proposed as a new direction for the improvement of the performance of individual
models. Several machine learning techniques realize these multiple models by learning an ensemble of models and using them in combination. Among these, the schemes called bagging, boosting and stacking are commonly used in the literature. These combined multiple models can achieve better reliability and accuracy than a single model, but they have the disadvantage of being difficult to analyse. In fact, they can be composed of several individual models, and it is not easy to understand which factors contribute to the improved decisions [58]. In detail:

• Bagging (standing for Bootstrap Aggregating) [58]: this is a voting method whereby base-learners are rendered different by training them over slightly different training sets. It is a parallel ensemble method, each model being built independently. In detail, the variance of the predicted values is decreased by generating additional training data from the original dataset, sampling with repetition to produce multisets of the same cardinality as the original data;
• Boosting [58]: this refers to a group of algorithms that utilize weighted averages to transform weak learners into stronger learners. Unlike bagging, which runs each model independently and then aggregates the outputs at the end without giving any preference to any model, boosting is a sequential ensemble method. It tries to add new models that do well where previous models are failing. Each model that runs dictates what the next model will focus on;
• Stacking [41]: this is a learning technique that combines multiple classifiers. In this case, there is no empirical formula for the weight function; instead, a meta-level is introduced. Another model takes the outputs of every model as its input and estimates how best to combine them.

Fig. 3. The accuracy of the different methods with the bagging technique.
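
As a rough illustration of the bagging scheme described above, the following Python sketch trains simple threshold rules (decision stumps) on bootstrap samples and combines them by majority vote. It is not the WEKA configuration used in the experiments: the toy one-feature dataset, the stump learner and the names fit_stump, bootstrap and bagging are assumptions introduced here for illustration only.

```python
import random
from collections import Counter


def fit_stump(samples):
    """Fit a one-feature threshold rule (a weak learner) on (x, label) pairs."""
    best = None
    for threshold in sorted({x for x, _ in samples}):
        for above in (0, 1):  # label predicted when x >= threshold
            errors = sum(
                (above if x >= threshold else 1 - above) != y for x, y in samples
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, above)
    _, threshold, above = best
    return lambda x: above if x >= threshold else 1 - above


def bootstrap(samples, rng):
    """Sample with repetition: a multiset with the same cardinality as the data."""
    return [rng.choice(samples) for _ in samples]


def bagging(samples, n_models=25, seed=0):
    """Train each base learner independently on its own bootstrap sample."""
    rng = random.Random(seed)
    models = [fit_stump(bootstrap(samples, rng)) for _ in range(n_models)]

    def predict(x):
        # Unweighted majority vote: no model is given any preference.
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict


# Hypothetical toy data: one feature value per subject, label 1 = pathological.
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
        (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]
predict = bagging(data)
print(predict(0.05), predict(0.95))
```

An odd number of base learners is used so that a two-class vote cannot end in a tie. Boosting would differ by training the models in sequence and weighting their votes, and stacking by replacing the vote with a second model trained on the base learners' outputs.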

Based on these considerations, and observing the results achieved by each ML algorithm, we investigated the ability of ensemble techniques to improve the classification accuracy. Figs. 3 and 4 show the ability of bagging and boosting, respectively, to improve the classification accuracy, comparing the accuracy of single classifiers (in blue in the figures) and ensemble classifiers (in orange in the figures) in distinguishing between healthy and pathological subjects. As Fig. 3 shows, bagging improved the accuracy of the Bayesian classifier from 69% to 71.4%. For the other algorithms, such as SVM or LWL, the performances of the single and ensemble classifiers are very similar. Moreover, when applying the boosting technique, in most cases the performances of the single classifiers are equivalent to those of the ensemble classifiers. In contrast, the accuracy of BC increased, as also occurred when applying the bagging technique. Finally, stacking allows us to improve the classification accuracy by combining multiple machine learning algorithms. Table 6 shows the combinations of models considered. The best accuracy was achieved with the combination SVM-C4.5-1R-BC-AIRS-LWL, with a value equal to 72%, while the combination of the SVM, Decision Tree and BC achieved an accuracy equal to 70.6%. Similar accuracy values were achieved for the other combinations. It is important to remark here that other classification techniques are not reported in this study due to their poor performance during our experiments.

Fig. 4. The accuracy of the different methods with the boosting technique.

Table 6
The performance after applying the stacking technique.

Stacking combination       Accuracy (%)
SVM-C4.5-1R-BC-AIRS-LWL    72.0
SVM-C4.5-1R-BC-AIRS        69.8
SVM-C4.5-BC-AIRS           70.6
SVM-C4.5-BC                69.0
SVM-BC                     67.7

Table 7
Comparison of the results obtained with ensemble classifiers.

Method                                    Accuracy (%)   Sensitivity (%)   Specificity (%)   Precision (%)   F-measure (%)   ROC Area
Our Methodology                           90.5           97.7              72.9              89.7            93.5            0.957
LWL (Bagging) [58]                        71.4           96.6              10.8              72.3            82.7            0.578
BC (Boosting) [58]                        70.4           89.9              22.2              74.1            81.2            0.553
SVM-C4.5-1R-BC-AIRS-LWL (Stacking) [41]   74.2           96.5              21.7              74.3            84.0            0.566

Table 7 reports a comparison between the performances achieved by our approach and the best ones obtained for each ensemble technique. LWL achieved the best performance both for bagging and
boosting, while the combination SVM-C4.5-1R-BC-AIRS-LWL is the best when applying the stacking technique. This comparison indicates that, generally, the best performances were achieved by using our proposed methodology. The classification accuracy is, in fact, higher than that of the ensemble classifiers. Although both the bagging and stacking combinations achieved a high sensitivity (about 96%), our methodology achieved the best results for specificity, precision, F-measure and ROC area.

5. Discussion

In this study, we have proposed a predictive tool that improves and supports carotid disease detection through HRV analysis. HRV features, appropriately selected by using CFS to identify the most relevant data, constituted the inputs to the defined neural network, used to classify the samples as healthy or pathological.

The performance of the proposed classifier was tested on an available dataset. A ten-fold cross validation was performed to train and test the proposed methodology. The classification performance was measured using accuracy, sensitivity, specificity, precision, F-measure and ROC area. The results obtained showed a classification accuracy equal to about 90.5%. This good result was confirmed by the value of the sensitivity (97.7%), which means that our methodology recognized the presence of a carotid disease when the subject did indeed suffer from this pathology.

These results were compared with the performances achieved with the main ML algorithms, namely the SVM, Decision Tree, 1R classifier, the Bayesian algorithm, Locally Weighted Learning and AIRS. For each technique, a preliminary study was conducted to identify the characteristic parameters able to optimize its performance. The comparison with these techniques showed that the best accuracy was achieved by our proposed classifier. Additionally, it achieved the best balance between sensitivity and specificity. Although other classifiers, such as SVM and LWL, achieved a high sensitivity, their specificity was quite low. This means that the ability of these algorithms to correctly classify a sample as “diseased” is better than their ability to correctly classify healthy subjects, risking the detection of a carotid disease even when the subject is healthy.

However, even if the proposed methodology achieved a specificity higher than the other classifiers (more than 40 percentage points higher than the other techniques in Table 5), this is lower than its sensitivity. In our future plans, we want to address this limitation of our methodology, optimizing the specifics of our classifier through a wider study involving a greater number of people, possibly using samples selected from those freely available, and comparing the classification accuracy with other deep learning and non-linear methods.

Recent studies in automated carotid disease detection, summarized in Table 1 and discussed in Section 2, yield highly accurate classification results. It is important to note that the main studies existing in the literature have evaluated different signals to detect the presence of a carotid disease. They have, in fact, evaluated Doppler signals, a possible source of diagnosis errors, while we have estimated HRV parameters from suitable ECG signals to detect the possible presence of a carotid disease. These parameters are easy to estimate and constitute accurate indicators of several health-related issues. To the best of our knowledge, HRV parameters have mainly been evaluated in order to characterize coronary artery diseases. Additionally, these main studies use private databases, therefore not allowing the reproducibility of the analyses or any accurate comparison.

6. Conclusions

In the last few years the incidence of carotid diseases has been increasing dramatically. The consequences of these disorders are often serious and sometimes fatal. Atherosclerosis is, indeed, one of the main causes of morbidity and mortality in the world. This is a multifactorial vascular disorder characterized by a decrease in the blood flow and subsequent damage to the organs involved. Early detection and accurate diagnosis are essential.

In this paper we present an automated system for carotid disease detection. Our approach consists of a Correlation-based Feature Selector (CFS) for the data reduction and an Artificial Neural Network (ANN) to classify a subject as healthy or pathological. Our testing suggests that it is able to detect a carotid disease with a good accuracy (90.5%), and to recognize correctly the presence of a pathology when the subject does indeed suffer from one of these diseases, the sensitivity being about 97.7%, the specificity 72.9% and the precision 89.7%. The performances achieved with our methodology have been compared with the results obtained using several machine learning techniques. In detail, we have considered the Support Vector Machine, Decision Tree, Bayesian Classification, Artificial Immune Recognition System, Instance-based Learning algorithm and 1R classifier. Our approach has proved to be more effective than any of the other methods analysed in terms of classification accuracy. The proposed methodology may constitute a valuable support to evaluate the state of health of the carotid artery, thereby assisting clinicians in the correct diagnosis of these disorders, since it is rapid, non-invasive, easy to perform and relatively economical.

In our future plans, we aim to embed our methodology into a valid m-health solution, a non-intrusive system capable of acquiring data from appropriate wearable sensors in real-time and of extracting the characteristic HRV parameters necessary to detect possible alterations in the patient's state of health and to provide accurate real-time support to the medical diagnosis. A Decision Support System (DSS) can be integrated to analyse the patient's data and smartly detect potentially abnormal conditions with respect to a set of clinical guidelines specifically formulated and applied for each monitored subject. The integration in an m-health system allows an optimization of our proposed methodology, personalizing the classification and monitoring based on the distinctive elements of the service user, and considering the relationship between the carotid disease and the physiological data characteristic of each subject, such as gender, age, weight and height, and the main risk factors, such as high cholesterol, high blood pressure, smoking, lack of exercise and diabetes.

Acknowledgements

The authors would like to acknowledge Prof. Luigi Romano, of the
233
L. Verde and G. De Pietro Computers in Biology and Medicine 109 (2019) 226–234

Department of Technology, University of Naples Parthenope for his technical contribution.

References

[1] D. Högberg, D. Dellagrammaticas, B. Kragsterman, M. Björck, A. Wanhainen, Simplified ultrasound protocol for the exclusion of clinically significant carotid artery stenosis, Ups. J. Med. Sci. 121 (3) (2016) 165–169.
[2] E.G. Grant, C.B. Benson, G.L. Moneta, A.V. Alexandrov, J.D. Baker, E.I. Bluth, B.A. Carroll, M. Eliasziw, J. Gocke, B.S. Hertzberg, S. Katanick, L. Needleman, L. Pellerito, J. Polak, K. Rholl, D. Wooster, E. Zierler, Carotid artery stenosis: gray-scale and Doppler ultrasonography diagnosis-society of radiologists in ultrasound consensus conference, Radiology 229 (2) (2003) 340–346.
[3] E.Y. Lui, A.H. Steinman, R.S. Cobbold, K.W. Johnston, Human factors as a source of error in peak Doppler velocity measurement, J. Vasc. Surg. 42 (5) (2005) 972–e1.
[4] D.-Y. Kwon, H.E. Lim, M.H. Park, K. Oh, S.-W. Yu, K.-W. Park, W.-K. Seo, Carotid atherosclerosis and heart rate variability in ischemic stroke, Clin. Auton. Res. 18 (6) (2008) 355–357.
[5] C.L. Kaufman, D.R. Kaiser, J. Steinberger, D.R. Dengel, Relationships between heart rate variability, vascular function, and adiposity in children, Clin. Auton. Res. 17 (3) (2007) 165–171.
[6] F. Shaffer, J. Ginsberg, An overview of heart rate variability metrics and norms, Frontiers in public health 5 (2017) 1–17.
[7] A.N. Londhe, M. Atulkar, Heart rate variability analysis: application overview, 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT), IEEE, 2018, pp. 1518–1523.
[8] G. Sannino, G. De Pietro, A deep learning approach for ecg-based heartbeat classification for arrhythmia detection, Future Gener. Comput. Syst. 86 (2018) 446–455.
[9] G. Sannino, P. Melillo, S. Stranges, G. De Pietro, L. Pecchia, Blood pressure drop prediction by using hrv measurements in orthostatic hypotension, J. Med. Syst. 39 (11) (2015) 143.
[10] A.F. Hussein, M. Burbano-Fernandez, G. Ramírez-González, E. Abdulhay, V.H.C. De Albuquerque, et al., An automated remote cloud-based heart rate variability monitoring system, IEEE Access 6 (2018) 77055–77064.
[11] S. Naddeo, L. Verde, M. Forastiere, G. De Pietro, G. Sannino, A real-time m-health monitoring system: an integrated solution combining the use of several wearable sensors and mobile devices, HEALTHINF, 2017, pp. 545–552.
[12] G. Litjens, T. Kooi, B.E. Bejnordi, A.A.A. Setio, F. Ciompi, M. Ghafoorian, J.A. van der Laak, B. Van Ginneken, C.I. Sánchez, A survey on deep learning in medical image analysis, Med. Image Anal. 42 (2017) 60–88.
[13] D.S. Kermany, M. Goldbaum, W. Cai, C.C. Valentim, H. Liang, S.L. Baxter, A. McKeown, G. Yang, X. Wu, F. Yan, et al., Identifying medical diagnoses and treatable diseases by image-based deep learning, Cell 172 (5) (2018) 1122–1131.
[14] L. Verde, G. De Pietro, G. Sannino, Voice disorder identification by using machine learning techniques, IEEE Access 6 (2018) 16246–16255.
[15] U.R. Acharya, S.L. Oh, Y. Hagiwara, J.H. Tan, M. Adam, A. Gertych, R. San Tan, A deep convolutional neural network model to classify heartbeats, Comput. Biol. Med. 89 (2017) 389–396.
[16] L. Verde, G. De Pietro, A Machine Learning Approach for Carotid Diseases Using Heart Rate Variability Features, (2018), pp. 658–664.
[17] H.G. Lee, K.Y. Noh, K.H. Ryu, A data mining approach for coronary heart disease prediction using hrv features and carotid arterial wall thickness, BioMedical Engineering and Informatics, 2008. BMEI 2008. International Conference on vol. 1, IEEE, 2008, pp. 200–206.
[18] H. Kim, M.I.M. Ishag, M. Piao, T. Kwon, K.H. Ryu, A data mining approach for cardiovascular disease diagnosis using heart rate variability and images of carotid arteries, Symmetry 8 (6) (2016) 47.
[19] W.-S. Kim, S.-H. Jin, Y. Park, H.-M. Choi, A study on development of multi-parametric measure of heart rate variability diagnosing cardiovascular disease, World Congress on Medical Physics and Biomedical Engineering 2006, Springer, 2007, pp. 3480–3483.
[20] A.D. Dolatabadi, S.E.Z. Khadem, B.M. Asl, Automated diagnosis of coronary artery disease (cad) patients using optimized svm, Comput. Methods Progr. Biomed. 138 (2017) 117–126.
[21] T.Y. Wah, R. Gopal Raj, U. Iqbal, et al., Automated Diagnosis of Coronary Artery Disease: a Review and Workflow, Cardiology Research and Practice, (2018).
[22] F. Latifoglu, S. Şahan, S. Kara, S. Güneş, Diagnosis of atherosclerosis from carotid artery Doppler signals as a real-world medical application of artificial immune systems, Expert Syst. Appl. 33 (3) (2007) 786–793.
[23] S. Özşen, S. Kara, F. Latifoğlu, S. Güneş, A new supervised classification algorithm in artificial immune systems with its application to carotid artery Doppler signals to diagnose atherosclerosis, Comput. Methods Progr. Biomed. 88 (3) (2007) 246–255.
[24] H. Uğuz, Detection of carotid artery disease by using learning vector quantization neural network, J. Med. Syst. 36 (2) (2012) 533–540.
[25] F. Dirgenali, S. Kara, Recognition of early phase of atherosclerosis using principles component analysis and artificial neural networks from carotid artery Doppler signals, Expert Syst. Appl. 31 (3) (2006) 643–651.
[26] M. Ceylan, R. Ceylan, F. Dirgenali, S. Kara, Y. Özbay, Classification of carotid artery Doppler signals in the early phase of atherosclerosis using complex-valued artificial neural network, Comput. Biol. Med. 37 (1) (2007) 28–36.
[27] D. Samiappan, V. Chakrapani, Classification of carotid artery abnormalities in ultrasound images using an artificial neural classifier, Int. Arab J. Inf. Technol. 13 (6A) (2016) 756–762.
[28] H. Xiao, A. Avolio, D. Huang, A novel method of artery stenosis diagnosis using transfer function and support vector machine based on transmission line model: a numerical simulation and validation study, Comput. Methods Progr. Biomed. 129 (2016) 71–81.
[29] Zephyr Bioharness 3.0 User Manual, (2012) [Online] https://www.zephyranywhere.com/media/download/bioharness3-user-manual.pdf, Accessed date: 18 May 2018.
[30] M. A. Hall, Correlation-based Feature Subset Selection for Machine Learning.
[31] R.S.S. Kumari, S. Bharathi, V. Sadasivam, Design of optimal discrete wavelet for ecg signal using orthogonal filter bank, Iccima, IEEE, 2007, pp. 525–529.
[32] B. Nassiri, R. Latif, A. Toumanari, S. Elouaham, F. Maoulainine, Ecg signal denoising and compression using discrete wavelet transform and empirical mode decomposition techniques, Int. J. Num. Anal. Methods Eng. 1 (5) (2013) 245–252.
[33] S. Cuomo, G. De Pietro, R. Farina, A. Galletti, G. Sannino, A revised scheme for real time ecg signal denoising based on recursive filtering, Biomed. Signal Process. Control 27 (2016) 134–144.
[34] J. Pan, W.J. Tompkins, A real-time qrs detection algorithm, IEEE Trans. Biomed. Eng. (3) (1985) 230–236.
[35] M.P. Tarvainen, J.-P. Niskanen, J.A. Lipponen, P.O. Ranta-Aho, P.A. Karjalainen, Kubios hrv–heart rate variability analysis software, Comput. Methods Progr. Biomed. 113 (1) (2014) 210–220.
[36] The MathWorks Inc., Matlab and Statistics Toolbox Release, Natick, Massachusetts, United States, 2017.
[37] D. Baldassarre, M. Amato, A. Bondioli, C.R. Sirtori, E. Tremoli, Carotid artery intima-media thickness measured by ultrasonography in normal clinical practice correlates well with atherosclerosis risk factors, Stroke 31 (10) (2000) 2426–2430.
[38] S. Pam, K. Dakok, U. Sirisena, E. Gadong, N. Chagok, B-mode Doppler ultrasound blood flow velocity measurements for the determination of the intima media thickness in the human carotid artery, J. Biol. Agri. Health. 5 (14) (2015) 108–115.
[39] G. Chandrashekar, F. Sahin, A survey on feature selection methods, Comput. Electr. Eng. 40 (1) (2014) 16–28.
[40] I. Guyon, A. Elisseeff, An introduction to variable and feature selection, J. Mach. Learn. Res. 3 (Mar) (2003) 1157–1182.
[41] I.H. Witten, E. Frank, M.A. Hall, C.J. Pal, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, 2016.
[42] T.M. Mitchell, et al., Machine Learning, wcb, 1997.
[43] H.S. Hippert, C.E. Pedreira, R.C. Souza, Neural networks for short-term load forecasting: a review and evaluation, IEEE Trans. Power Syst. 16 (1) (2001) 44–55.
[44] K. Levenberg, A method for the solution of certain non-linear problems in least squares, Q. Appl. Math. 2 (2) (1944) 164–168.
[45] P. Melillo, R. Izzo, A. Orrico, P. Scala, M. Attanasio, M. Mirra, N. De Luca, L. Pecchia, Automatic prediction of cardiovascular and cerebrovascular events using heart rate variability analysis, PLoS One 10 (3) (2015) e0118504.
[46] A.L. Goldberger, L.A. Amaral, L. Glass, J.M. Hausdorff, P.C. Ivanov, R.G. Mark, J.E. Mietus, G.B. Moody, C.-K. Peng, H.E. Stanley, Physiobank, physiotoolkit, and physionet: components of a new research resource for complex physiologic signals, Circulation 101 (23) (2000) e215–e220.
[47] T. Nezu, N. Hosomi, S. Aoki, M. Matsumoto, Carotid intima-media thickness for atherosclerosis, J. Atheroscler. Thromb. 23 (1) (2016) 18–31.
[48] E. Alpaydin, Introduction to Machine Learning, MIT press, 2014.
[49] S.F. Rosario, K. Thangadurai, Relief: feature selection approach, Int. J. Innovat. Res. Develop. 4 (11) (2015) 218–224.
[50] B. Schölkopf, C.J. Burges, A.J. Smola, Advances in Kernel Methods: Support Vector Learning, MIT press, 1999.
[51] J. Platt, Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines, (1998).
[52] S.L. Salzberg, C4.5: programs for machine learning by J. Ross Quinlan. Morgan Kaufmann Publishers, Inc., Mach. Learn. 16 (3) (1994) 235–240.
[53] R.C. Holte, Very simple classification rules perform well on most commonly used datasets, Mach. Learn. 11 (1) (1993) 63–90.
[54] G.H. John, P. Langley, Estimating continuous distributions in bayesian classifiers, Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann Publishers Inc., 1995, pp. 338–345.
[55] A. Watkins, J. Timmis, L. Boggess, Artificial immune recognition system (airs): an immune-inspired supervised learning algorithm, Genet. Program. Evolvable Mach. 5 (3) (2004) 291–317.
[56] C.G. Atkeson, A.W. Moore, S. Schaal, Locally weighted learning for control, Lazy Learning, Springer, 1997, pp. 75–113.
[57] D.W. Aha, D. Kibler, M.K. Albert, Instance-based learning algorithms, Mach. Learn. 6 (1) (1991) 37–66.
[58] T.G. Dietterich, Ensemble methods in machine learning, International Workshop on Multiple Classifier Systems, Springer, 2000, pp. 1–15.
