
THE OPEN SOURCE MATLAB TOOLBOX Gait-CAD AND ITS APPLICATION TO BIOELECTRIC SIGNAL PROCESSING


R. Mikut, O. Burmeister, S. Braun, M. Reischl
Institute for Applied Computer Science, Forschungszentrum Karlsruhe GmbH, Germany
E-Mail ralf.mikut@iai.fzk.de

Abstract In this paper, the open source Matlab toolbox analysis of time series and features, especially for classifi-
Gait-CAD is presented. This toolbox is designed for the cation, but also for regression problems. Our intention is
visualization and analysis of time series and single features the design of an open platform as a framework for the
with a special focus to classification problems. The aim is development and improvement of data mining methods.
to provide an open platform for the development and im-
provement of data mining methods and the application to Methods
various medical and technical problems.
The toolbox Gait-CAD bases on Matlab (tested for the
Keywords Data Mining, Tools, Neuroprostheses
versions 5.3 and 2007b). The decision to a Matlab-based
solution was made to use the wide mathematical function-
Introduction ality of this package provided by The Mathworks Inc. A
In many applications, large data sets of time series and main disadvantage is the need for a MATLAB license.
single features are recorded. An at least semi-automatic The toolbox is operated by a graphical user interface
search for unknown or partially known relations requires (GUI) with menu items and control elements like popup
the use of data mining methods [1]. In the last years, a huge lists, checkboxes, and edit elements (Figure 1). This en-
number of potentially useful methods and software tools ables inexperienced users to work with the toolbox. How-
have been proposed including methods for feature extrac- ever, the implemented algorithms work independently
tion, classification, and regression. from the GUI. Thus, the Matlab-typical way of program-
Many existing software tools are very powerful, but they ming using a command prompt and variables is possible.
cover only a very limited subset of implemented methods. Furthermore, an automation and batch standardization of
However, the coupling between different necessary proc- analyzes is possible by designing individual macros. More
essing steps (as e.g. feature extraction from time series and details for the handling are explained in a comprehensible
classification) is rather weak. This leads often to the reim- PDF handbook.
plementation of existing methods or a stepwise transfer of
partial results between different tools.
Some tools are focused on a script-based processing result-
ing in problems for a transfer to other applications due to a
time-consuming manual adaptation of implemented algo-
rithms. A generally accepted tool platform does not exist at
the moment.
These facts make a fast comparison of new developed
methods against a broader set of existing methods very
time consuming. As a consequence, the new methods will
only be compared with a small number of concurrent ap-
proaches - a broad comparison is not feasible.
In our opinion, an ideal data mining tool
• has to contain various data mining methods from
feature extraction to classification and regression us-
ing statistical approaches up to newer approaches
from computational intelligence,
• has to be free and open source to guarantee a wide Figure 1: Gait-CAD screenshot
acceptance in the scientific community and the fast in-
tegration of new methods, Gait-CAD is an open source software. The German version
• needs to be modular with well documented interfaces is available since November 2006, the English one since
to integrate various methods useful for highly special- January 2008 It is licensed under the conditions of the
ized application domains, and GNU General Public License (GNU-GPL) of The Free
• has to support a GUI based exploration of the data set Software Foundation. The download is possible using the
as well as a highly automated script based processing downloading section at
of routine operations.
This paper presents the Matlab toolbox Gait-CAD as a first http://www.iai.fzk.de/projekte/biosignal/index.html.
step in this direction. It is focused on the visualization and
To use the toolbox for the design of a data mining algorithm, a training data set is required. This data set is normally given by a binary Matlab project file containing matrices and vectors with predefined structures and names. Additionally, the user can add textual identifiers and further information to the matrices and structures. Missing information is replaced by default values and identifiers. Data can also be imported from text files (single files or complete directories, single features or time series).

The training data set is organized with n = 1, ..., N data points, each containing
• s_z time series (described by a matrix with the dimension s_z × K, where K is the number of sample points),
• s single features (vector with the dimension s), and
• s_y discrete output variables (vector with the dimension s_y).

The management of multiple output variables (e.g. diagnoses with respect to diseases in medical applications, decisions for therapies, qualitative evaluations of therapy success, gender, age groups etc.) for each data point allows a flexible selection of multiple classification problems. Additionally, input and output variables may be switched depending on the problem.

Gait-CAD implements the standardized data mining process proposed in [2]. The main components are shown in Figure 2. Gait-CAD permits a convenient handling of numerous algorithms for the
• selection of data points (e.g. detection of outliers, discarding of incomplete data points and features, selection of parts of data sets),
• feature extraction (e.g. spectrograms, FFT analysis, correlation analysis, linear filtering, calculation of extrema, mean values, fuzzification etc.),
• evaluation and selection of features and time series (e.g. multivariate analysis of variance, t-test, information measures, regression analysis),
• feature aggregation (e.g. discriminant analysis, principal component analysis (PCA), independent component analysis (ICA)),
• supervised and unsupervised classification (e.g. decision trees, cluster algorithms, Bayes classifier, artificial neural networks (ANN), nearest neighbour algorithms, support vector machines (SVM), fuzzy systems), and
• validation strategies (e.g. cross-validation, bootstrap).

[Figure 2, diagram labels: Database; Problem formulation (verbalized); Collecting training data set; Problem formulation (formalized); Evaluation measures; Validation strategies; Visualization; Data point selection; Feature extraction; Feature selection; Feature aggregation; Classification/Regression; Design of a data mining method (Gait-CAD)]
Figure 2: Design process of a data mining algorithm [2]

Additionally, there are various options to visualize results, to automatically log results and processing steps in text and LaTeX files, to rename variables etc.

For some functions, Gait-CAD uses additional commercial Matlab toolboxes (e.g. the Signal, Statistics, Neural Network, and Wavelet toolboxes from The MathWorks, Inc.) or freely available GNU GPL toolboxes. However, most of the self-implemented functions require only a standard Matlab installation.

The feature extraction is realized with plugins. Plugins are single Matlab functions called plugin_*.m, which are located in a special directory or in the working directory. They generate
• new time series from one (e.g. by low-pass or high-pass filtering, segmentation) or more (e.g. minimum, mean or maximum value) existing time series, or
• new single features from one time series in a pre-defined segment (e.g. mean value for the complete time series or for the first 50% of the sample points). The segment can be defined by a special file or interactively by selecting a region of interest.

Gait-CAD contains a large number of pre-defined plugins and segments. The structure allows user-defined extensions with special feature types for each specific application field.

Macros are recorded sequences of clicked menu items and control elements. The main advantages are the automation of long sequences of operations (e.g. for use in different projects) and the possibility of integrating user-defined functions. Macros can be modified manually because they are stored in textual Matlab syntax.

Application-specific extension packages can be easily integrated into the graphical user interface. Gait-CAD contains templates for new menu items and control elements as a starting point for manual modification. It allows the integration of user-defined functions using any parameter from the control elements or any available variable. An example is a special package for electroneurography provided by the University of Freiburg. It contains the algorithms described in [3].

Results

In many clinical applications, the available data set contains time series of recorded bioelectric signals such as muscle, nerve, or brain signals.

The automatic design of data mining solutions offers an objective and reliable method for the generation of hypotheses for clinical trials, the data-based design of clinical decision support systems for diagnosis and therapy planning, and the adaptation of medical devices to individual patients.

An example of the latter task is the detection of user intentions from brain, nerve or muscle signals or the information processing of nerve signals from natural limbs for neuroprostheses (Figure 3).
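The data layout and the plugin concept described in the Methods section can be illustrated with a short sketch. Gait-CAD itself is implemented in Matlab, and its actual plugin interface is not reproduced here. The Python code below is only an illustration under assumed names: DataPoint, moving_average, segment_mean and apply_plugins are all hypothetical and are not part of the toolbox. It shows one data point holding s_z time series, s single features and s_y output variables, a function that derives a new time series by a simple low-pass filter, and a function that derives a new single feature as the mean value over the first 50% of the sample points.

```python
# Hypothetical illustration; none of these names come from Gait-CAD itself.
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class DataPoint:
    """One of the N data points of a training data set."""
    time_series: List[List[float]]  # s_z time series, each with K sample points
    single_features: List[float]    # s single features
    outputs: List[int]              # s_y discrete output variables (class labels)


def moving_average(series: List[float], width: int = 3) -> List[float]:
    """Plugin-style function: derive a new time series (simple low-pass filter).

    A centred window is used and truncated at both ends, so the result
    keeps the original length K.
    """
    half = width // 2
    smoothed = []
    for k in range(len(series)):
        window = series[max(0, k - half): k + half + 1]
        smoothed.append(mean(window))
    return smoothed


def segment_mean(series: List[float], start: float = 0.0, stop: float = 0.5) -> float:
    """Plugin-style function: derive a new single feature from a segment.

    The segment is given as fractions of the K sample points; the default
    covers the first 50% of the time series.
    """
    first = int(start * len(series))
    last = max(first + 1, int(stop * len(series)))
    return mean(series[first:last])


def apply_plugins(point: DataPoint) -> DataPoint:
    """Add one filtered time series and one segment feature per original series."""
    originals = list(point.time_series)
    new_series = [moving_average(ts) for ts in originals]
    new_features = [segment_mean(ts) for ts in originals]
    return DataPoint(
        time_series=originals + new_series,
        single_features=point.single_features + new_features,
        outputs=point.outputs,
    )


# Toy data point: s_z = 1 time series with K = 8 samples, s = 1, s_y = 2.
point = DataPoint(
    time_series=[[0.0, 2.0, 4.0, 6.0, 8.0, 8.0, 8.0, 8.0]],
    single_features=[1.5],
    outputs=[1, 0],
)
extended = apply_plugins(point)
print(len(extended.time_series), extended.single_features)  # prints: 2 [1.5, 3.0]
```

In this sketch the derived time series and features are appended to the existing ones rather than replacing them, so that later steps such as feature selection or classification could choose among original and derived quantities. This design choice is an assumption made for the illustration and not a statement about how Gait-CAD stores plugin results.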
[Figure 3, diagram labels: Intentions; Central Nervous System; Neural Interface; Software; Data analysis; Control; Pattern generator; Sensor Interface; Stimulator; Artificial Prostheses; Natural Limbs; Sensor; Feedback]
Figure 3: Interface for the design of neuroprostheses [4]

Table 1: Examples for recent applications of Gait-CAD to bioelectric signals (EMG: electromyography, ENG: electroneurography, EEG: electroencephalography, ECoG: electrocorticography)

Applications                                                                   Signals
Hand prosthesis control [5]                                                    EMG
Detection of mechanical stimuli from nerve signals with cuff electrodes [6]    ENG
Detection of artefacts from Functional Electrical Stimulation (FES) [7]        EMG, ENG
Analysis of Central Pattern Generators [8]                                     EMG
Design algorithms for Brain Computer Interfaces [5, 9]                         EEG, ECoG
Gait analysis [10]                                                             EMG

Data analysis plays a key role in this concept for the data-based detection of human intentions from bioelectric signals and for the use of biosensors. Gait-CAD has supported these steps for a number of different scenarios:

For the first task, Brain Computer Interfaces are often controlled by imagined movements. The brain signals can be recorded by surface (EEG) or invasive (ECoG) electrode arrays, resulting in a set of time series. The data mining task consists of the extraction of new time series (e.g. by bandpass filters) and a classification to differentiate the movement intentions. In addition, an analysis of the spatial and temporal information content is useful to understand the processes [9]. Hand prostheses are usually controlled by muscle signals originating from two electrodes. Here, classification problems exist for the switching between different grasp types [5].

For future neuroprostheses, a scenario including functional electrical stimulation and a recording of afferent nerve signals induced by mechanical stimuli is intended. The nerve signals are recorded by cuff electrodes. Here, very high sampling frequencies (50 kHz) are necessary to extract useful information. The problem is the detection and localization of mechanical stimuli by a classification task [6].

Besides these applications, Gait-CAD is now used in many medical, biological, and technical application scenarios. From a data mining point of view, these very different applications can be unified, and the resulting synergies can be exploited with the presented platform.

Discussion

The aim of Gait-CAD is to provide an interface to apply and compare data mining methods. Its architecture allows the toolbox to be extended with further algorithms. Everyone is invited to support the further development of Gait-CAD.

Acknowledgements

Thanks to all the busy programmers, developers of algorithms, and testers, especially to Tobias Loose and Sebastian Gollmer. The support by the Deutsche Forschungsgemeinschaft (German Research Foundation) within the project "Diagnosis support in gait analysis" and the Collaborative Research Center "Humanoid Robots" was a great help in building the basis for the further development of the toolbox.

References

[1] Fayyad, U.; Piatetsky-Shapiro, G.; Smyth, P.: From Data Mining to Knowledge Discovery in Databases. AI Magazine, vol. 17, pp. 37–54, 1996.
[2] Mikut, R.; Reischl, M.; Burmeister, O.; Loose, T.: Data Mining in Medical Time Series. Biomedizinische Technik, vol. 51, pp. 288–293, 2006.
[3] Krüger, T. B.; Levchuk, O.; Stieglitz, T.: Decoding of Neural Signals with MATLAB - Onset Detection and Classification as a Guided Tool. Biomedizinische Technik, vol. 52, Ergänzungsband, 2007.
[4] Mikut, R.; Krüger, T.; Reischl, M.; Burmeister, O.; Rupp, R.; Stieglitz, T.: Regelungs- und Steuerungskonzepte für Neuroprothesen am Beispiel der oberen Extremitäten. at - Automatisierungstechnik, vol. 54, pp. 523–536, 2006.
[5] Reischl, M.: Ein Verfahren zum automatischen Entwurf von Mensch-Maschine-Schnittstellen am Beispiel myoelektrischer Handprothesen. Dissertation, Universität Karlsruhe, Universitätsverlag Karlsruhe, 2006.
[6] Krüger, T.; Reischl, M.; Lago, N.; Burmeister, O.; Mikut, R.; Ruff, R.; Hoffmann, K.-P.; Navarro, X.; Stieglitz, T.: Analysis of Microelectrode-Signals in the Peripheral Nervous System, In-Vivo and Post-Processing. In: Proc., Mikrosystemtechnik Kongress Deutschland, pp. 69–72. Freiburg: VDE-Verlag, 2005.
[7] Rohm, M.: Evaluierung und Inbetriebnahme von Sensorkonzepten für die Steuerung von funktionellen Orthesen der oberen Extremität. Diplomarbeit, Universität Darmstadt, Forschungszentrum Karlsruhe, 2008.
[8] Chen, Y.: A Concept for the Application of Neural Oscillators and Spinal Reflexes to Humanoid Robots and Neuroprostheses. Diplomarbeit, Universität Karlsruhe (TH), in preparation, 2008.
[9] Burmeister, O.; Reischl, M.; Mikut, R.: Application of Time-Variant Classifiers to Invasively Recorded Signals from Brain and Peripheral Nerve. Biomedizinische Technik, vol. 52, Ergänzungsband, 2007.
[10] Wolf, S.; Loose, T.; Schablowski, M.; Döderlein, L.; Rupp, R.; Gerner, H. J.; Bretthauer, G.; Mikut, R.: Automated feature assessment in instrumented gait analysis. Gait & Posture, vol. 23, no. 3, pp. 331–338, 2006.
