
REVIEW ARTICLE

Advances in laryngeal imaging


Antanas Verikas · Virgilijus Uloza ·
Marija Bacauskiene · Adas Gelzinis ·
Edgaras Kelertas
Received: 5 December 2008 / Accepted: 7 July 2009 / Published online: 19 July 2009
© Springer-Verlag 2009
Abstract Imaging and image analysis have become an
important issue in laryngeal diagnostics. Various techniques,
such as videostroboscopy, videokymography, digital
kymography, and ultrasonography, are available and are
used in research and clinical practice. This paper reviews
recent advances in imaging for laryngeal diagnostics.
Keywords Larynx · Image analysis · High-speed video ·
Digital kymography · Videostroboscopy ·
Glottal area detection
Introduction
Attempts to examine the human larynx date back more than
150 years [1]. In 1854 Manuel Garcia, a vocal music tea-
cher, examined his own larynx using a dental mirror and a
hand mirror, and published his observations in 1855 [1].
Only two years later, Turck, a neurologist from Vienna,
and Czermak, a physiologist from Prague, re-invented the
same technique and demonstrated its use in medicine [1].
Imaging technologies applicable in laryngology have developed
enormously since then.
Nowadays, the diagnostic procedure for laryngeal diseases
in clinical practice is rather complex and is based on
the evaluation of patients' complaints, history, and data from
instrumental as well as histological examinations. During
the last two decades a variety of imaging techniques for the
examination of the larynx and obtaining objective measurements
of voice quality have been developed [2, 3].
Evaluation of the larynx has improved significantly with the
establishment of computed tomography (CT) and magnetic
resonance imaging (MRI), as these technologies provide
insights into the endoscopically blind areas and reveal the
depth of tumor infiltration [4]. These technologies may be
beneficial in staging laryngeal carcinoma and planning the
most appropriate surgical procedure [5–8]. Ultrasonography
is useful in cases of larger laryngeal lesions and may
have some role in screening for unilateral vocal fold
pathologies. At the same time, further fine-tuning of the
technique may be necessary [9, 10].
Laryngeal still images, image sequences, voice signals,
and patients' questionnaire data can be considered the
main information sources for characterizing the human larynx.
Nowadays, automated analysis of voice is increasingly
used for detecting and screening laryngeal pathologies
[11–18]. It was demonstrated that even telephone-based
voice records may be used for screening of laryngeal
disorders [13]. It was also demonstrated that even questionnaire
data can be an important source of information
in screening for laryngeal disorders [19]. There have been very
few attempts to create systems for automated analysis of
still laryngeal images. More work has been done in the
field of analysis of image sequences obtained from
high-speed cameras, laryngeal videostroboscopy, or
videokymography.
A. Verikas (✉)
Intelligent Systems Laboratory, Halmstad University,
Box 823, 301 18 Halmstad, Sweden
e-mail: antanas.verikas@hh.se
A. Verikas · M. Bacauskiene · A. Gelzinis · E. Kelertas
Department of Applied Electronics,
Kaunas University of Technology,
Studentu 50, 51368 Kaunas, Lithuania
e-mail: marija.bacauskiene@ktu.lt
A. Gelzinis
e-mail: adas.gelzinis@ktu.lt
V. Uloza
Department of Otolaryngology,
Kaunas University of Medicine, 50009 Kaunas, Lithuania
e-mail: virgilijus.uloza@kmuk.lt
Eur Arch Otorhinolaryngol (2009) 266:1509–1520
DOI 10.1007/s00405-009-1050-4
Different tasks can be pursued with laryngeal imaging.
Obtaining images of high quality and resolution, revealing
the detailed morphology of the glottal structures, is one of
the main tasks. Such images are essential in making correct
diagnosis and choosing the treatment. Quantitative char-
acterization of the vibratory behavior of the vocal folds is
another important task. Characterization in the form of a
rather small set of physically interpretable parameters is
desired. Several indices describing the glottal waveform
are usually used for the characterization. Figure 1 presents
three laryngeal images recorded from the same patient with
a cyst in the left vocal fold. The two video-laryngo-stroboscopic
images on the left-hand side of Fig. 1 provide
mainly functional information about the vibratory
function of the vocal folds and the glottal closure. The
image shown on the right-hand side of Fig. 1 was obtained
via direct microlaryngoscopy and provides more detailed
information about structural changes in the vocal fold and
morphological peculiarities of the cyst, which is fundamental
in making the correct diagnosis. This image was
used to quantitatively characterize the distribution of color,
the geometry of the edges of the vocal folds, and the
texture.
Specific tasks are often pursued by applying ultrasonic
imaging. Autofluorescence laryngoscopy often focuses on
revealing early malignant changes, while CT and MRI help
to reveal the depth of tumor infiltration. It is noteworthy
that other methods, such as inverse filtering and
electroglottography (EGG), are also often used to examine the
vocal fold vibrations and to calculate glottal waveform
parameters [20, 21]. The aim of this work was to review the
state-of-the-art imaging and image analysis techniques
used in laryngoscopy. 3D imaging technologies, such as
CT or MRI, were not considered in this review.
Analysis of still images
An image analysis-based technique for automated catego-
rization of manually marked suspect lesions into the
healthy and diseased classes has been proposed [22]. The
categorization is based on textural features extracted from
co-occurrence matrices [23, 24] computed from manually
marked areas of vocal fold images, taken by a CCD camera.
A classification accuracy of 81.4% was reported
when testing the technique on a very small set of 35 images.
A set of 785 color laryngeal images obtained via direct
microlaryngoscopy has been used in studies presented in
the literature [25, 26]. A classification accuracy of over 94%
was achieved when categorizing the images into one
healthy and two pathological (nodular and diffuse) classes.
Figure 2 presents characteristic examples from the three
decision classes considered. As can be seen from Fig. 2,
examples representing the three classes are rather dissimilar.
However, it is worth noting that due to a large variety
of appearance of vocal fold mass lesions, classification can
sometimes be difficult even for a trained physician [27].
Figure 3 exemplifies some results of the automated analysis.
Images were automatically segmented into regions of
homogeneous color. The region boundaries found are shown
with thin black curves. Two lines in the image on the right-hand
side of Fig. 3 delineate the polyp. Lines in the other
two images approximate the edges of vocal folds. As can
be seen, one of the lines was misplaced. When categorizing
the same set of images into seven classes (one healthy and
six pathological), a classification accuracy of over 80%
was reported [28]. Image texture, distribution of color, and
geometry of edges of vocal folds were the types of features
used for the categorization. The investigation showed that
color was among the most discriminative types of features.
Hanson et al. [29] used color, an average value of the
normalized red component r given by Eq. 1, to quantify the
degree of erythema:
$$r = \frac{R}{R + G + B} \qquad (1)$$
where R, G, and B are three components of the color
images recorded by the color CCD camera. Five different
areas were manually selected from each laryngeal image to
estimate the r component. The value of the r component
computed for normal subjects was compared to the r
component values computed for patients with chronic laryngitis.
The examination showed that the r values for
patients with chronic posterior laryngitis were significantly
higher than the r values computed for normal larynges. It is
worth noting that the camera was color-balanced before
each recording. However, variations in illumination,
geometry, and appearance of vocal folds have not been
taken into consideration.
Fig. 1 Images obtained from the same patient by
videostroboscopy (left and middle) and direct
microlaryngoscopy (right)
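The normalized red component used to quantify erythema can be illustrated with a short sketch. It assumes an RGB image stored as a NumPy array and a boolean mask marking a manually selected area; the function name, the synthetic image, and the mask are hypothetical and are not taken from Hanson et al. [29].

```python
import numpy as np

def mean_normalized_red(image, mask):
    """Average r = R / (R + G + B) over the pixels selected by a boolean mask."""
    rgb = image.astype(np.float64)
    total = rgb.sum(axis=2)
    valid = mask & (total > 0)            # skip black pixels to avoid 0/0
    r = rgb[..., 0][valid] / total[valid]
    return float(r.mean())

# Synthetic 4x4 image (assumption): a reddish patch on a gray background.
image = np.full((4, 4, 3), 100, dtype=np.uint8)
image[1:3, 1:3] = (200, 50, 50)
patch = np.zeros((4, 4), dtype=bool)
patch[1:3, 1:3] = True

r_patch = mean_normalized_red(image, patch)        # reddish area
r_background = mean_normalized_red(image, ~patch)  # gray area
print(r_patch, r_background)
```

Because r is a ratio, it is insensitive to a uniform scaling of brightness, but it does not compensate for changes in illumination color or viewing geometry.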
Imaging is successfully used in indirect autofluorescence
and fluorescence (5-ALA-induced PPIX, for example)
laryngoscopy [30–34]. Autofluorescence laryngoscopy
is based on the fact that normal cells emit green fluorescence
when exposed to blue light, while precancerous or
cancerous lesions display a significant loss of green
fluorescence and appear reddish. Autofluorescence is usually
induced by filtered blue light of a xenon lamp or light of a
blue laser. A color CCD camera is usually used for
recording. Autofluorescence and fluorescence imaging
helps to detect borderlines between tumors and healthy
tissue. It is argued that autofluorescence endoscopy shows
greater sensitivity in the detection of precancerous or
cancerous lesions, compared to standard laryngomicroscopy
[32]. However, the image analysis procedures used
are limited to image visualization. Quantification of color,
texture, and shape of lesions and normal tissue could help
in more accurate categorization of lesions as well as in the
follow-up procedures.
Hsiao et al. [35, 36] and Shau et al. [37] used color
Doppler ultrasound imaging to assess laryngeal phonation
function non-invasively. Vocal fold mucosal wave velocity
and elasticity of vocal folds were quantified on the basis of
color Doppler imaging. The mean vocal fold displacement
velocity was assessed from the pseudocolor codes of the
ultrasound image. A commercially available high resolu-
tion ultrasound scanner was used for imaging. The authors
pointed out that the technique was applicable for the
evaluation of the phonation function affected by vocal
disorders that alter the vibration of the mucosa. Ultrasonic
imaging was also used by Niimi et al. [38] to obtain
quantitative measurements of the vertical movement of the
larynx. A linear type ultrasonic tomograph with 128
ultrasound emitting elements aligned on a straight line
served as an imaging device, ensuring a spatial resolution
of 1 mm.
In [39], image analysis-based procedures were used to
make relative measurements of some laryngeal parameters.
Analysis of image sequences
Videostroboscopy
Videostroboscopy is one of the standard methods used to
examine moving objects. Laryngo-videostroboscopy is
used extensively for the analysis of vocal folds and in the
clinical practice for diagnosing voice disorders [40–44].
For example, videostroboscopy is a well-established tech-
nique for measuring the glottal gap or examining the glottic
closure [41, 45, 46]. Laryngo-videostroboscopy is able to
reveal a number of abnormalities, including abnormalities
of laryngeal structure, absence of vibration, and vibratory
asymmetry. Flashing light is used to illuminate an object in
stroboscopy. When the flashes are synchronized with the
vocal fold vibrations, a stationary view of the vocal folds is
obtained.
However, single-flash-timing laryngeal videostroboscopy
is effective only when the vocal fold vibrations exhibit
a single fundamental frequency.
Multiple tones (fundamental frequencies) may be
recorded in the presence of some diseases, such as polyps,
nodules, or cysts [47]. In such cases, a clear view of the
vibrating vocal folds cannot be obtained with
single-flash-timing laryngeal videostroboscopy. A
multiple-flash-timing laryngeal videostroboscopy technique
was proposed by Deguchi et al. [47] to deal with such cases.
Multiple light emitting diodes (LED) are used as illumination
sources.
Fig. 2 Images from the nodular
(left), diffuse (middle), and
healthy (right) classes
Fig. 3 Images illustrating the
results of automated analysis
Schade et al. [48] proposed using the short-interval
color-filtered double strobe flash-stroboscopy to obtain
quantitative measurements of vocal fold movements. With
this technique a double exposure is accomplished in each
video frame. Thus, information about the vocal fold
velocity is available from a single image obtained with this
technique.
Among other issues, the usefulness of laryngeal trans-
illumination in combination with stroboscopy for auto-
mated endoscopic images-based quantitative measurement
of glottal area function was studied in [49]. The results
showed that the utilization of color information (available
due to transillumination) facilitates automated quantitative
measurements of the glottal area function and the detection
of the glottal contour area. In [50] image sequences
recorded with the stroboscopy system were used to mea-
sure the glottic angle and the angular velocities of vocal
fold abduction and adduction. The authors indicated that
semi-automated edge tracking would be an important
improvement of the technique.
In clinical practice, decision-making is quite often based
on subjective evaluation of video data. Quantitative mea-
sures of motion and geometry of vocal folds can provide
objective information and may be useful in planning
medical treatment and tracing progress over time. Video-
strobokymography introduced in [45] creates multiple
kymograms (multi-line kymography) based on multiple
lines of interest from recorded successive frames of stro-
boscopic video images, and enables calculation of some
quantitative parameters. Two quantitative parameters,
namely open quotient and asymmetry index, are calculated
from video-strobokymograms. To facilitate the analysis,
images are rotated to make the glottal axis vertical before
creating a kymogram. To create a kymogram from a
sequence of images, one line (always the same one) is
taken from each image. The lines are then combined into a
new image showing the time course of the fold vibrations.
Figure 4 shows an example of a video-strobokymogram
obtained with the EndoSTROB device (XION medical).
A mucosal wave and a clear closure phase are
clearly recognizable.
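The construction of a kymogram described above can be sketched in a few lines. The sketch assumes grayscale frames stored in one NumPy array of shape (frames, height, width), with the glottal axis already rotated to vertical so that a horizontal image row crosses the glottis; the function names and the synthetic data are illustrative only.

```python
import numpy as np

def kymogram(frames, row):
    """Take the same image row from every frame and stack the rows over time.

    The result has shape (n_frames, width): time runs downwards.
    """
    frames = np.asarray(frames)
    return frames[:, row, :]

def multi_line_kymograms(frames, rows):
    """Multi-line kymography: one kymogram per selected row."""
    return {row: kymogram(frames, row) for row in rows}

# Synthetic sequence (assumption): a dark glottal gap whose width oscillates.
n_frames, height, width = 8, 5, 7
frames = np.full((n_frames, height, width), 255, dtype=np.uint8)
centre = width // 2
for t in range(n_frames):
    half_width = t % 4                       # 0, 1, 2, 3, 0, 1, 2, 3
    frames[t, :, centre - half_width:centre + half_width] = 0

k = kymogram(frames, row=2)
gap_width = (k == 0).sum(axis=1)             # glottal width on that line per frame
print(k.shape, gap_width)
```

Counting the dark pixels in each kymogram row gives the glottal width on the selected line over time, which is the kind of signal from which parameters such as the open quotient can be derived.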
Videokymography
A special CCD camera-based video system for videoky-
mography was developed [51, 52]. The system can function
in two modes. In the first mode, the system functions as an
ordinary video camera. In the second (line-imaging)
mode, videokymographic images are recorded. Thus, vocal
fold vibration frequency and amplitude, left–right asymmetry,
and open and closed phases of the glottal cycle can
be quantied. One drawback of the system is that only a
single kymogram can be obtained during one examination.
A new-generation videokymographic system has been
recently introduced [53, 54]. Two CCD sensors are used to
provide simultaneously a laryngoscopic and kymographic
image for clinical vocal fold examination. A color CCD
camera is used for obtaining a laryngoscopic image, while
a high-speed line-scan CCD is used to capture a kymo-
graphic image at a selected position, which is shown on the
laryngoscopic image. Since the color CCD camera is not
fast enough, the laryngoscopic image appears blurred,
while the kymographic image is of high quality, high
signal-to-noise ratio, and high spatial and temporal
resolution.
In [55], an approach to quantitative characterization of
videokymographic images was presented. An active con-
tour model-based algorithm tracks the vibration wave in a
videokymographic image and quantifies its properties in
terms of parameters. The parameters considered are: the
amplitude and period ratios between the right and left vocal
fold, and the ratio between the opening and the closing
phases. Verdonck-de Leeuw et al. [56] studied the effect of
irregular vocal fold vibration on voice quality by simulta-
neously recording and analyzing videokymographic images
and speech signals.
Digital kymography
Vocal fold vibrations occurring during voice production
can be observed with a relatively high spatial and temporal
resolution by applying high-speed imaging techniques [57].
Digital high-speed imaging systems were introduced in
the 1980s [58–61]. A solid endoscope is usually attached to a
high-speed camera and used to illuminate the target and to
collect the reflected light. In another imaging setup, suggested
by researchers from the University of Tokyo, a
flexible fiberscope inserted through the nasal cavity is used
[60, 62, 63]. Such a system allows for examining the vocal
fold vibrations during running speech, including consonants.
However, images recorded using a solid endoscope-based
imaging system are of higher quality [62].
Fig. 4 An example of a video-strobokymogram obtained with the
EndoSTROB device (courtesy: XION medical)
At present, videokymography and high-speed imaging
(also referred to as high-speed video or high-speed glot-
tography) are the only available techniques for directly
studying aperiodic vibrations of vocal folds [64]. Such
studies are usually performed by creating and analyzing
kymograms. Kymograms obtained (using the technique
explained in Videostroboscopy) from a high-speed video
are usually called digital kymograms. Kymography methods
allow for the condensation and quantification of
information generated by a high-speed imaging system. It
is evident that the analysis of the huge amount of information
available from high-speed imaging may help clinicians
in gaining a better insight into the effects various
treatments have on vocal folds [65].
By using appropriate image analysis techniques a
parametric description of the vocal fold dynamics can be
obtained from digital kymograms. Digital images lend
themselves to calculating various quantitative measures
characterizing the vocal fold dynamics. Parametric
description of vocal fold dynamics is important in clinical
investigation of laryngeal pathologies, and is usually done
by finding the glottal area via image segmentation and
plotting the glottal area as a function of time, i.e. computing
the glottal area waveform (GAW). This approach is often
criticized on the basis that asymmetries between the
oscillation dynamics of the left and the right vocal fold are
not taken into consideration [66]. Single-line or multi-line
kymography [67], evaluating the time-dependent distance
of a specific point (or points) on the vocal fold edge to the
longitudinal glottal axis (experimental 1D trajectory or
trajectories), overcomes this drawback. Different symmetry/
asymmetry measures can be used to assess the oscillations
of vocal folds [68]. For example, vocal fold vibration
irregularities were studied in [69] based on high-speed
video and digital kymography. Two types of irregularities
were studied: a vibration frequency difference between
the right and left vocal folds (horizontal asymmetry) and
the vertical asymmetry, measured on one side between the
ventral and dorsal fold third. The aim was to obtain independent
quantitative measurements for voice therapy
planning and control. However, the symmetry measures did
not incorporate details about the time-varying shape of the
vocal fold edges [66].
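The difference between the GAW and the left and right 1D trajectories can be made concrete with a small sketch. It assumes that segmentation has already produced one boolean glottis mask per frame, that the glottal axis is a vertical image column, and that the glottis on the chosen row is contiguous and touches the axis, so that counting glottal pixels on each side gives the edge distance. All names and the synthetic masks are hypothetical.

```python
import numpy as np

def glottal_area_waveform(masks):
    """Glottal area (pixel count) in every frame; masks has shape (n_frames, H, W)."""
    masks = np.asarray(masks, dtype=bool)
    return masks.reshape(masks.shape[0], -1).sum(axis=1)

def edge_trajectories(masks, row, axis_col):
    """Distance of the two glottal edges from the axis column along one image row.

    'Left' and 'right' refer to image sides here, not to the patient's anatomy.
    """
    lines = np.asarray(masks, dtype=bool)[:, row, :]
    left = lines[:, :axis_col].sum(axis=1)
    right = lines[:, axis_col:].sum(axis=1)
    return left, right

# Synthetic asymmetric vibration (assumption): the right edge moves further.
left_width, right_width = [0, 1, 2, 1], [0, 2, 3, 2]
masks = np.zeros((4, 5, 8), dtype=bool)
for t, (l, r) in enumerate(zip(left_width, right_width)):
    masks[t, :, 4 - l:4 + r] = True

gaw = glottal_area_waveform(masks)
left, right = edge_trajectories(masks, row=2, axis_col=4)
print(gaw, left, right)
```

The GAW only reports the summed opening, whereas the two trajectories expose the amplitude difference between the sides, which is the asymmetry information the GAW discards.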
Lohscheller et al. [70] identified the following main
requirements for an image analysis system designed to
investigate vocal fold vibrations based on high-speed video:
1. robust segmentation of glottis;
2. edge identication for the left and the right vocal fold,
aiming to exploit this information for distinguishing
between the influences of the left and right vocal fold
on the laryngeal dynamics (asymmetry of the oscilla-
tion directions, velocities, and amplitudes);
3. high edge detection accuracy;
4. high computation speed;
5. minimal manual intervention.
One advantage of high-speed video endoscopy over
videokymography is the possibility to remove errors caused
by involuntary motion of the endoscope. For example,
Deliyski [71, 72] presents an image analysis-based tech-
nique to compensate for endoscopic motion in high-speed
video. The technique is based on the assumption that the
difference between the endoscopic and vocal fold dynamics
is sufficient to accurately estimate the endoscopic
motion in relation to the vocal folds.
Though expensive, high-speed video endoscopy systems
are commercially available. However, extraction, quantification,
and presentation of essential diagnostic information
to the clinician in an effective interpretable form are
still rather problematic [71]. Accurate extraction of glottal
area, compact quantitative representation of information
available from kymograms, accuracy of measurements, and
calibration are issues important for all the techniques dis-
cussed above. Therefore, we review works related to these
issues in the following sections.
Detecting glottal area
Simple thresholding techniques are often used in the image
segmentation process to extract the glottal area [66, 73–75].
Such image thresholding often results in errors, especially
when the image quality is not high. False and multiple
vocal fold contours may appear in different frames.
Figure 5 presents several examples of images taken from
the same image sequence recorded with a laryngo-video-
stroboscope. It is evident that due to illumination variation
and a large variability of the vocal fold appearance, simple
thresholding techniques will often fail, especially in cases
when the opening is narrow. Determining the presence of
glottal area in an image automatically is not a trivial issue.
One also has to bear in mind that vocal folds can be
occluded by other structures. In [76], a semi-automatic
technique based on gray level derivative is applied to
detect the glottal edge. When parameters of the technique
are determined, the edge detection and calculation of the
glottal area is performed automatically.
More advanced techniques, like region growing [77, 78],
active contour models or snakes, are sometimes applied
for extracting the glottal area [79, 80]. The approach
proposed in [77] is based on image thresholding and
morphological operations followed by region-growing.
A histogram-based image thresholding is applied,
assuming that the image histogram follows the Rayleigh
distribution. The thresholding technique is similar to that of
Otsu [81], except that the threshold is determined by
minimizing the pixel classification error instead of maximizing
the between-cluster variance (when partitioning the
histogram into two clusters) as in Otsu's technique. The
image processing technique developed in [70] for extract-
ing vocal fold edges from high-speed video sequences is
based on the seeded region-growing algorithm. Having the
edges, the so-called vibration trajectories characterizing
deflections of the vocal folds perpendicular to the glottal
main axis are defined and used in the analysis. The success
of the image segmentation process highly depends on the
choice of an appropriate region homogeneity criterion. The
technique requires the manual definition of one or several
seed points. The manual intervention makes it possible to achieve a
high processing speed amounting to 98 images/s. In total
186,000 images were analyzed. The authors claim that the
segmentation accuracy exceeds the one obtained by human
experts.
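Seeded region growing, as used for glottis segmentation, can be sketched generically as follows. This is not the implementation of [70]: the homogeneity criterion (absolute gray-level difference from the seed value), the 4-connectivity, and the tolerance are assumptions chosen for illustration.

```python
from collections import deque
import numpy as np

def region_grow(image, seed, tolerance):
    """Grow a 4-connected region from `seed` (row, col).

    A neighbour joins the region when its gray level differs from the seed's
    gray level by at most `tolerance` (an assumed homogeneity criterion).
    """
    image = np.asarray(image, dtype=np.int64)
    height, width = image.shape
    seed_value = image[seed]
    region = np.zeros((height, width), dtype=bool)
    region[seed] = True
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < height and 0 <= nx < width and not region[ny, nx]:
                if abs(image[ny, nx] - seed_value) <= tolerance:
                    region[ny, nx] = True
                    queue.append((ny, nx))
    return region

# Synthetic frame (assumption): bright tissue, a dark cross-shaped glottis,
# and an unconnected dark pixel standing in for a shadow.
image = np.full((7, 7), 200)
image[2:5, 3] = 20
image[3, 2:5] = 25
image[0, 0] = 20

glottis = region_grow(image, seed=(3, 3), tolerance=10)
print(int(glottis.sum()), bool(glottis[0, 0]))
```

Unlike global thresholding, the grown region excludes dark pixels that are not connected to the seed, which is one reason the choice of seed points and of the homogeneity criterion matters so much.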
Active contour models are based on the minimization of
some energy function, usually exploiting information on
gray-level gradient. Additional information expressing our
prior knowledge can also be incorporated into such func-
tions [80]. In [82], the active contour-based algorithm has
been used to detect the pseudo-glottis and track its defor-
mations within a high-speed image sequence of the vibra-
tion pattern of the substitute voice generating element. The
algorithm is initialized with a region of interest extracted
by a region-growing algorithm. Active contour models,
however, are rather time-consuming. A rather generic
glottal area detection algorithm has been recently proposed
in [83]. The algorithm is based on several image processing
techniques, and does not require any initialization.
Characterizing vibrations of vocal folds quantitatively
Recently, high-speed imaging combined with advanced
image analysis is considered as the most promising
approach to explore the relation between voice disorders
and vibrations of vocal folds [61, 70, 84–86]. However,
there are no standard procedures to characterize vocal fold
vibrations and in clinical practice high-speed videos are
largely interpreted subjectively. To characterize the
vibrations of vocal folds, several positions on the vocal fold
edges are selected, usually dorsal, medial, and ventral [87]
(see Fig. 6), and trajectories (experimental trajectories) for
these points are extracted and used for further characterization.
Often, only the medial trajectory is utilized [88–90].
Sometimes the GAW is used instead of the trajectories.
The regularity and symmetry of vocal fold vibrations are
usually characterized quantitatively by computing various
measures such as amplitude and phase asymmetry, and
instability of the fundamental frequency [69, 87, 91].
Yan et al. [74, 92] have recently suggested using a
Hilbert transform-based characterization. To describe the
experimental trajectory u(t), the complex signal φ(t) is
defined as
$$\phi(t) = u(t) + jH\{u(t)\} \qquad (2)$$
where j is the imaginary unit and H{u(t)} stands for the
Hilbert transform of u(t):
$$H\{u(t)\} = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{u(\tau)}{\tau - t}\, d\tau \qquad (3)$$
Thus, the Hilbert transform is obtained by convolving
u(t) with the time function -1/(πt), i.e. by filtering with a
90° phase-shift filter or quadrature filter. Having the
Hilbert transform, the envelope r(t) [88] or the Nyquist
plot [92] is computed to characterize the dynamics of
vocal folds:
$$r(t) = \sqrt{\{\Re[\phi(t)]\}^2 + \{\Im[\phi(t)]\}^2} \qquad (4)$$
where ℜ[·] and ℑ[·] stand for the real and the imaginary part
of the complex signal.
Fig. 5 Examples of images taken from the same image sequence recorded with a laryngo-videostroboscope
Fig. 6 A frame of vocal folds obtained by videostroboscopy
In [92], the Nyquist plot-based analysis is used to
characterize the dynamics of vocal folds. The approach is
applied to both the normalized glottis area waveform
extracted from high-speed digital images and the acoustic
signal. The Nyquist plot or the analytic phase trace maps an
instantaneous trace of the vibratory signal. The phase trace
repeats itself from period to period for a periodic signal.
Scattering in the Nyquist plot is used as a measure of
irregularity of the vocal fold vibrations.
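A minimal numerical sketch of the Hilbert transform-based characterization is given below. It builds the complex signal with the usual FFT construction (negative frequencies zeroed, positive ones doubled); the sign of the imaginary part depends on the convention chosen for the Hilbert transform and does not affect the envelope. The test signal, sampling, and function name are assumptions made for illustration.

```python
import numpy as np

def analytic_signal(u):
    """Complex signal whose real part is u and whose imaginary part is a
    90-degree phase-shifted copy of u (FFT-based construction)."""
    u = np.asarray(u, dtype=float)
    n = u.size
    spectrum = np.fft.fft(u)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep the DC term
    if n % 2 == 0:
        weights[1:n // 2] = 2.0           # double the positive frequencies
        weights[n // 2] = 1.0             # keep the Nyquist term
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

# Synthetic trajectory (assumption): a pure 5 Hz oscillation of amplitude 2,
# sampled 200 times over exactly one second.
t = np.arange(200) / 200.0
u = 2.0 * np.cos(2 * np.pi * 5 * t)

phi = analytic_signal(u)
envelope = np.abs(phi)                    # sqrt(Re^2 + Im^2)
nyquist_x, nyquist_y = phi.real, phi.imag # coordinates of the Nyquist plot
print(envelope.min(), envelope.max())
```

For a strictly periodic signal the points (nyquist_x, nyquist_y) retrace the same closed curve every period, so the spread of the points around that curve can serve as an indication of irregular vibration.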
An interesting approach to the characterization of vocal
fold vibrations has been recently proposed by Lohscheller
et al. [84, 93]. The authors suggest representing vocal fold
vibrations by a phonovibrogram (PVG), a 2D image of
distances of vocal fold edges from the glottal main axis.
The distances are measured at M points equidistantly
spread over the main axis between the posterior and the
anterior glottal ending points. One image frame generates
one column in the 2D image. The image consists of
2M + 1 rows: one middle point joining the M distance
values to the edge of the left vocal fold and the M distance
values to the edge of the right vocal fold. PVG images are
considered as vocal fold vibration fingerprints, as a specific
vibration pattern creates
a specific geometrical form within a PVG. The authors
suggest characterizing the PVG geometry by ten regression
lines represented by a set of ten angles.
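The layout of a PVG-like image can be sketched as a simple array operation. The sketch assumes that the M edge-to-axis distances per frame are already available for each fold; the ordering of the rows (left fold mirrored above the middle row, right fold below) and the zero value of the middle row are assumptions made for illustration, not details confirmed by the text above.

```python
import numpy as np

def phonovibrogram(left_dist, right_dist):
    """Arrange edge-to-axis distances into an image with 2M + 1 rows.

    left_dist, right_dist: arrays of shape (n_frames, M).
    Returns an array of shape (2M + 1, n_frames), one column per frame.
    """
    left = np.asarray(left_dist, dtype=float)
    right = np.asarray(right_dist, dtype=float)
    n_frames, m = left.shape
    middle = np.zeros((n_frames, 1))      # assumed value of the middle point
    columns = np.hstack([left[:, ::-1], middle, right])
    return columns.T

# Two frames, M = 3 measurement points per fold (hypothetical numbers).
left = [[1, 2, 3], [0, 1, 2]]
right = [[4, 5, 6], [1, 1, 1]]
pvg = phonovibrogram(left, right)
print(pvg.shape)
```

Displaying such an array as a gray-level or color image over many frames yields a pattern whose geometry reflects the vibration of both folds along the whole glottal axis.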
Granqvist and Lindestad [94] proposed a Fourier analy-
sis-based technique for estimating and visualizing vibration
frequencies in the entire laryngeal area. Vibration frequencies
can be evaluated in a specific position defined by an
image pixel or a line in an image, as well as in all pixel
positions. To visualize oscillations, the estimated frequen-
cies are superimposed on an image frame by coloring pixels
corresponding to the oscillating parts of the larynx.
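The idea of estimating a vibration frequency in every pixel position can be sketched with a per-pixel FFT along the time axis. This is a generic illustration rather than the method of [94]; the amplitude threshold used to decide whether a pixel oscillates, the frame rate, and the synthetic sequence are assumptions.

```python
import numpy as np

def dominant_frequency_map(frames, frame_rate, min_amplitude=1.0):
    """Dominant temporal frequency (Hz) of every pixel.

    frames: array of shape (n_frames, H, W). Pixels whose strongest spectral
    component has an amplitude below `min_amplitude` are set to NaN.
    """
    frames = np.asarray(frames, dtype=float)
    n = frames.shape[0]
    centred = frames - frames.mean(axis=0)             # remove the static part
    amplitude = np.abs(np.fft.rfft(centred, axis=0)) * 2.0 / n
    freqs = np.fft.rfftfreq(n, d=1.0 / frame_rate)
    peak = amplitude.argmax(axis=0)
    freq_map = freqs[peak]
    peak_amplitude = np.take_along_axis(amplitude, peak[np.newaxis], axis=0)[0]
    freq_map[peak_amplitude < min_amplitude] = np.nan  # non-oscillating pixels
    return freq_map

# Synthetic sequence (assumption): 200 frames at 2000 frames/s, two oscillating
# pixels (200 Hz and 100 Hz) and two static pixels.
frame_rate, n_frames = 2000.0, 200
t = np.arange(n_frames) / frame_rate
frames = np.full((n_frames, 2, 2), 100.0)
frames[:, 0, 0] += 50.0 * np.sin(2 * np.pi * 200.0 * t)
frames[:, 0, 1] += 20.0 * np.sin(2 * np.pi * 100.0 * t)

freq_map = dominant_frequency_map(frames, frame_rate)
print(freq_map)
```

Mapping the finite entries of freq_map to colors and overlaying them on a frame gives a visualization in which only the oscillating parts of the image are colored.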
Quantitative characterization in multi-line kymography
Multi-line digital kymography is performed by using many
cross-section lines of the glottis simultaneously [65, 67, 69,
95, 96]. Multi-line kymography is used to reveal glottic
movements differing between the anterior and posterior
commissure. Multi-line kymography is the usual way to
represent the temporal and spatial information on the
vibration amplitude of vocal fold edges. However, this
representation of the vibration amplitude is rather redundant
and is easily compromised by measurement errors [97].
Liu et al. [97] presented a rather efficient way to represent
information obtained during multi-line kymography. The authors
suggested collecting the fold vibration amplitude measurements
a_kl made at different time moments l = 1, 2, ..., L and in
different locations k = 1, 2, ..., K into a K × L matrix A. The
matrix A can be represented as
$$A = \sum_{i=1}^{K} \sigma_i u_i v_i^T \qquad (5)$$
with u_i and v_i being the left and the right singular vectors of A
and σ_i the corresponding singular value. The vectors u_i
and v_i reflect the spatial and temporal behavior of the fold
vibrations and are called by the authors the spatial and the
temporal eigenfolds. The authors demonstrated that the first
spatial eigenfold reflects the average shape of the vocal
folds, the second eigenfold captures the closing pattern of
the folds, while the third spatial eigenfold reflects the
motion of the vocal folds in the longitudinal direction. The
first temporal eigenfold captures the vibration amplitude
differences between the left and right fold. It is worth
noting that when approximating A with M < K ≤ L singular
vectors, an optimal approximation is obtained in the
sum of the squared error sense.
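The decomposition of the amplitude matrix and the optimality of the truncated approximation can be checked numerically with NumPy's singular value decomposition. The matrix below is synthetic (an arch-shaped amplitude profile modulated in time plus a little noise), and the function names are hypothetical.

```python
import numpy as np

def eigenfolds(A):
    """Thin SVD of the K x L amplitude matrix A.

    Returns (U, s, V): columns of U are the spatial singular vectors, columns
    of V the temporal ones, and s holds the singular values in decreasing order.
    """
    U, s, Vt = np.linalg.svd(np.asarray(A, dtype=float), full_matrices=False)
    return U, s, Vt.T

def approximate(A, m):
    """Approximation of A built from the m leading singular triplets."""
    U, s, V = eigenfolds(A)
    return (U[:, :m] * s[:m]) @ V[:, :m].T

# Synthetic measurements (assumption): K = 6 locations, L = 50 time moments.
rng = np.random.default_rng(0)
K, L = 6, 50
time = np.linspace(0.0, 1.0, L)
profile = np.sin(np.linspace(0.0, np.pi, K))
A = np.outer(profile, 1.0 + 0.5 * np.sin(2 * np.pi * 4 * time))
A += 0.01 * rng.standard_normal((K, L))

U, s, V = eigenfolds(A)
error = np.linalg.norm(A - approximate(A, 1))   # Frobenius norm of the residual
print(s.round(3), error)
```

The residual of the rank-m approximation equals the square root of the sum of the squared discarded singular values, which is the sense in which the truncated expansion is optimal.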
Combining imaging and other analysis techniques
A solid endoscope-based high-speed imaging system is
limited to studies of vocal fold vibrations in sustained
phonation. To study the vocal fold vibrations in the sus-
tained phonation and in the production of consonants in
running speech, Kiritani et al. [61, 62] used both the solid
endoscope and the fiberscope-based digital high-speed
imaging systems. High-speed image sequences, speech
waveforms, and EGG signals were utilized. Sakakibara
et al. [63] used a fiberscope-based high-speed imaging
system to study throat singing. High-speed image sequences,
sound and EGG waveforms, and inverse filtering were
utilized in the study. In [58, 98], high-speed image
sequences were compared with synchronized speech
waveforms and EGG signals. Koster et al. [99] synchro-
nously recorded a high-speed image sequence, sound and
EGG and used these three modalities to study voice onset.
The same three modalities have also been used by Bailly
et al. [100] to study the ventricular band vibration and its
interaction with the vocal fold vibration. Henrich et al. [21]
synchronized a high-speed image sequence with an EGG
signal and investigated the suitability of the differentiated
EGG signal for determining the glottal opening and closure.
Granqvist et al. [101], aiming to study the relationship
between the vocal fold vibrations and the associated
trans-glottal airflow, recorded high-speed video, EGG, oral
airflow, oral air pressure, and sound pressure (audio). The
glottal area extracted from the high-speed image sequence
was compared with inverse filtered airflow.
In [102], when studying Mongolian throat singing,
the kymographic images were manually compared frame
by frame to the sound signal and the high-speed images.
According to Wittenberg et al. [95], an important appli-
cation of simultaneous analysis of high-speed video and
audio signals is searching for glottalizations in the speech
signal. Hertegard and Gauffin [20], by using EGG, inverse
filtered flow waveform, and videostroboscopy, demonstrated
the correspondence between the phases of the
glottal vibration cycle observed in the image sequence and
the inverse filtered flow waveform.
Accuracy of measurements and calibration
Accuracy of measurements of vocal folds on videostrobo-
scopic images was studied in [103]. This included the
investigation of the magnitude of the error of measurement
of the surface areas. The glottal area and the area of the
lesion were studied. The study emphasized the necessity of
correction for differences in magnification and view angle
between images of two recordings, to make a correct
comparison between the images. It was also pointed out
that the measurement error is mainly due to the uncertainty
of the outline of areas with a fuzzy boundary.
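The effect of magnification on area measurements can be illustrated with a short sketch. It assumes that a reference length (for example, a vocal fold length measured in the same image) is available in both recordings; the function name and the numbers are hypothetical and are not taken from [103]. Because areas scale with the square of the linear magnification, dividing by the squared reference length removes the magnification factor. Differences in view angle are not handled here.

```python
def relative_area(area_px, reference_length_px):
    """Area expressed in units of the squared reference length (dimensionless)."""
    if reference_length_px <= 0:
        raise ValueError("reference length must be positive")
    return area_px / reference_length_px ** 2

# Recording 1: area of 400 px^2, reference length of 100 px (made-up values).
first = relative_area(400.0, 100.0)

# Recording 2: same anatomy at 1.5x magnification.
# Lengths grow by 1.5, areas by 1.5 ** 2 = 2.25.
second = relative_area(400.0 * 1.5 ** 2, 100.0 * 1.5)

print(first, second)  # the two normalized areas agree despite the zoom
```

Comparing the raw pixel areas (400 versus 900) would wrongly suggest that the area had more than doubled between the two recordings.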
To obtain absolute metric spatial dimensions of laryngeal
structures, a high-speed camera system was combined with a
laser projection system [104, 105]. The dimensions of laryngeal
structures were determined on digital images. Image
segmentation was applied to extract the glottal area. The
precision of the measurements was limited by the
non-orthogonality of the glottal plane to the optical axis, the
imperfect parallelism of the two laser beams used, and the
relatively low spatial resolution of the camera [104].
A Radon/Hough transform-based technique for calibrating
high-speed image sequences, so that vocal fold measurements
can be obtained on a metric scale, was recently proposed
[106]. As in [104, 105], an endoscope is equipped with a
laser projection device projecting two parallel lines with a
known distance between them onto the vocal folds. The
distance is the information used for the calibration. The
Radon/Hough transform maps 2D images in such a way
that lines manifest themselves as peaks. The peaks are
detected using image processing procedures.
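The calibration idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure of [106]: the laser pixels are assumed to be already segmented, the function names are hypothetical, and the 2.0 mm line spacing is a made-up value. Each pixel votes for all lines passing through it in (theta, rho) space; the two parallel laser lines appear as two peaks at the same angle, and their separation in rho gives the pixel distance that corresponds to the known metric distance.

```python
import math

def hough_votes(points, n_theta=180):
    """Accumulate votes in (theta index, rho) space for a list of (x, y) pixels."""
    votes = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(t, rho)] = votes.get((t, rho), 0) + 1
    return votes

def mm_per_pixel(points, line_spacing_mm, min_gap_px=3):
    """Estimate the image scale from two parallel laser lines of known spacing."""
    votes = hough_votes(points)
    # Strongest peak: the first laser line.
    (t0, rho0), _ = max(votes.items(), key=lambda item: item[1])
    # Second line: same orientation, clearly separated from the first peak.
    same_angle = {rho: n for (t, rho), n in votes.items()
                  if t == t0 and abs(rho - rho0) >= min_gap_px}
    if not same_angle:
        raise ValueError("second laser line not found")
    rho1 = max(same_angle, key=same_angle.get)
    return line_spacing_mm / abs(rho1 - rho0)

# Synthetic laser pixels: two vertical lines, 20 pixels apart.
laser_pixels = [(10, y) for y in range(40)] + [(30, y) for y in range(40)]
scale = mm_per_pixel(laser_pixels, line_spacing_mm=2.0)  # 2.0 mm is an assumed spacing
glottal_length_mm = 120 * scale  # converts a hypothetical 120-pixel measurement
```

Restricting the second peak to the angle of the first one encodes the prior knowledge that the projected lines are parallel. With real images this assumption holds only approximately, so a tolerance on the angle would be needed.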
Discussion
Advanced image analysis procedures are used in laryngeal
diagnostics with increasing frequency. Nonetheless, the
image analysis procedures used are quite often limited to
image visualization, for example, in indirect autofluorescence
laryngoscopy. Quantification of color, texture, and
shape of lesions and normal tissue could help in a more
accurate categorization of lesions as well as in follow-up
procedures.
High temporal and horizontal spatial resolution is
achieved in the new generation videokymography system.
One limitation of videokymography is that vibration of the
anterior and posterior parts of the vocal folds cannot be
observed simultaneously. Another drawback of videokymography
is that errors caused by movements
of the endoscope cannot be corrected after recording. In
digital kymography, by contrast, image analysis-based
procedures are available for removing these errors.
High-speed imaging systems are still too expensive to be
widely used in clinical practice. However, regarding the
investigation of vocal fold vibrations, it seems that high-
speed imaging is the most promising approach. Currently,
the low spatial resolution, often amounting to 256 × 256
pixels or fewer, is identified as the main technical limitation of
high-speed imaging systems [70, 107]. A spatial resolution
of at least 1,024 × 1,024 pixels is desired. Higher temporal
resolution would also be very helpful.
The most common temporal resolution of 2,000–4,000
frames/s, currently used in research (in combination with
a spatial resolution of 256 × 256 pixels), is still not high
enough for accurate examination of vocal fold vibrations,
especially for phonations with a high fundamental frequency.
For a fundamental frequency of 200 Hz and higher,
only a few images are obtained for the open phase of a
glottal area pulse. Thus, the precision of the parameters
calculated from the image sequences is rather limited. Due
to the limited precision and the uncertainty arising in
the glottal area detection, it can be difficult to identify the
source of small variations in the values of the parameters. The
variations may be due to the measurement uncertainty
as well as to the treatment applied.
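The arithmetic behind this limitation is simple: the number of frames per glottal cycle is the frame rate divided by the fundamental frequency, and only a fraction of those frames falls into the open phase. The sketch below assumes an open quotient of 0.5 purely for illustration; the actual value varies between speakers and phonation types, and the function name is hypothetical.

```python
def frames_in_open_phase(frame_rate_fps, f0_hz, open_quotient=0.5):
    """Return (frames per glottal cycle, frames within the open phase)."""
    frames_per_cycle = frame_rate_fps / f0_hz
    return frames_per_cycle, open_quotient * frames_per_cycle

for fps in (2000, 4000):
    for f0 in (100, 200, 400):
        per_cycle, per_open = frames_in_open_phase(fps, f0)
        print(f"{fps} frames/s, F0 = {f0} Hz: "
              f"{per_cycle:.0f} per cycle, {per_open:.1f} in the open phase")
```

At 2,000 frames/s and a fundamental frequency of 200 Hz, one cycle spans 10 frames, of which about 5 belong to the open phase under the assumed open quotient. A single misdetected frame then changes a duration-based parameter by roughly 10% of the cycle.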
High-speed imaging and digital kymography allow for
extracting numerous image data-based parameters (fea-
tures) that can provide detailed characterization of the
temporal behavior of different parts of the glottis. The
characterization and discrimination power of the parame-
ters can be explored by applying powerful data analysis
techniques. The parameters may enable automated dis-
crimination between various laryngeal disorders and sev-
eral stages of disorders. However, large data sets are
required to assess the validity and reliability of the
parameters. Since high-speed laryngeal imaging systems
are not widely used in clinical practice, gathering large data
sets is rather problematic. Moreover, there are no generally
accepted clinical protocols regarding the use of high-speed
imaging data in laryngology.
Currently, high-speed imaging is black and white.
Future color high-speed imaging will provide even more
information for studying laryngeal disorders. However,
color imaging requires an increased frame rate and an
increased amount of light. An increased amount of light
may lead to burns in the patient's oral cavity.
The accuracy and robustness of glottal area detection is
an important issue in quantitative characterization of vocal
fold vibrations, since the robustness of the characterization
is dependent on the robustness of the glottal area detection.
Due to variation in illumination and a large variability of
the appearance of vocal folds, simple thresholding tech-
niques often fail, especially in cases when the glottal
opening is narrow. Therefore, it is difficult to correctly
determine the instants of glottal gap opening and closure. To
achieve the desired accuracy, semi-automatic glottal area
detection procedures are often utilized, meaning that glottal
area detection techniques sometimes require manual setting
of some parameter values. Consequently, the results of area
detection can be highly sensitive to the operator's choices.
The results of glottal area detection are also dependent on
the distance between the vocal folds and the endoscope. If
procedures for the compensation of endoscope motion are
not applied, the absolute values of the glottal area cannot
be compared.
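The sensitivity of the detected area to a manually chosen parameter can be shown with a toy example. The sketch below uses a synthetic image row profile (a dark slit with a soft intensity ramp at its edges) and a fixed gray-level threshold; the frame generator, the function names, and the two threshold values are hypothetical and serve only to illustrate the effect.

```python
def synthetic_frame(width=21, height=11, half_gap=2.0):
    """Dark vertical slit with soft edges: intensity rises with distance from the midline."""
    mid = width // 2
    row = [min(255, int(40 + 30 * max(0.0, abs(x - mid) - half_gap)))
           for x in range(width)]
    return [list(row) for _ in range(height)]

def glottal_area_px(frame, threshold):
    """Count pixels darker than the threshold (simple thresholding)."""
    return sum(1 for row in frame for value in row if value < threshold)

frame = synthetic_frame()
area_strict = glottal_area_px(frame, threshold=60)   # only the darkest core
area_loose = glottal_area_px(frame, threshold=110)   # includes the soft edge
print(area_strict, area_loose)
```

With this profile the two thresholds give 55 and 99 pixels for the same frame, so the operator's choice alone changes the measured area by 80%. Narrow glottal openings, where the edge region dominates the area, are affected most.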
Due to a large number of image frames usually recorded
in one sequence during high-speed imaging, fully auto-
mated procedures are highly desirable. Fully automated
procedures, however, require very robust glottal area
detection. This is a complicated task. Due to partial
occlusion of vocal folds by other laryngeal structures, it is
sometimes even impossible to estimate the glottal area
from the images. Advanced data analysis techniques can be
used to provide a value of some certainty measure
regarding glottal area measurement results for a given
image frame. Frames with low certainty values could be
checked manually or skipped, if possible. Thus, high spa-
tial resolution and advanced image analysis techniques can
increase the glottal area detection accuracy and robustness,
and reduce the necessity of manual human involvement.
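One way to picture such a certainty measure is sketched below. The measure used here, the normalized contrast between the mean gray level outside and inside the thresholded region, is an assumption chosen for simplicity and is not a method from the reviewed literature; the function names and the cut-off of 0.3 are likewise hypothetical.

```python
def frame_certainty(frame, threshold):
    """Contrast between background and detected region, scaled to [0, 1]."""
    inside = [v for row in frame for v in row if v < threshold]
    outside = [v for row in frame for v in row if v >= threshold]
    if not inside or not outside:
        return 0.0  # nothing detected, or everything detected: no evidence
    contrast = sum(outside) / len(outside) - sum(inside) / len(inside)
    return max(0.0, min(1.0, contrast / 255.0))

def frames_to_review(frames, threshold, min_certainty=0.3):
    """Indices of frames whose detection result should be checked or skipped."""
    return [i for i, frame in enumerate(frames)
            if frame_certainty(frame, threshold) < min_certainty]

clear = [[20, 20, 200, 200]]        # dark glottis, bright tissue
washed_out = [[90, 95, 105, 110]]   # hardly any contrast
occluded = [[150, 150, 150, 150]]   # no dark region visible
print(frames_to_review([clear, washed_out, occluded], threshold=100))
```

Only the first frame passes; the washed-out and occluded frames are returned for manual inspection. In a long high-speed sequence this keeps manual work limited to the frames where the automatic result is least trustworthy.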
Acknowledgments We acknowledge the support from The Agency
for International Science and Technology Development Programmes,
Lithuania (COST Action 2103). We acknowledge very helpful
comments from the anonymous reviewer.
Conflict of interest statement There is no conflict of interest.
References
1. Alberti PW (1996) The history of laryngology: a centennial
celebration. Otolaryngol Head Neck Surg 114:345–354
2. Mafee MF, Valvassori GE, Becker M (2005) Imaging of the
neck and head, 2nd edn. Thieme, Stuttgart
3. Uloza V, Saferis V, Uloziene I (2005) Perceptual and acoustic
assessment of voice pathology and the efficacy of endolaryngeal
phonomicrosurgery. J Voice 19:138–145
4. Gallivan KH, Gallivan GJ (2002) Bilateral mixed laryngoceles:
simultaneous strobovideolaryngoscopy and external video
examination. J Voice 16:258–266
5. Rumboldt Z, Gordon L, Ackermann RBS (2006) Imaging in
head and neck cancer. Curr Treat Options Oncol 7:23–34
6. Ruffing S, Struffert T, Reith AGW (2005) Imaging diagnostics
of the pharynx and larynx. Radiologe 45:828–836
7. Hasso AN, Tang T (1994) Magnetic resonance imaging of the
pharynx and larynx. Top Magn Reson Imaging 6:224–240
8. Hoorweg JJ, Kruijt RH, Heijboer RJ, Eijkemans MJ, Kerrebijn
JD (2006) Reliability of interpretation of CT examination of the
larynx in patients with glottic laryngeal carcinoma. Arch
Otolaryngol Head Neck Surg 135:129–134
9. Rubin JS, Lee S, McGuinness J, Hore I, Hill D, Berger L
(2004) The potential role of ultrasound in differentiating
solid and cystic swellings of the true vocal fold. J Voice
18:231–235
10. Schade G, Kothe C, Leuwer R (2003) Sonography of the
larynx: an alternative to laryngoscopy? HNO 51:585–590
11. Boyanov B, Hadjitodorov S (1997) Acoustic analysis of
pathological voices. A voice analysis system for the screening of
laryngeal diseases. IEEE Eng Med Biol Mag 16:74–82
12. Hadjitodorov S, Mitev P (2002) A computer system for acoustic
analysis of pathological voices and laryngeal diseases screening.
Med Eng Phys 24:419–429
13. Moran RJ, Reilly RB, de Chazal P, Lacy PD (2006) Telephony-
based voice pathology assessment using automated speech
analysis. IEEE Trans Biomed Eng 53:468–477
14. Umapathy K, Krishnan S, Parsa V, Jamieson DG (2005)
Discrimination of pathological voices using a time-frequency
approach. IEEE Trans Biomed Eng 52:421–430
15. Hadjitodorov S, Boyanov B, Teston B (2000) Laryngeal
pathology detection by means of class-specific neural maps.
IEEE Trans Inf Technol Biomed 4:68–73
16. Godino-Llorente JI, Gomez-Vilda P (2004) Automatic detection
of voice impairments by means of short-term cepstral
parameters and neural network based detectors. IEEE Trans Biomed
Eng 51:380–384
17. de Oliveira Rosa M, Pereira JC, Grellet M (2000) Adaptive
estimation of residue signal for voice pathology diagnosis. IEEE
Trans Biomed Eng 47:96–104
18. Gelzinis A, Verikas A, Bacauskiene M (2008) Automated
speech analysis applied to laryngeal disease categorization.
Comput Methods Programs Biomed 91:36–47
19. Verikas A, Gelzinis A, Bacauskiene M, Uloza V, Kaseta M
(2009) Using the patient's questionnaire data to screen laryngeal
disorders. Comput Biol Med 39:148–155
20. Hertegard S, Gauffin J (1995) Glottal area and vibratory patterns
studied with simultaneous stroboscopy, flow glottography, and
electroglottography. J Speech Hear Res 38:85–100
21. Henrich N, d'Alessandro C, Doval B, Castellengo M (2004) On
the use of the derivative of electroglottographic signals for
characterization of nonpathological phonation. J Acoust Soc Am
115:1321–1332
22. Ilgner JFR, Palm C, Schutz AG, Spitzer K, Westhofen M,
Lehmann TM (2003) Colour texture analysis for quantitative
laryngoscopy. Acta Otolaryngol 123:730–734
23. Haralick RM, Shanmugam K, Dinstein I (1973) Textural
features for image classification. IEEE Trans Syst Man Cybern
3:610–621
24. Gelzinis A, Verikas A, Bacauskiene M (2007) Increasing the
discrimination power of the co-occurrence matrix-based
features. Pattern Recognit 40:2367–2372
25. Verikas A, Gelzinis A, Bacauskiene M, Uloza V (2006)
Towards a computer-aided diagnosis system for vocal cord
diseases. Artif Intell Med 36:71–84
26. Verikas A, Gelzinis A, Valincius D, Bacauskiene M, Uloza V
(2007) Multiple feature sets based categorization of laryngeal
images. Comput Methods Programs Biomed 85:257–266
27. Poels PJP, de Jong FICS, Schutte HK (2003) Consistency of the
preoperative and intraoperative diagnosis of benign vocal fold
lesions. J Voice 17:425–433
28. Verikas A, Gelzinis A, Bacauskiene M, Uloza V (2006)
Integrating global and local analysis of colour, texture and
geometrical information for categorizing laryngeal images. Intern J
Pattern Recognit Artif Intell 20:1187–1205
29. Hanson DG, Jiang J, Chi W (1998) Quantitative color analysis
of laryngeal erythema in chronic posterior laryngitis. J Voice
12:78–83
30. Arens C, Reussner D, Woenkhaus J, Leunig A, Betz CS, Glanz
H (2007) Indirect fluorescence laryngoscopy in the diagnosis of
precancerous and cancerous laryngeal lesions. Eur Arch
Otorhinolaryngol 264:621–626
31. Arens C, Dreyer T, Glanz H, Malzahn K (2004) Indirect
autofluorescence laryngoscopy in the diagnosis of laryngeal cancer
and its precursor lesions. Eur Arch Otorhinolaryngol 261:71–76
32. Baletic N, Petrovic Z, Pendjer I, Malicevic H (2004)
Autofluorescent diagnostics in laryngeal pathology. Eur Arch
Otorhinolaryngol 261:233–237
33. Csanady M, Kiss JG, Ivan L, Jori J, Czigner J (2004) ALA
(5-aminolevulinic acid)-induced protoporphyrin IX fluorescence
in the endoscopic diagnostic and control of pharyngo-laryngeal
cancer. Eur Arch Otorhinolaryngol 261:262–266
34. Zargi M, Fajdiga I, Smid L (2000) Autofluorescence imaging in
the diagnosis of laryngeal cancer. Eur Arch Otorhinolaryngol
257:17–23
35. Hsiao TY, Wang CL, Chen CN, Hsieh FJ, Shau YW (2001)
Noninvasive assessment of laryngeal phonation function using
color doppler ultrasound imaging. Ultrasound Med Biol
27:1035–1040
36. Hsiao TY, Wang CL, Chen CN, Hsieh FJ, Shau YW (2002)
Elasticity of human vocal folds measured in vivo using color
doppler imaging. Ultrasound Med Biol 28:1145–1152
37. Shau YW, Wang CL, Hsieh FJ, Hsiao TY (2001) Noninvasive
assessment of vocal fold mucosal wave velocity using color
doppler imaging. Ultrasound Med Biol 27:1451–1460
38. Niimi S, Matsunaga A, Hirose H (1992) Ultrasonic observation
of the vertical movement of the larynx during phonation. In:
Proceedings of international congress on acoustics, vol 14,
Beijing, pp 1317–1318
39. Goncalves MI, Leonard R (1998) A hardware-software system
for analysis of video images. J Voice 12:143–150
40. Popolo PS, Titze IR (2008) Qualification of a quantitative
laryngeal imaging system using videostroboscopy and
videokymography. Ann Otol Rhinol Laryngol 117:404–412
41. Rihkanen H, Reijonen P, Lehikoinen-Soderlund S, Lauri ER
(2004) Videostroboscopic assessment of unilateral vocal fold
paralysis after augmentation with autologous fascia. Eur Arch
Otorhinolaryngol 261:177–183
42. Lee JS, Kim IE, Sung MW, Kim KH, Sung MY, Park KS (2001)
A method for assessing the regional vibratory pattern of vocal
folds by analysing the video recording of stroboscopy. Med Biol
Eng Comput 39:273–278
43. Woo P, Colton RH, Casper JK, Brewer DW (1991) Diagnostic
value of stroboscopic examination in hoarse patients. J Voice
5:231–238
44. Colton RH, Woo P, Brewer DW, Griffin B, Casper J (1995)
Stroboscopic signs associated with benign lesions of the vocal
folds. J Voice 9:312–325
45. Sung MW, Kim KH, Koh TY, Kwon TY, Mo JH, Choi SH, Lee
JS, Park KS, Kim EJ, Sung MY (1999) Videostrobokymography:
a new method for the quantitative analysis of vocal fold
vibration. Laryngoscope 109:1859–1863
46. Sulter AM, Schutte HK, Miller DG (1996) Standardized
laryngeal videostroboscopic rating: differences between untrained
and trained male and female subjects, and effects of varying
sound intensity, fundamental frequency, and age. J Voice
10:175–189
47. Deguchi S, Ishimaru Y, Washio S (2007) Preliminary evaluation
of stroboscopy system using multiple light sources for
observation of pathological vocal fold oscillatory pattern. Ann Otol
Rhinol Laryngol 116:687–694
48. Schade G, Hess M, Muller F, Kirchhoff T, Ludwigs M, Hillman
R, Kobler J (2002) Physical and technical elements of
short-interval, color-filtered double strobe flash-stroboscopy. HNO
50:1079–1083
49. Hess MM, Ludwigs M (2000) Strobophotoglottographic
transillumination as a method for the analysis of vocal fold vibration
patterns. J Voice 14:255–271
50. Dailey SH, Kobler J, Hillman RE, Tangrom K, Thananart E,
Mauri M, Zeitels SM (2005) Endoscopic measurement of vocal
fold movement during adduction and abduction. Laryngoscope
115:178–183
51. Svec JG, Schutte HK (1996) Videokymography: high-speed line
scanning of vocal fold vibration. J Voice 10:201–205
52. Schutte HK, Svec JG, Sram F (1998) First results of clinical
application of videokymography. Laryngoscope 108:1206–1210
53. Qiu QJ, Schutte HK (2006) A new generation videokymography
for routine clinical vocal fold examination. Laryngoscope
116:1824–1828
54. Qiu QJ, Schutte HK (2007) Real-time kymographic imaging for
visualizing human vocal-fold vibratory function. Rev Sci
Instrum 78:1–6
55. Manfredi C, Bocchi L, Bianchi S, Migali N, Cantarella G (2006)
Objective vocal fold vibration assessment from
videokymographic images. Biomed Signal Process Control 1:129–136
56. de Leeuw IMV, Festen JM, Mahieu HF (2001) Deviant vocal
fold vibration as observed during videokymography: the effect
on voice quality. J Voice 15:313–322
57. Eysholdt U, Tigges M, Wittenberg T, Proschel U (1996) Direct
evaluation of high-speed recordings of vocal fold vibrations.
Folia Phoniatr Logop 48:163–170
58. Kiritani S, Honda K, Imagawa H, Hirose H (1986) Simultaneous
high-speed digital recording of vocal fold vibration and speech
signal. In: Proceedings of the IEEE international conference on
acoustics, speech, and signal processing, vol 11, Tokyo,
pp 1633–1636
59. Kiritani S, Imagawa H, Hirose H (1988) High-speed digital
image recording for the observation of vocal cord vibration. In:
Vocal physiology: voice production, mechanism, and functions.
Raven Press, New York, pp 261–269
60. Kiritani S, Imagawa H, Hirose H (1990) Vocal cord vibration
and voice source characteristics: observations by a high-speed
digital recording. In: Proceedings of the international
conference on spoken language processing (ICSLP 90), Kobe,
pp 61–64
61. Kiritani S, Hirose H, Imagawa H (1993) High-speed digital
image-analysis of vocal cord vibration in diplophonia. Speech
Commun 13:23–32
62. Kiritani S, Hirose H, Imagawa H (1993) High-speed digital
image recording system for observing vocal cord vibration. Ann
Bull RILP 27:79–87
63. Sakakibara KT, Imagawa H, Konishi T, Kondo K, Murano EZ,
Kumada M, Niimi S (2001) Vocal fold and false vocal fold
vibrations in throat singing and synthesis of khoomei. In:
Proceedings of the international computer music conference
2001, Havana, Cuba, International Computer Music
Association, pp 135–138
64. Hertegard S (2005) What have we learned about laryngeal
physiology from high-speed digital videoendoscopy? Curr Opin
Otolaryngol Head Neck Surg 13:152–156
65. Kendall KA, Browning MM, Skovlund SM (2005) Introduction
to high-speed imaging of the larynx. Curr Opin Otolaryngol
Head Neck Surg 13:135–137
66. Schwarz R, Dollinger M, Wurzbacher T, Eysholdt U,
Lohscheller J (2008) Spatio-temporal quantification of vocal
fold vibrations using high-speed videoendoscopy and a
biomechanical model. J Acoust Soc Am 123:2717–2732
67. Tigges M, Wittenberg T, Mergell P, Eysholdt U (1999) Imaging
of vocal fold vibration by digital multi-plane kymography.
Comput Med Imaging Graph 23:323–330
68. Kim DY, Kim LS, Kim KH, Sung MW, Roh JL, Kwon TK, Lee
SJ, Choi SH, Wang SG, Sung MY (2003)
Videostrobokymographic analysis of benign vocal fold lesions. Acta Otolaryngol
123:1102–1109
69. Eysholdt U, Rosanowski F, Hoppe U (2003) Vocal fold
vibration irregularities caused by different types of laryngeal
asymmetry. Eur Arch Otorhinolaryngol 260:412–417
70. Lohscheller J, Toy H, Rosanowski F, Eysholdt U, Dollinger M
(2007) Clinically evaluated procedure for the reconstruction of
vocal fold vibrations from endoscopic digital high-speed videos.
Med Image Anal 11:400–413
71. Deliyski DD (2005) Endoscope motion compensation for
laryngeal high-speed videoendoscopy. J Voice 19:485–496
72. Deliyski D, Petrushev P (2003) Methods for objective assessment
of high-speed videoendoscopy. In: Proceedings of the 6th
international conference: advances in quantitative laryngology, voice
and speech research, AQL-2003, Hamburg, Germany, pp 1–16
73. Tao C, Zhang Y, Jiang JJ (2007) Extracting physiologically
relevant parameters of vocal folds from high-speed video image
series. IEEE Trans Biomed Eng 54:794–801
74. Yan Y, Ahmad K, Kunduk M, Bless D (2005) Analysis of
vocal-fold vibrations from high-speed laryngeal images using a Hilbert
transform-based methodology. J Voice 19:161–175
75. Yan Y, Bless D, Chen X (2005) Biomedical image analysis in
high-speed laryngeal imaging of voice production. In:
Proceedings of the 2005 IEEE engineering in medicine and biology
27th annual conference, Shanghai, pp 7684–7687
76. Larsson H, Hertegard S, Lindestad PA, Hammarberg B (2000)
Vocal fold vibrations: high-speed imaging, kymography, and
acoustic analysis: a preliminary report. Laryngoscope
110:2117–2122
77. Yan Y, Chen X, Bless D (2006) Automatic tracing of vocal-fold
motion from high-speed digital images. IEEE Trans Biomed
Eng 53:1394–1400
78. Wittenberg T, Moser M, Tigges M, Eysholdt U (1995)
Recording, processing, and analysis of digital high-speed
sequences in glottography. Mach Vis Appl 8:399–404
79. Allin S, Galeotti J, Stetten G, Dailey SH (2004) Enhanced snake
based segmentation of vocal folds. In: Proceedings of the IEEE
international symposium on biomedical imaging, Washington,
DC, IEEE, pp 812–815
80. Marendic B, Galatsanos N, Bless D (2001) A new active contour
algorithm for tracking vocal folds. In: Proceedings of the IEEE
international conference on image processing, Thessaloniki,
pp 397–400
81. Otsu N (1979) A threshold selection method from gray-level
histograms. IEEE Trans Syst Man Cybern 9:62–66
82. Lohscheller J, Dollinger M, Schuster M, Schwarz R, Eysholdt U,
Hoppe U (2004) Quantitative investigation of the vibration
pattern of the substitute voice generator. IEEE Trans Biomed
Eng 51:1394–1400
83. Osma-Ruiz V, Godino-Llorente JI, Saenz-Lechon N, Fraile R
(2008) Segmentation of the glottal space from laryngeal images
using the watershed transform. Comput Med Imaging Graph
32:193–201
84. Lohscheller J, Eysholdt U, Toy H, Dollinger M (2008)
Phonovibrography: mapping high-speed movies of vocal fold
vibrations into 2-d diagrams for visualizing and analyzing the
underlying laryngeal dynamics. IEEE Trans Med Imaging
27:300–309
85. Braunschweig T, Schelhorn-Neise P, Dollinger M (2008)
Diagnosis of functional voice disorders by using the high speed
recording technics. Laryngorhinootologie 87:323–330
86. Mortensen M, Woo P (2008) High-speed imaging used to detect
vocal fold paresis: a case report. Ann Otol Rhinol Laryngol
117:684–687
87. Dollinger M, Braunschweig T, Lohscheller J, Eysholdt U,
Hoppe U (2003) Normal voice production: computation of
driving parameters from endoscopic digital high speed images.
Methods Inf Med 42:271–276
88. Braunschweig T, Flaschka J, Schelhorn-Neise P, Dollinger M
(2008) High-speed video analysis of the phonation onset, with
an application to the diagnosis of functional dysphonias. Med
Eng Phys 30:59–66
89. Dollinger M, Hoppe U, Hettlich F, Lohscheller J, Schuberth S,
Eysholdt U (2002) Vibration parameter extraction from
endoscopic image series of the vocal folds. IEEE Trans Biomed Eng
49:773–781
90. Schwarz R, Hoppe U, Schuster M, Wurzbacher T, Eysholdt U,
Lohscheller J (2006) High-precision measurement of the vocal
fold length and vibratory amplitudes. IEEE Trans Biomed Eng
53:1099–1108
91. Mergell P, Herzel HP, Titze IR (2000) Irregular vocal-fold
vibration: high speed observation and modeling. J Acoust Soc
Am 108:2996–3002
92. Yan Y, Damrose E, Bless D (2007) Automatic tracing of
vocal-fold motion from high-speed digital images. J Voice 21:604–616
93. Lohscheller J, Doellinger M, McWhorter AJ, Kunduk M (2008)
Preliminary study on the quantitative analysis of vocal loading
effects on vocal fold dynamics using phonovibrograms. Ann
Otol Rhinol Laryngol 117:484–493
94. Granqvist S, Lindestad PA (2001) A method of applying Fourier
analysis to high-speed laryngoscopy. J Acoust Soc Am
110:3193–3197
95. Wittenberg T, Tigges M, Mergell P, Eysholdt U (2000)
Functional imaging of vocal fold vibration: digital multislice
high-speed kymography. J Voice 14:422–442
96. Eysholdt U, Rosanowski F, Hoppe U (2003) Measurement and
interpretation of irregular vocal fold vibrations. HNO
51:710–716
97. Liu L, Galatsanos N, Bless D (2002) A new approach for
analysis of vibrating vocal folds. In: Proceedings of the IEEE
international symposium on biomedical imaging, Washington,
DC, pp 589–592
98. Kiritani S, Niimi S, Imagawa H, Hirose H (1995) Vocal fold
vibrations associated with involuntary voice changes in certain
pathological cases. In: Vocal fold physiology: voice quality
control. Vocal Fold Physiology Series, Kurume, Japan,
pp 269–281
99. Koster O, Marx B, Gemmar P, Hess MM, Kunzel HJ (1999)
Qualitative and quantitative analysis of voice onset by means of
a multidimensional voice analysis system (mvas) using
high-speed imaging. J Voice 13:355–374
100. Bailly L, Henrich N, Webb M, Muller F, Licht AK, Hess M
(2007) Exploration of vocal-folds and ventricular-bands
interaction in singing using high-speed cinematography and
electroglottography. In: 19th international congress on acoustics,
Madrid, Spain, pp 1–6
101. Granqvist S, Hertegard S, Larsson H, Sundberg J (2003)
Simultaneous analysis of vocal fold vibration and transglottal airflow:
exploring a new experimental setup. J Voice 17:319–330
102. Lindestad PA, Sodersten M, Merker B, Granqvist S (2001)
Voice source characteristics in Mongolian throat singing
studied with high-speed imaging technique, acoustic spectra,
and inverse filtering. J Voice 15:78–85
103. Svec JG, Schutte HK (2005) Accuracy of measurements on
digital videostroboscopic images of the vocal folds. Ann Otol
Rhinol Laryngol 114:443–450
104. Schuberth S, Hoppe U, Dollinger M, Lohscheller J, Eysholdt U
(2002) High-precision measurement of the vocal fold length and
vibratory amplitudes. Laryngoscope 112:1043–1049
105. Schuster M, Lohscheller J, Kummer P, Eysholdt U, Hoppe U
(2005) Laser projection in high-speed glottography for
high-precision measurements of laryngeal dimensions and dynamics.
Eur Arch Otorhinolaryngol 262:477–481
106. Wurzbacher T, Voigt I, Schwarz R, Dollinger M, Hoppe U,
Penne J, Eysholdt U, Lohscheller J (2008) Calibration of
laryngeal endoscopic high-speed image sequences by an automated
detection of parallel laser line projections. Med Image Anal
12:300–317
107. Schade G, Muller F (2005) Physical and technical elements of
short-interval, color-filtered double strobe flash-stroboscopy.
HNO 53:1085–1091