Predicting when someone will commit suicide has been nearly impossible (American Psychiatric Association, 2003; Goldstein, Black, Nasrallah, & Winokur, 1991; Hughes, 1995; Large & Ryan, 2014; Olav Nielssen, 2012; Paris, 2006), but classifying the factors that contribute to suicide risk is possible with standardized, clinical tools when used by well-trained clinicians (Beck, Beck, & Kovacs, 1975; Beck, Kovacs, & Weissman, 1979; Bürk, Kurz, & Möller, 1985; Columbia-Suicide Severity Rating Scale, 2015;
JOHN P. PESTIAN, Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA; MICHAEL SORTER, Division of Psychiatry, Cincinnati Children's Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA; BRIAN CONNOLLY, Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA; KEVIN BRETONNEL COHEN, Computational Bioscience Program, University of Colorado School of Medicine, Denver, CO, USA; CHERYL MCCULLUMSMITH, Department of Psychiatry, College of Medicine, University of Cincinnati, Cincinnati, OH, USA; JEFFRY T. GEE, Princeton Community Hospital, Princeton, WV, USA; LOUIS-PHILIPPE MORENCY, Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA, USA; STEFAN SCHERER, Institute for Creative Technologies, University of Southern California, Los Angeles, CA, USA.
*STM Research Group: Robert Faist, Leslie Korbee, Lisa McCord, Rachel McCourt, and Colleen Murphy.
Address correspondence to John P. Pestian, Cincinnati Children's Hospital Medical Center, University of Cincinnati, 3333 Burnet Ave., Cincinnati, OH 45229; E-mail: john.pestian@cchmc.org
2 SUICIDE THOUGHT MARKERS
Kovacs & Garrison, 1985; Müller & Dragicevic, 2003; Mundt, Greist, & Jefferson, 2013; Pokorny, 1983; Posner et al., 2008; Preston & Hansen, 2005; U.S. Food and Drug Administration, 2012; Winters, Myers, & Proud, 2002). Such tools can, however, be cumbersome and may not reliably translate into routine interactions between clinicians, caregivers, or educators. Here, we describe a novel approach using the subjects' linguistic and acoustic patterns to classify subjects automatically as either suicidal, mentally ill but not suicidal, or a control.

BACKGROUND

Efforts to understand suicide risks can be roughly clustered into traits or states. Trait analyses focus on stable characteristics rooted in and measured using biological processes (Costanza et al., 2014; Le-Niculescu et al., 2013), whereas state analyses measure dynamic characteristics like verbal and nonverbal communication, termed thought markers (Pestian et al., 2015). Machine learning and natural language processing have successfully identified differences in retrospective suicide notes, newsgroups, and social media (Gomez, 2014; Huang, Goh, & Liew, 2007; Matykiewicz, Duch, & Pestian, 2009). Jashinsky et al. (2015) used multiple annotators to identify the risk of suicide from the keywords and phrases (interrater reliability = .79) in geographically based tweets. Thompson, Poulin, and Bryan (2014) and Desmet (2014) used text-based signals to identify suicide risk, with reported performance that ranged from 60% to 90%. Li, Ng, Chau, Wong, and Yip (2013) presented a framework using machine learning to identify individuals expressing suicidal thoughts in web forums; Zhang et al. (2015) used microblog data to build machine learning models that identified suicidal bloggers with approximately 90% accuracy. Pestian, Matykiewicz, and Grupp-Phelan (2008) demonstrated that machine learning algorithms could distinguish between notes written by people who died by suicide and simulated suicide notes written by age- and gender-matched controls better than mental health professionals could (71% vs. 79%; Pestian et al., 2008). In an international shared-task setting that included multiple groups sharing the same task definition, data set, and scoring metric (Voorhees et al., 2005), 24 teams developed and tested computational algorithms to identify emotions in over 1,319 suicide notes written shortly before death. The results showed that the fusion of multiple methods outperforms single methods (Pestian, Matykiewicz, & Linn-Gust, 2012).

Suicidal thought markers have also been studied prospectively. The Suicidal Adolescent Clinical Trial (Pestian et al., 2015), the single-site precursor to this study, which used machine learning to analyze interviews with 60 suicidal and control patients, classified patients into suicidal or control groups with greater than 90% accuracy (Pestian et al., 2015). Analysis of acoustic features such as pauses and vowel spacing yielded similar results (Scherer, Morency, Gratch, Pestian, & Playa Vista, 2015; Venek, Scherer, Morency, Rizzo, & Pestian, 2014). The study described herein is novel because it uses a multisite, multicultural setting to show that machine learning algorithms can be trained to automatically identify the suicidal subjects in a group of suicidal, mentally ill, and control subjects. Moreover, the inclusion of acoustic characteristics is most helpful when distinguishing suicidal from mentally ill subjects.

METHODS

Subject Enrollment

Between October 2013 and March 2015, 379 subjects were enrolled from emergency departments (EDs) and inpatient and outpatient centers into a three-site, institutional review board-approved prospective clinical trial. One hundred twenty-six subjects were enrolled at Cincinnati Children's Hospital Medical Center (CCHMC),
PESTIAN ET AL. 3
[Figure 1: three ROC panels (upper, middle, lower); vertical axis: Sensitivity, horizontal axis: Specificity; both axes run from 0.0 to 1.0.]
Figure 1. Receiver operating characteristic (ROC) curves: suicide versus control (upper), suicide versus mentally ill (middle), and suicide versus mentally ill with control (lower). The ROC curves for adolescents (blue), adults (red), and all subjects (black) were generated using linguistic and acoustic features, where the nonsuicidal population is controls (upper), mentally ill (middle), or mentally ill and controls (lower). The gray line is the ROC curve for a baseline (random) classifier.
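
As a rough illustration of how curves like those in Figure 1 are built, the sketch below traces sensitivity against specificity as a decision threshold is lowered and then integrates the curve to obtain the area under it. The function names, labels, and scores are invented for this example and are not taken from the study; the study's own software is not described in this text.

```python
from typing import List, Sequence, Tuple


def roc_points(labels: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (specificity, sensitivity) pairs as the decision threshold is lowered.

    labels: 1 for the target class (e.g., suicidal), 0 otherwise.
    scores: higher means "more likely to belong to the target class".
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one subject from each class")

    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(1.0, 0.0)]  # strictest threshold: nobody is flagged
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Subjects with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((1.0 - fp / n_neg, tp / n_pos))
    return points


def area_under_roc(points: Sequence[Tuple[float, float]]) -> float:
    """Trapezoidal area under sensitivity plotted against (1 - specificity)."""
    area = 0.0
    for (spec0, sens0), (spec1, sens1) in zip(points, points[1:]):
        area += (spec0 - spec1) * (sens0 + sens1) / 2.0
    return area


# Hypothetical scores for six subjects (1 = suicidal, 0 = nonsuicidal).
example_labels = [1, 1, 0, 1, 0, 0]
example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(area_under_roc(roc_points(example_labels, example_scores)))

# A classifier that gives every subject the same score carries no information.
print(area_under_roc(roc_points([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5])))
```

With these invented scores, 8 of the 9 suicidal/nonsuicidal pairs are ranked correctly, so the area is about 0.89. The uninformative classifier yields 0.5, which corresponds to the gray baseline in the figure.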
quasi-open quotient (Hacki, 1989), and parabolic spectral parameters (Alku, Strik, & Vilkman, 1997); the vocal tract resonance frequencies, characterized by formants F1 through F5 (Kwon, Chan, Hao, & Lee, 2003); and pause lengths that have been correlated with depression (Cummins et al., 2015). Once a subset of all features was extracted using the COVAREP software (Degottex, Kane, Drugman, Raitio, & Scherer, 2014), the features were normalized by adjusting the measured values of each feature to a common zero-to-one scale (Dodge, 2006).

Machine Learning Algorithms. A goal of machine learning is to train a computational model from selected data that can then generalize to unseen (test) data. Machine learning can be roughly divided into three types: supervised learning, when the training data are already labelled; semisupervised learning, when only part of the training set is labelled; and unsupervised learning, when the challenge is to learn structure in unlabelled data. Here, a supervised learning support vector machine (SVM) approach was used (Schölkopf & Smola, 1998).

Support vector machines are based on a computational learning theory called structural risk minimization, whose goal is to find a hypothesis with the lowest true error (Vapnik, 1999). The SVM constructs a hyperplane in a high-dimensional space, which can be used for classification, regression, or other tasks (Press, Teukolsky, Vetterling, & Flannery, 2007). The SVM is appropriate for this study's data because it can be used on multiclass problems, and because its connection to computational learning theory enables it to be a universal learner (Joachims, 1998) while supporting conceptual frameworks such as spreading activation. Consequently, it tends to be fairly robust to overfitting (Sebastiani, 2002). The performance of the classifier was based on the area under the receiver operating characteristic curve (AROC), which was estimated using leave-one-subject-out cross-validation (Efron & Tibshirani, 1997; Molinaro, Simon, & Pfeiffer, 2005). Data were analyzed by members of the study team.

RESULTS

Subjects

A total of 955 patients were approached for enrollment. Of that group, 576 were not enrolled because they did not meet the inclusion criteria (n = 70), refused to participate (n = 436), or were excluded for other reasons (n = 70). This resulted in 379 subjects enrolled, comprising 130 suicidal patients, 126 nonsuicidal patients with mental illness, and 123 controls. Eight subject interviews were incomplete and were excluded from the final analysis. A total of 371 subjects completed the study. The patient demographic characteristics within each study site are shown in Table 1.

Comparison of the Performance of Classification Algorithms

Figure 1 and Table 2 show the performance of the machine learning algorithm in classifying subjects into suicidal and nonsuicidal subject groups. Classification performances are shown for adolescents, adults, and the combined adolescent and adult cohort. The table shows that the AROC threshold of 0.80 is met in all cases except for adults in the suicide versus mentally ill comparison. When these data are combined with adolescent data, however, the threshold is met. The table also shows that the signal from acoustic characteristics boosts the suicidal versus mentally ill classification.

Overall, the results show that machine learning algorithms can be trained to automatically identify the suicidal subjects in a group of suicidal, mentally ill, and control subjects. Moreover, the inclusion of acoustic characteristics is most helpful when distinguishing suicidal from mentally ill subjects.
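
The evaluation steps described in the Methods (zero-to-one normalization, a supervised SVM, and an AROC estimated by leaving one subject out at a time) can be sketched as follows. This is a minimal illustration under several assumptions, not the study's implementation:

- The SVM here is a plain linear one fitted by subgradient descent; the text does not give the study's kernel, software, or hyperparameters.
- Normalization ranges are learned from the training fold only.
- Held-out scores are pooled across folds before the AROC is computed.
- The toy data and all function names are invented.

```python
from typing import Iterator, List, Sequence, Tuple

Row = Sequence[float]


def min_max_fit(rows: Sequence[Row]) -> List[Tuple[float, float]]:
    """Learn each feature's (min, max) from the training rows only."""
    return [(min(col), max(col)) for col in zip(*rows)]


def min_max_apply(row: Row, ranges: Sequence[Tuple[float, float]]) -> List[float]:
    """Rescale one row to the zero-to-one scale; constant features map to 0."""
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, (lo, hi) in zip(row, ranges)]


def decision_score(w: Sequence[float], b: float, row: Row) -> float:
    """Signed distance-like score: positive favors the +1 class."""
    return sum(wj * xj for wj, xj in zip(w, row)) + b


def train_linear_svm(X: Sequence[Row], y: Sequence[int], lam: float = 0.01,
                     lr: float = 0.1, epochs: int = 300) -> Tuple[List[float], float]:
    """Linear SVM fitted by full-batch subgradient descent on the
    L2-regularized hinge loss. Labels must be +1 or -1."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * decision_score(w, b, xi) < 1.0:  # margin violated
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def leave_one_subject_out(subject_ids: Sequence[str]) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_subject, train_indices, test_indices) once per subject."""
    for held in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held]
        test = [i for i, s in enumerate(subject_ids) if s == held]
        yield held, train, test


def aroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs ranked correctly, counting ties as half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == -1]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def cross_validated_aroc(X: Sequence[Row], y: Sequence[int], subject_ids: Sequence[str]) -> float:
    held_labels, held_scores = [], []
    for _, train, test in leave_one_subject_out(subject_ids):
        ranges = min_max_fit([X[i] for i in train])  # held-out subject is never seen
        w, b = train_linear_svm([min_max_apply(X[i], ranges) for i in train],
                                [y[i] for i in train])
        for i in test:
            held_scores.append(decision_score(w, b, min_max_apply(X[i], ranges)))
            held_labels.append(y[i])
    return aroc(held_labels, held_scores)


# Invented toy data: two hypothetical features per subject, one row per subject.
toy_X = [[9.0, 1.0], [10.0, 2.0], [8.5, 1.5], [1.0, 9.0], [0.0, 8.0], [1.5, 10.0]]
toy_y = [1, 1, 1, -1, -1, -1]  # +1 = suicidal, -1 = nonsuicidal
toy_subjects = ["s1", "s2", "s3", "s4", "s5", "s6"]
print(cross_validated_aroc(toy_X, toy_y, toy_subjects))
```

Splitting by subject rather than by row matters when a subject contributes more than one recording: every row from the held-out subject stays out of training, so the estimate is not inflated by the model recognizing a speaker it has already seen.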
TABLE 1
Averages and Standard Deviations (SD) of Demographic and Psychological Rating Scale Measures for Males and Females, According to Site
Columns: Cincinnati Children's Hospital Medical Center; Princeton Community Hospital; University of Cincinnati Medical Center; All hospitals

TABLE 2
The AROC for the Machine Learning Algorithm. The Nonsuicidal Group Comprises Mentally Ill Subjects, Control Subjects, or Both

Values are ROC (SD).

Classification                              Cohort                 Linguistics   Acoustics     Linguistics + Acoustics
Suicidal versus Controls                    Adolescents            0.87 (0.04)   0.74 (0.05)   0.83 (0.05)
                                            Adults                 0.91 (0.02)   0.82 (0.03)   0.93 (0.02)
                                            Adolescents + Adults   0.93 (0.02)   0.79 (0.03)   0.92 (0.02)
Suicidal versus Mentally Ill                Adolescents            0.82 (0.05)   0.69 (0.06)   0.80 (0.05)
                                            Adults                 0.77 (0.04)   0.74 (0.04)   0.77 (0.04)
                                            Adolescents + Adults   0.79 (0.03)   0.76 (0.03)   0.82 (0.03)
Suicidal versus Mentally Ill and Controls   Adolescents            0.82 (0.04)   0.74 (0.05)   0.81 (0.04)
                                            Adults                 0.84 (0.03)   0.80 (0.03)   0.84 (0.03)
                                            Adolescents + Adults   0.87 (0.02)   0.76 (0.03)   0.87 (0.02)

TABLE 3
Sentiments Expressed During the Interviews by Group
Columns: Sentiment expressed by subject; % of suicidal interviews; % of mentally ill interviews; % of controls; ANOVA p value

DISCUSSION

We anticipated that when computational methods are applied in multiple sites, their overall accuracy would decrease. This decrease could be reduced by including a second measurement mode, such as acoustic data. Although there was a decrease from our initial study (Pestian et al., 2015), the decrease was not substantial. Moreover, the acoustic features did not play a substantial role in the initial study interview. Subsequent research, however, has shown that in some cases the acoustic features are statistically important during follow-up visits (Venek, Scherer, Morency, Rizzo, & Pestian, 2016; Venek

CONCLUSION
REFERENCES
ALKU, P., BÄCKSTRÖM, T., & VILKMAN, E. (2002). Normalized amplitude quotient for parametrization of the glottal flow. The Journal of the Acoustical Society of America, 112, 701–710.
ALKU, P., STRIK, H., & VILKMAN, E. (1997). Parabolic spectral parameter: A new method for quantification of the glottal flow. Speech Communication, 22, 67–79.
American Psychiatric Association. (2003). Practice guideline for the assessment and treatment of patients with suicidal behaviors. American Journal of Psychiatry, 160(Suppl), 1–60.
BECK, A. T., BECK, R., & KOVACS, M. (1975). Classification of suicidal behaviors: I. Quantifying intent and medical lethality. The American Journal of Psychiatry, 132, 285–287.
BECK, A. T., KOVACS, M., & WEISSMAN, A. (1979). Assessment of suicidal intention: The Scale for Suicide Ideation. Journal of Consulting and Clinical Psychology, 47, 343.
BÜRK, F., KURZ, A., & MÖLLER, H.-J. (1985). Suicide risk scales: Do they help to predict suicidal behaviour? European Archives of Psychiatry and Neurological Sciences, 235, 153–157.
CHILDERS, D. G., & LEE, C. (1991). Vocal quality factors: Analysis, synthesis, and perception. The Journal of the Acoustical Society of America, 90, 2394–2410.
Columbia-Suicide Severity Rating Scale. (2015). Training: Public health settings. Retrieved from http://www.cssrs.columbia.edu/training_cssrs.html
COSTANZA, A., D'ORTA, I., PERROUD, N., BURKHARDT, S., MALAFOSSE, A., MANGIN, P., ET AL. (2014). Neurobiology of suicide: Do biomarkers exist? International Journal of Legal Medicine, 128, 73–82.
CUMMINS, N., SCHERER, S., KRAJEWSKI, J., SCHNIEDER, S., EPPS, J., & QUATIERI, T. F. (2015). A review of depression and suicide risk assessment using speech analysis. Speech Communication, 71, 10–49.
D'ALESSANDRO, C., & STURMEL, N. (2011). Glottal closure instant and voice source analysis using time-scale lines of maximum amplitude. Sadhana, 36, 601–622.
DEGOTTEX, G., KANE, J., DRUGMAN, T., RAITIO, T., & SCHERER, S. (2014). Covarep: A collaborative voice analysis repository for speech technologies. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing (pp. 960–964), Florence, Italy.
DESMET, B. (2014, May). Finding the online cry for help. PhD diss., Universiteit Gent, Gent, Belgium.
DODGE, Y. (2006). The Oxford dictionary of statistical terms: Entry for normalization. New York: Oxford University Press.
DRUGMAN, T., & ALWAN, A. (2011). Joint robust voicing detection and pitch estimation based on residual harmonics. In Interspeech (pp. 1973–1976), Florence, Italy.
EFRON, B., & TIBSHIRANI, R. (1997). Improvements on cross-validation: The .632 + bootstrap method. Journal of the American Statistical Association, 92, 548–560.
GOLDSTEIN, R. B., BLACK, D. W., NASRALLAH, A., & WINOKUR, G. (1991). The prediction of suicide: Sensitivity, specificity, and predictive value of a multivariate model applied to suicide among 1906 patients with affective disorders. Archives of General Psychiatry, 48, 418–422.
GOMEZ, J. M. (2014). Language technologies for suicide prevention in social media. In Workshop on Natural Language Processing in the 5th Information Systems Research Working Days (pp. 21-17), Quito, Ecuador.
HACKI, T. (1989). Klassifizierung von glottisdysfunktionen mit hilfe der elektroglottographie. Folia Phoniatrica et Logopaedica, 41, 43–48.
HILLENBRAND, J., CLEVELAND, R. A., & ERICKSON, R. L. (1994). Acoustic correlates of breathy vocal quality. Journal of Speech, Language, and Hearing Research, 37, 769–778.
HUANG, Y. P., GOH, T., & LIEW, C. L. (2007). Hunting suicide notes in web 2.0-preliminary findings. In Proceedings of the 9th IEEE International Symposium on Multimedia (pp. 517–521). Taichung, Taiwan.
HUGHES, D. H. (1995). Can the clinician predict suicide? Psychiatric Services, 46, 449–451.
JASHINSKY, J., BURTON, S. H., HANSON, C. L., WEST, J., GIRAUD-CARRIER, C., BARNES, M. D., ET AL. (2015). Tracking suicide risk factors through twitter in the US. Crisis, 35, 51–59.
JOACHIMS, T. (1998). Text categorization with support vector machines: Learning with many relevant features. Berlin: Springer.
KANE, J. (2012). Tools for analysing the voice. PhD diss., Trinity College, Dublin, Ireland.
KOVACS, M., & GARRISON, B. (1985). Hopelessness and eventual suicide: A 10-year prospective study of patients hospitalized with suicidal ideation. American Journal of Psychiatry, 142, 559–563.
KWON, O.-W., CHAN, K., HAO, J., & LEE, T.-W. (2003). Emotion recognition by speech signals. In Eurospeech (pp. 125–128), Geneva, Switzerland.
LARGE, M., & RYAN, C. (2014). Suicide risk assessment: Myth and reality. International Journal of Clinical Practice, 68, 679–681.
LE-NICULESCU, H., LEVEY, D., AYALEW, M., PALMER, L., GAVRIN, L., JAIN, N., ET AL. (2013). Discovery and validation of blood biomarkers for suicidality. Molecular Psychiatry, 18, 1249–1264.
LI, T. M., NG, B. C., CHAU, M., WONG, P. W., & YIP, P. S. (2013). Collective intelligence for suicide surveillance in web forums. In Intelligence and security informatics (pp. 29–37). Berlin: Springer.
MATYKIEWICZ, P., DUCH, W., & PESTIAN, J. (2009). Clustering semantic spaces of suicide notes and newsgroups articles. In Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing (pp. 179–184). Stroudsburg, PA: Association for Computational Linguistics.
MOLINARO, A., SIMON, R., & PFEIFFER, R. (2005). Prediction error estimation: A comparison of resampling methods. Bioinformatics, 21, 3301–3307.
MÜLLER, M. J., & DRAGICEVIC, A. (2003). Standardized rater training for the Hamilton Depression Rating Scale (HAMD-17) in psychiatric novices. Journal of Affective Disorders, 77, 65–69.
MUNDT, J. C., GREIST, J. H., & JEFFERSON, J. W. (2013). Original research prediction of suicidal behavior in clinical research by. Journal of Clinical Psychiatry, 74, 887–893.
NEWMAN, M. L., PENNEBAKER, J. W., BERRY, D. S., & RICHARDS, J. M. (2003). Lying words: Predicting deception from linguistic styles. Personality and Social Psychology Bulletin, 29, 665–675.
OLAV NIELSSEN, M. (2012). Suicidal ideation and later suicide. The American Journal of Psychiatry, 169, 662.
PARIS, J. (2006). Predicting and preventing suicide: Do we know enough to do either? Harvard Review of Psychiatry, 14, 233–240.
PESTIAN, J. P. (2010). A conversation with Edwin Shneidman. Suicide and Life-Threatening Behavior, 40, 516–523.
PESTIAN, J. P., GRUPP-PHELAN, J., BRETONNEL COHEN, K., MEYERS, G., RICHEY, L. A., MATYKIEWICZ, P., ET AL. (2015). A controlled trial using natural language processing to examine the language of suicidal adolescents in the emergency department. Suicide and Life-Threatening Behavior, 46, 154–159.
PESTIAN, J. P., MATYKIEWICZ, P., & GRUPP-PHELAN, J. (2008). Using natural language processing to classify suicide notes. In Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing (pp. 96–97). Stroudsburg, PA: Association for Computational Linguistics.
PESTIAN, J. P., MATYKIEWICZ, P., & LINN-GUST, M. (2012). What's in a note: Construction of a suicide note corpus. Biomedical Informatics Insights, 5, 1.
POKORNY, A. D. (1983). Prediction of suicide in psychiatric patients: Report of a prospective study. Archives of General Psychiatry, 40, 249–257.
POSNER, K., BRENT, D., LUCAS, C., GOULD, M., STANLEY, B., BROWN, G., ET AL. (2008). Columbia-Suicide Severity Rating Scale (C-SSRS). New York, NY: New York State Psychiatric Institute.
PRESS, W. H., TEUKOLSKY, S. A., VETTERLING, W. T., & FLANNERY, B. (2007). Section 16.5. Support vector machines. In Numerical recipes: The art of scientific computing (3rd ed.). New York: Cambridge University Press.
PRESTON, E., & HANSEN, L. (2005). A systematic review of suicide rating scales in schizophrenia. Crisis, 26, 170–180.
SCHERER, S., MORENCY, L.-P., GRATCH, J., PESTIAN, J., & PLAYA VISTA, C. (2015). Reduced vowel space is a robust indicator of psychological distress: A crosscorpus analysis. In IEEE International Conference on Acoustics, Speech, and Signal Processing (pp. 4789–4793), Brisbane, Australia.
SCHÖLKOPF, B., & SMOLA, A. (1998). Support vector machines. In Encyclopedia of biostatistics. New York, NY: John Wiley & Sons.
SEBASTIANI, F. (2002, March). Machine learning in automated text categorization. ACM Computing Surveys, 34, 1–47.
THOMPSON, P., POULIN, C., & BRYAN, C. J. (2014). Predicting military and veteran suicide risk: Cultural aspects. In Proceedings from the Workshop on Computational Linguistics and Clinical Psychology (pp. 1–6), Baltimore, MD.
U.S. Food and Drug Administration. (2012). Guidance for industry: Suicidal ideation and behavior: Prospective assessment of occurrence in clinical trials. Silver Spring, MD: Author. Retrieved October 5, 2012, from http://www.fda.gov/Drugs/GuidanceComplianceRegulatoryInformation/htm.
VAPNIK, V. N. (1999). An overview of statistical learning theory. IEEE Transactions on Neural Networks, 10, 988–999.
VENEK, V., SCHERER, S., MORENCY, L.-P., RIZZO, A., & PESTIAN, J. (2014). Adolescent suicidal risk assessment in clinician-patient interaction: A study of verbal and acoustic behaviors. In Proceedings of the IEEE Spoken Language Technology Workshop (pp. 277–282), Lake Tahoe, Nevada.
VENEK, V., SCHERER, S., MORENCY, L.-P., RIZZO, A., & PESTIAN, J. (2016). Adolescent suicidal risk assessment in clinician-patient interaction. IEEE Transactions on Affective Computing.
VOORHEES, E. M., HARMAN, D. K., ET AL. (2005). TREC: Experiment and evaluation in information retrieval. Cambridge: MIT Press.
WINTERS, N. C., MYERS, K., & PROUD, L. (2002). Ten-year review of rating scales. III: Scales assessing suicidality, cognitive style, and self-esteem. Journal of the American Academy of Child & Adolescent Psychiatry, 41, 1150–1181.
ZHANG, L., HUANG, X., LIU, T., LI, A., CHEN, Z., & ZHU, T. (2015). Using linguistic features to estimate suicide probability of Chinese microblog users. In Human Centered Computing (pp. 549–559). Cham, Switzerland: Springer.

Manuscript Received: January 17, 2016
Revision Accepted: June 28, 2016

SUPPORTING INFORMATION

Additional Supporting Information may be found in the online version of this article:
Appendix S1.