SLTB 12312

Suicide and Life-Threatening Behavior 1
2016 The American Association of Suicidology

DOI: 10.1111/sltb.12312
A Machine Learning Approach to Identifying

the Thought Markers of Suicidal Subjects: A
Prospective Multicenter Trial
JOHN P. PESTIAN, PHD, MICHAEL SORTER, MD, BRIAN CONNOLLY, PHD, KEVIN
BRETONNEL COHEN, PHD, CHERYL MCCULLUMSMITH, MD, PHD, JEFFRY T. GEE, MD,
LOUIS-PHILIPPE MORENCY, PHD, STEFAN SCHERER, PHD, AND LESLEY ROHLFS, MS, FOR
THE STM RESEARCH GROUP*
Death by suicide demonstrates profound personal suffering and societal

failure. While basic sciences provide the opportunity to understand biological
markers related to suicide, computer science provides opportunities to under-
stand suicide thought markers. In this novel prospective, multimodal, multicen-
ter, mixed demographic study, we used machine learning to measure and fuse
two classes of suicidal thought markers: verbal and nonverbal. Machine learning
algorithms were used with the subjects words and vocal characteristics to clas-
sify 379 subjects recruited from two academic medical centers and a rural com-
munity hospital into one of three groups: suicidal, mentally ill but not suicidal,
or controls. By combining linguistic and acoustic characteristics, subjects could
be classified into one of the three groups with up to 85% accuracy. The results
provide insight into how advanced technology can be used for suicide assess-
ment and prevention.
Predicting when someone will commit sui- that contribute to suicide risk is possible with
cide has been nearly impossible (American standardized, clinical tools when used by
Psychiatric Association, 2003; Goldstein, well-trained clinicians (Beck, Beck, &
Black, Nasrallah, & Winokur, 1991; Hughes, Kovacs, 1975; Beck, Kovacs, & Weissman,
1995; Large & Ryan, 2014; Olav Nielssen, 1979; Burk, Kurz, & M oller, 1985; Colum-
2012; Paris, 2006), but classifying the factors bia-Suicide Severity Rating Scale, 2015;
JOHN P. PESTIAN, Division of Biomedical OH, USA; JEFFRY T. GEE, Princeton Commu-
Informatics, Cincinnati Childrens Hospital nity Hospital, Princeton, WV, USA; LOUIS-
Medical Center, University of Cincinnati, PHILIPPE MORENCY, Language Technologies
Cincinnati, OH, USA; MICHAEL SORTER, Divi- Institute, Carnegie Mellon University, Pitts-
sion of Psychiatry, Cincinnati Childrens Hospi- burgh, PA, USA; STEFAN SCHERER, Institute for
tal Medical Center, University of Cincinnati, Creative Technologies, University of Southern
Cincinnati, OH, USA; BRIAN CONNOLLY, Divi- California, Los Angeles, CA, USA.
sion of Biomedical Informatics, Cincinnati Chil- *STM Research Group: Robert Faist,
drens Hospital Medical Center, University of Leslie Korbee, Lisa McCord, Rachel McCourt,
Cincinnati, Cincinnati, OH, USA; KEVIN and Colleen Murphy.
BRETONNEL COHEN, Computational Bioscience Address correspondence to John P.
Program, University of Colorado School of Pestian, Cincinnati Childrens Hospital Medical
Medicine, Denver, CO, USA; CHERYL MCCUL- Center, University of Cincinnati, 3333 Burnet
LUMSMITH, Department of Psychiatry, College of Ave., Cincinnati, OH 45229; E-mail: john.
Medicine, University of Cincinnati, Cincinnati, pestian@cchmc.org
2 SUICIDE THOUGHT MARKERS
Kovacs & Garrison, 1985; M uller & Drag- died by suicide and simulated suicide notes
icevic, 2003; Mundt, Greist, & Jefferson, written by age- and gender-matched con-
2013; Pokorny, 1983; Posner et al., 2008; trols better than mental health professionals
Preston & Hansen, 2005; U.S. Food & Drug could (71% vs. 79%; Pestian et al., 2008).
Administration, 2012; Winters, Myers, & In an international, shared task-setting that
Proud, 2002). Such tools can, however, be included multiple groups sharing the same
cumbersome and may not reliably translate task definition, data set, and scoring metric
into routine interactions between clinicians, (Voorhees et al., 2005), 24 teams developed
caregivers, or educators. Here, we describe a and tested computational algorithms to
novel approach using the subjects linguistic identify emotions in over 1,319 suicide
and acoustic patterns to classify subjects notes written shortly before death. The
automatically as either suicidal, mentally ill results showed that the fusion of multiple
but not suicidal, or a control. methods outperform single methods (Pes-
tian, Matykiewicz, & Linn-Gust, 2012).
Suicidal thought markers have also
BACKGROUND been studied prospectively. The Suicidal
Adolescent Clinical Trial (Pestian et al.,
Efforts to understand suicide risks 2015), the single-site precursor to this study,
can be roughly clustered into traits or which used machine learning to analyze
states. Trait analyses focus on stable charac- interviews with 60 suicidal and control
teristics rooted in and measured using bio- patients, classified patients into suicidal or
logical processes (Costanza et al., 2014; Le- control groups with greater than 90% accu-
Niculescu et al., 2013), whereas state analy- racy (Pestian et al., 2015). Analysis of acous-
ses measure dynamic characteristics like ver- tic features such as pauses and vowel spacing
bal and nonverbal communication, termed yielded similar results (Scherer, Morency,
thought markers (Pestian et al., 2015). Gratch, Pestian, & Playa Vista, 2015;
Machine learning and natural language pro- Venek, Scherer, Morency, Rizzo, & Pestian,
cessing have successfully identified differ- 2014). The study described herein is novel
ences in retrospective suicide notes, because it uses a multisite, multicultural set-
newsgroups, and social media (Gomez, ting to show that machine learning algo-
2014; Huang, Goh, & Liew, 2007; Matykie- rithms can be trained to automatically
wicz, Duch, & Pestian, 2009). Jashinsky identify the suicidal subjects in a group of
et al. (2015) used multiple annotators to suicidal, mentally ill, and control subjects.
identify the risk of suicide from the key- Moreover, the inclusion of acoustic charac-
words and phrases (interrater reliabil- teristics is most helpful when classifying
ity = .79) in geographically based tweets. between suicidal and mentally ill subjects.
Thompson, Poulin, and Bryan (2014) and
Desmet (2014) used text-based signals to
identify suicide risk that ranged from 60% METHODS
to 90%. Li, Ng, Chau, Wong, and Yip
(2013) presented a framework using Subject Enrollment
machine learning to identify individuals
expressing suicidal thoughts in web forums; Between October 2013 and March
Zhang et al. (2015) used microblog data to 2015, 379 subjects were enrolled from
build machine learning models that identi- emergency departments (EDs) and inpatient
fied suicidal bloggers with approximately and outpatient centers into a three-site,
90% accuracy. Pestian, Matykiewicz, and internal review board-approved prospective
Grupp-Phelan (2008) demonstrated that clinical trial. One hundred twenty-six
machine learning algorithms could distin- subjects were enrolled at Cincinnati Chil-
guish between notes written by people who drens Hospital Medical Center (CCHMC),
PESTIAN ET AL. 3
a 600-bed urban level 1 academic medical Data Collection

center. One hundred twenty-eight subjects
were enrolled at the University of Cincin- Data were collected and validated by
nati Medical Center (UC), a 498-bed urban trained mental health professionals. During
academic medical center. One hundred enrollment, each subject completed stan-
twenty-five subjects were enrolled at dardized tools: Columbia-Suicide Severity
Princeton Community Hospital (PCH), a Rating Scale, Young Mania Rating Scale,
267-bed Appalachian community hospital in and Hamilton Rating Scale for Depression.
southern West Virginia. The inclusion and Each subject also completed the ubiquitous
exclusion criteria classified subjects into one questionnaire (UQ), a semistructured inter-
of three groups: suicidal, mentally ill, or view with five open-ended questions to
control. stimulate conversation for language sam-
Multi-gated inclusion criteria were pling: Do you have hope? Do you have
used. For the first gate, all patients were any fear? Do you have any secrets? Are
reviewed using the electronic status board for you angry? and Does it hurt emotion-
a complaint of suicide, suicidal ideation, or ally? (Pestian, 2010; Pestian et al., 2015).
psychiatric evaluation. For patients with Both subject and interviewer were video
mental illness and for control subjects, any and audio recorded during the UQs. The
complaint was accepted except those related results were transcribed with 98% accuracy
to suicide. For those who passed the first based on completeness, accuracy of tran-
gate, a second review of their electronic med- scription, and adherence to the transcrip-
ical record (EMR) was conducted before they tion guidelines.
were approached for enrollment. Suicidal
subjects were approached if they had come to Computational Analysis
the EDs or psychiatric units because of suici-
dal ideation or attempts within the previous Linguistic and Acoustic Feature Extrac-
24 hours. Patients with mental illness were tion. Our analysis is based on two types of
enrolled from the ED and outpatient mental features: linguistic and acoustic. Feature
health clinics if they had a definitive mental extraction was performed automatically
illness diagnosis but had not had prior suici- using the patient audio signals recorded
dal attempts, active thoughts of suicide, or during the interviews and their transcrip-
plans to die by suicide within the previous tions. Linguistically, we followed related
year as reported by the patient and EMR. work on automatic identification and
Control subjects were patients who came to extraction of word instances (unigrams) and
the ED with no history of mental health diag- word-pair instances (bi-grams) from the
noses or suicidal ideation, as reported by the transcriptions. A dictionary that includes all
patient and EMR (Figure 1). spoken words, word-pairs, and acoustic
Potential subjects were excluded if characteristics was created.
their native language was not English, if The selected vocal and prosodic
they had any serious medical injury or men- characteristics include: vocal dynamics
tal retardation that could prohibit consent, fundamental frequency (f0; Drugman &
if they would be unavailable for a follow-up Alwan, 2011) and square of the amplitude
interview, or if they could not comply with (A2); voice qualityharmonic richness fac-
study procedures. tor (Childers & Lee, 1991), maximum dis-
Participation incentives were site-spe- persion quotient (Kane, 2012), peak slope
cific. CCHMC and UC subjects were paid (DAlessandro & Sturmel, 2011), difference
$50 for the initial interview and $25 for the between the first and second harmonics
follow-up. PCH subjects were paid $25 for (Hillenbrand, Cleveland, & Erickson,
the initial interview and $25 for the follow- 1994), normalized amplitude quotient
up interview. (Alku, Backstr om, & Vilkman, 2002),
1.0
0.8
Sensitivity
0.6
0.4
0.2
0.0
1.0 0.8 0.6 0.4 0.2 0.0
Specificity
1.0
0.8
Sensitivity
0.6
0.4
0.2
0.0
1.0 0.8 0.6 0.4 0.2 0.0
Specificity
1.0
0.8
Sensitivity
0.6
0.4
0.2
0.0
1.0 0.8 0.6 0.4 0.2 0.0
Specificity
Figure 1. Receiver operator curve (ROC): suicide versus control (upper), suicide versus mentally ill (middle), and
suicide versus mentally ill with control. The ROC curves for adolescents (blue), adults (red), and all subjects (black)
generated where the nonsuicidal population is controls (top), mentally ill (middle), and mentally ill and controls,
using linguistic and acoustic features. The gray line is the AROC curve for a baseline (random) classifier.
PESTIAN ET AL. 5
quasi-open quotient (Hacki, 1989), and Tibshirani, 1997; Molinaro, Simon, &
parabolic spectral parameters (Alku, Strik, Pfeiffer, 2005). Data were analyzed by
& Vilkman, 1997); the vocal tract reso- members of the study team.
nance frequencies, as well as pause lengths
characterized by formants F1 through F5
(Kwon, Chan, Hao, & Lee, 2003); and RESULTS
pause lengths that have been correlated
with depression (Cummins et al., 2015). Subjects
Once a subset of all features were
extracted using the COVAREP software A total of 955 patients were approached
(Degottex, Kane, Drugman, Raitio, & for enrollment. Of that group, 576 subjects
Scherer, 2014), they were normalized by did not meet the inclusion criteria (n = 70),
adjusting the measured values from the refused to participate (n = 436), or were
various features to a common zero-to-one excluded for other reasons (n = 70). This
scale (Dodge, 2006). resulted in 379 subjects enrolled, comprising
Machine Learning Algorithms. A goal 130 suicidal patients, 126 nonsuicidal
of machine learning is to train a computa- patients with mental illness, and 123 con-
tional model from selected data that can trols. Eight subject interviews were incom-
then generalize to unseen (test) data. plete and were excluded from the final
Machine learning can be roughly divided analysis. A total of 371 subjects completed
into three types: supervised learning, when the study. The patient demographic charac-
the training data are already labelled; semi- teristics within each study site are shown in
supervised learning, when only part of the Table 1.
training set is labelled; and unsupervised
learning, when the challenge is to learn Comparison of the Performance of
structure in unlabelled data. Here, a super- Classification Algorithms
vised learning support vector machine
(SVM) approach was used (Sch olkopf & Figure 1 and Table 2 show the per-
Smola, 1998). formance of the machine learning algorithm
Support vector machines are based in classifying subjects into suicidal and non-
on a computational learning theory called suicidal subject groups. Classification per-
structural risk minimization, whose goal is formances are shown for adolescents,
to find a hypothesis with the lowest true adults, and the combined adolescent and
error (Vapnik, 1999). The SVM constructs adult cohort. The table shows that the
a hyperplane in a high-dimensional space, ROC threshold of 0.80 is met in all cases,
which can be used for classification, regres- except adults in the adult suicide versus
sion, or other tasks (Press, Teukolsky, Vet- mentally ill comparison. When these data
terling, & Flannery, 2007). The SVM is are combined with adolescent data, how-
appropriate for this studys data because it ever, the threshold is met. The table also
can be used on multiple class problems, and shows that the signal from acoustic charac-
because its connection to computational teristics boosts the suicidal versus mentally
learning enables it to be a universal learner ill classification.
(Joachims, 1998) while in support of con- Overall, the results show that
ceptual frameworks, such as spreading acti- machine learning algorithms can be trained
vation. Consequently, it tends to be fairly to automatically identify the suicidal sub-
robust to overfitting (Sebastiani, 2002). The jects in a group of suicidal, mentally ill, and
performance of the classifier was based on control subjects. Moreover, the inclusion of
the area under the receiver operating acoustic characteristics is most helpful when
curve, which was estimated using leave-one- classifying between suicidal and mentally ill
out by subject cross-validation (Efron & subjects.
6
TABLE 1
Averages and Standard Deviations (SD) of Demographic and Psychological Rating Scale Measures for Males and Females, According to Site
Cincinnati Childrens Hospital
Medical Center Princeton Community Hospital University of Cincinnati Medical Center
Male Female Total Male Female Total Male Female Total

(SD) (SD) p value (SD) (SD) (SD) p value (SD) (SD) (SD) p value (SD)
n 39 83 <0.0001 122 46 77 0.007 123 59 67 0.53 126

Average 15.9 15.5 0.08 15.6 41.0 43.0 0.38 42.2 42.1 43.1 0.67 42.6
age in (1.1) (1.3) (1.3) (12.1) (12.3) (12.3) (12.9) (13.2) (13.2)
years
Average 8.4 7.2 0.06 7.6 6.7 6.6 0.90 6.7 8.8 10.8 0.03 9.9
time of (3.3) (3.0) (3.2) (4.7) (4.0) (4.2) (4.6) (5.4) (5.2)
interview
in minutes
Patients 74 154 <0.0001 228 204 326 <0.0001 530 94 103 0.57 197
approached
All hospitals
Male (SD) Female (SD) p value Total (SD)
n 144 227 <0.0001 371

Average age in years 34.6 (15.7) 33.0 (16.7) 0.35 33.6 (16.4)
Average time of interview in minutes 8.1 (4.5) 8.1 (4.5) 1.0 8.1 (4.5)
Patients approached 372 583 <0.0001 955
SUICIDE THOUGHT MARKERS
PESTIAN ET AL. 7
DISCUSSION
Adolescents +
Suicidal versus Mentally Ill and Controls
0.87 (0.02)
0.76 (0.03)
0.87 (0.02)
ROC (SD)
Adults
We anticipated that when computation
The AROC for the Machine Learning Algorithm. The Nonsuicidal Group Comprises of Either Mentally Ill and Control Subjects. Classification methods are applied in multiple sites, their
overall accuracy decreases. This decrease
could be reduced by including a second
ROC (SD)
0.84 (0.03)
0.80 (0.03)
0.84 (0.03)
measurement mode, such as acoustic data.
Adults
Although there was a decrease from our ini-
tial study (Pestian et al., 2015), the decrease
was not substantial. Moreover, the acoustic
features did not play a substantial role in the
Adolescents
ROC (SD)
0.82 (0.04)
0.74 (0.05)
0.81 (0.04)
initial study interview. Subsequent research,
however, has shown that in some cases the
acoustic features are statistically important
during follow-up visits (Venek, Scherer,
Morency, Rizzo, & Pestian, 2016; Venek
Adolescents +
ROC (SD)
0.79 (0.03)
0.76 (0.03)
0.82 (0.03)
et al., 2014). From the results of other stud-

Adults
ies, this was unexpected. Suggesting that

Performances are Shown for Adolescents, Adults, and the Combined Adolescent and Adult Cohorts
Suicidal versus Mentally Ill
additional research devoted to fusing nonver-

bal sentiment data with verbal sentiment data
is still needed.
ROC (SD)
0.77 (0.04)
0.74 (0.04)
0.77 (0.04)
Clinicians may ask: Were there any

Adults
differences in what the subjects said?

Table 3 shows that the mentally ill and the
control patients tended to laugh more dur-
ing interviews, sigh less, and express less
Adolescents
ROC (SD)
0.82 (0.05)
0.69 (0.06)
0.80 (0.05)
anger, less emotional pain, and more hope.

The studys sample was based on the
subjects self-reports, which means that
some patients could have been disingenu-
ous. But, using measures of authenticity, no
Adolescents +
ROC (SD)
0.93 (0.02)
0.79 (0.03)
0.92 (0.02)
differences were found between each of the

Adults
groups (Newman, Pennebaker, Berry, &

Richards, 2003).
Suicidal versus Controls
ROC (SD)
0.91 (0.02)
0.82 (0.03)
0.93 (0.02)
CONCLUSION
Adults
This studys methodology presents

strong evidence for a useful objective tool
that clinicians and others can use to deter-
Adolescents
ROC (SD)
0.87 (0.04)
0.74 (0.05)
0.83 (0.05)
mine suicidal intention. These computational

approaches may provide novel opportunities
for large-scale innovations in suicidal care.
The methodology described here can be
readily translated to such settings as school,
Linguistics +
shelters, youth clubs, juvenile justice centers,

Linguistics
Acoustics
TABLE 2
Acoustics
and community centers, where earlier identi-

fication may help to reduce suicide attempts
and deaths.
TABLE 3
Sentiments Expressed During the Interviews by Group
Sentiment expressed % of suicidal % of mentally ANOVA
by subject interviews ill interviews % of controls p value
Laughed during interview 39.4 69.1 71.9 <0.0001

Responded negatively to Are 47.2 67.5 86.0 <0.0001
you angry?
Laughed during response to 12.6 35.0 41.3 <0.0001
anger question
Sighed during the interview 14.2 0.81 1.65 0.05
Responded negatively to 40.9 66.7 70.2 <0.0001
Does it hurt emotionally?
Used the word hope in 48.0 72.4 75.2 <0.0001
response to the question Do
you have hope?
REFERENCES

ALKU, P., BACKSTR , T., & VILKMAN, E.
OM ET AL. (2014). Neurobiology of suicide: Do
(2002). Normalized amplitude quotient for biomarkers exist? International Journal of Legal
parametrization of the glottal flow. The Journal Medicine, 128, 7382.
of the Acoustical Society of America, 112, 701710. CUMMINS, N., SCHERER, S., KRAJEWSKI, J.,
ALKU, P., STRIK, H., & VILKMAN, E. SCHNIEDER, S., EPPS, J., & QUATIERI, T. F.
(1997). Parabolic spectral parameterA new (2015). A review of depression and suicide risk
method for quantification of the glottal flow. assessment using speech analysis. Speech Commu-
Speech Communication, 22, 6779. nication, 71, 1049.
American Psychiatric Association. (2003). DALESSANDRO, C., & STURMEL, N.
Practice guideline for the assessment and treat- (2011). Glottal closure instant and voice source
ment of patients with suicidal behaviors. Ameri- analysis using time-scale lines of maximum
can Journal of Psychiatry, 160(Suppl), 160. amplitude. Sadhana, 36, 601622.
BECK, A. T., BECK, R., & KOVACS, M. DEGOTTEX, G., KANE, J., DRUGMAN, T.,
(1975). Classification of suicidal behaviors: I. RAITIO, T., & SCHERER, S. (2014). CovarepA
Quantifying intent and medical lethality. The collaborative voice analysis repository for speech
American Journal of Psychiatry, 132, 28587. technologies. In Proceedings of the International
BECK, A. T., KOVACS, M., & WEISSMAN, Conference on Acoustics, Speech and Signal Processing
A. (1979). Assessment of suicidal intention: The (pp. 960964), Florence, Italy.
Scale for Suicide Ideation. Journal of Consulting DESMET, B. (2014, May). Finding the online
and Clinical Psychology, 47, 343. cry for help, PhD diss., Universiteit Gent, Gent,

BURK , F., KURZ, A., & MOLLER , H.-J. Belgium.
(1985). Suicide risk scales: Do they help to pre- DODGE, Y. (2006). The Oxford dictionary of
dict suicidal behaviour? European Archives of Psy- statistical terms: Entry for normalization. New
chiatry and Neurological Sciences, 235, 153157. York: Oxford University Press.
CHILDERS, D. G., & LEE, C. (1991). Vocal DRUGMAN, T., & ALWAN, A. (2011). Joint
quality factors: Analysis, synthesis, and percep- robust voicing detection and pitch estimation
tion. The Journal of the Acoustical Society of Amer- based on residual harmonics. In Interspeech (pp.
ica, 90, 23942410. 19731976), Florence, Italy.
Columbia-Suicide Severity Rating Scale. EFRON, B., & TIBSHIRANI, R. (1997).
(2015) TrainingPublic health settings. Retrieved Improvements on cross-validation: The .632 +
from http://www.cssrs.columbia.edu/training_cssrs. bootstrap method. Journal of the American Statis-
html tical Association, 92, 548560.
COSTANZA, A., DORTA, I., PERROUD, N., GOLDSTEIN, R. B., BLACK, D. W., NASRAL-
BURKHARDT, S., MALAFOSSE, A., MANGIN, P., LAH, A., & WINOKUR, G. (1991). The prediction
PESTIAN ET AL. 9
of suicide: Sensitivity, specificity, and predictive the Workshop on Current Trends in Biomedical
value of a multivariate model applied to suicide Natural Language Processing (pp. 179184).
among 1906 patients with affective disorders. Stroudsburg, PA: Association for Computational
Archives of General Psychiatry, 48, 418422. Linguistics.
GOMEZ, J. M. (2014). Language technolo- MOLINARO, A., SIMON, R., & PFEIFFER, R.
gies for suicide prevention in social media. In (2005). Prediction error estimation: A compar-
Workshop on Natural Language Processing in the ison of resampling methods. Bioinformatics, 21,
5th Information Systems Research Working Days, 33013307.
(pp. 21-17), Quito, Ecuador.
MULLER , M. J., & DRAGICEVIC, A. (2003).
HACKI, T. (1989). Klassifizierung von Standardized rater training for the Hamilton
glottisdysfunktionen mit hilfe der elektroglot- Depression Rating Scale (HAMD-17) in psychi-
tographie. Folia Phoniatrica et Logopaedica, 41, 43 atric novices. Journal of Affective Disorders, 77,
48. 6569.
HILLENBRAND, J., CLEVELAND, R. A., & MUNDT, J. C., GREIST, J. H., & JEFFER-
ERICKSON, R. L. (1994). Acoustic correlates of SON, J. W. (2013). Original research prediction
breathy vocal quality. Journal of Speech, Language, of suicidal behavior in clinical research by. Jour-
and Hearing Research, 37, 769778. nal of Clinical Psychiatry, 74, 887893.
HUANG, Y. P., GOH, T., & LIEW, C. L. NEWMAN, M. L., PENNEBAKER, J. W.,
(2007). Hunting suicide notes in web 2.0-preli- BERRY, D. S., & RICHARDS, J. M. (2003). Lying
minary findings. In Proceedings of the 9th IEEE words: Predicting deception from linguistic
International Symposium on Multimedia (pp. 517 styles. Personality and Social Psychology Bulletin, 29,
521). Taichung, Tiawan. 665675.
HUGHES, D. H. (1995). Can the clinician OLAV NIELSSEN, M. (2012). Suicidal idea-
predict suicide? Psychiatric Services, 46, 449451. tion and later suicide. The American Journal of
JASHINSKY, J., BURTON, S. H., HANSON, C. Psychiatry, 169, 662.
L., WEST, J., GIRAUD-CARRIER, C., BARNES, M. PARIS, J. (2006). Predicting and preventing
D., ET AL. (2015). Tracking suicide risk factors suicide: Do we know enough to do either? Har-
through twitter in the US. Crisis, 35, 5159. vard Review of Psychiatry, 14, 233240.
JOACHIMS, T. (1998). Text categorization PESTIAN, J. P. (2010). A conversation with
with support vector machines: Learning with many Edwin Shneidman. Suicide and Life-Threatening
relevant features. Berlin: Springer. Behavior, 40, 516523.
KANE, J. (2012). Tools for analysing the PESTIAN, J. P., GRUPP-PHELAN, J., BRETON-
voice. PhD diss., Trinity College, Dublin, NEL COHEN, K., MEYERS, G., RICHEY, L. A.,
Ireland. MATYKIEWICZ, P., ET AL. (2015). A controlled
KOVACS, M., & GARRISON, B. (1985). trial using natural language processing to exam-
Hopelessness and eventual suicide: A 10-year ine the language of suicidal adolescents in the
prospective study of patients hospitalized with emergency department. Suicide and Life-Threaten-
suicidal ideation. American Journal of Psychiatry, ing Behavior, 46, 154159.
142, 559563. PESTIAN, J. P., MATYKIEWICZ, P., &
KWON, O.-W., CHAN, K., HAO, J., & LEE, GRUPP-PHELAN, J. (2008). Using natural language
T.-W. (2003). Emotion recognition by speech processing to classify suicide notes. In Proceedings
signals. In. Eurospeech (pp. 125128), Geneva, of the Workshop on Current Trends in Biomedical
Switzerland. Natural Language Processing (pp. 9697). Strouds-
LARGE, M., & RYAN, C. (2014). Suicide burg, PA: Association for Computational Lin-
risk assessment: Myth and reality. International guistics.
Journal of Clinical Practice, 68, 679681. PESTIAN, J. P., MATYKIEWICZ, P., & LINN-
LE-NICULESCU, H., LEVEY, D., AYALEW, GUST, M. (2012). Whats in a note: Construction
M., PALMER, L., GAVRIN, L., JAIN, N., ET AL. of a suicide note corpus. Biomedical Informatics
(2013). Discovery and validation of blood Insights, 5, 1.
biomarkers for suicidality. Molecular Psychiatry, POKORNY, A. D. (1983). Prediction of sui-
18, 12491264. cide in psychiatric patients: Report of a prospective
LI, T. M., NG, B. C., CHAU, M., WONG, study. Archives of General Psychiatry, 40, 249257.
P. W., & YIP, P. S. (2013). Collective intelli- POSNER, K., BRENT, D., LUCAS, C.,
gence for suicide surveillance in web forums. In GOULD, M., STANLEY, B., BROWN, G., ET AL.
Intelligence and security informatics (pp. 2937). (2008). Columbia-Suicide Severity Rating Scale (C-
Berlin: Springer. SSRS). New York, NY: New York State Psychi-
MATYKIEWICZ, P., DUCH, W., & PESTIAN, atric Institute.
J. (2009). Clustering semantic spaces of suicide PRESS, W. H., TEUKOLSKY, S. A., VETTER-
notes and newsgroups articles. In Proceedings of LING, W. T., & FLANNERY, B. (2007).
Section 16.5. Support vector machines. In suicidal risk assessment in clinician-patient inter-
Numerical recipes: The art of scientific computing, action: A study of verbal and acoustic behaviors.
(3rd ed.). New York: Cambridge University In Proceedings of the IEEE Spoken Language Tech-
Press. nology Workshop, (pp. 277282), Lake Tahoe,
PRESTON, E., & HANSEN, L. (2005). A sys- Nevada.
tematic review of suicide rating scales in shi- VENEK, V., SCHERER, S., MORENCY, L.-P.,
zophrenia. Crisis, 26, 170180. RIZZO, A., & PESTIAN, J. (2016). Adolescent suici-
SCHERER, S., MORENCY, L.-P., GRATCH, J., dal risk assessment in clinician-patient interac-
PESTIAN, J., & PLAYA VISTA, C. (2015). Reduced tion. In IEEETransactions on Affective Computing.
vowel space is a robust indicator of psychological VOORHEES, E. M., HARMAN, D. K., ET AL.
distress: A crosscorpus analysis. In IEEE Interna- (2005). TREC: Experiment and evaluation in infor-
tional Conference on Acoustics, Speech, and Signal mation retrieval. Cambridge: MIT Press.
Processing. (pp. 47894793), Brisbane, Australia. WINTERS, N. C., MYERS, K., & PROUD, L.

SCHOLKOPF , B., & SMOLA, A. (1998). Sup- (2002). Ten-year review of rating scales. III:
port vector machines. In Encyclopedia of biostatistics Scales assessing suicidality, cognitive style, and
New York, NY: John Willey & Sons. self-esteem. Journal of the American Academy of
SEBASTIANI, F. (2002 March). Machine Child & Adolescent Psychiatry, 41, 11501181.
learning in automated text categorization. ACM ZHANG, L., HUANG, X., LIU, T., LI, A.,
Computing Surveys, 34, 147. CHEN, Z., & ZHU, T. (2015). Using linguistic
THOMPSON, P., POULIN, C., & BRYAN, C. features to estimate suicide probability of
J. (2014). Predicting military and veteran suicide Chinese microblog users. In Human Centered
risk: Cultural aspects. In Proceedings from the Computing (pp. 549559). Cham, Switzerland:
Workshop on Computational Linguistics and Clinical Springer.
Psychology (pp. 16) Baltimore, MD.
U.S. Food and Drug Administration. Manuscript Received: January 17, 2016
(2012). Guidance for industry: Suicidal ideation and Revision Accepted: June 28, 2016
behavior: Prospective assessment of occurrence in clini-
cal trials. Silver Springs, MD: Author. Retrieved
October, 5, 2012, from http://www.fda.gov/
Drugs/GuidanceComplianceRegulatoryInforma-
tion/htm. SUPPORTING INFORMATION
VAPNIK, V. N. (1999). An overview of sta-
tistical learning theory. IEEE Transactions on Additional Supporting Information may be
Neural Networks, 10, 988999.
VENEK, V., SCHERER, S., MORENCY, L.-P., found in the online version of this article:
RIZZO, A., & PESTIAN, J. (2014). Adolescent Appendix S1.

SLTB 12312

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

SLTB 12312

Uploaded by

Copyright:

Available Formats

Suicide and Life-Threatening Behavior 1

2016 The American Association of Suicidology

A Machine Learning Approach to Identifying

Death by suicide demonstrates profound personal suffering and societal

a 600-bed urban level 1 academic medical Data Collection

1.0 0.8 0.6 0.4 0.2 0.0

1.0 0.8 0.6 0.4 0.2 0.0

1.0 0.8 0.6 0.4 0.2 0.0

Male Female Total Male Female Total Male Female Total

n 39 83 <0.0001 122 46 77 0.007 123 59 67 0.53 126

Male (SD) Female (SD) p value Total (SD)

n 144 227 <0.0001 371

et al., 2014). From the results of other stud-

ies, this was unexpected. Suggesting that

additional research devoted to fusing nonver-

Clinicians may ask: Were there any

differences in what the subjects said?

anger, less emotional pain, and more hope.

differences were found between each of the

groups (Newman, Pennebaker, Berry, &

This studys methodology presents

mine suicidal intention. These computational

shelters, youth clubs, juvenile justice centers,

and community centers, where earlier identi-

Laughed during interview 39.4 69.1 71.9 <0.0001

You might also like