
Commentary

Automatic or the People? Anger on September 11, 2001, and Lessons
Learned for the Analysis of Large Digital Data Sets

Psychological Science
22(6) 837–838
© The Author(s) 2011
Reprints and permission:
sagepub.com/journalsPermissions.nav
DOI: 10.1177/0956797611409592
http://pss.sagepub.com

Mitja D. Back, Albrecht C. P. Küfner, and Boris Egloff


Johannes Gutenberg University Mainz

Received 3/4/11; Revision accepted 3/8/11

In a recent study, we used Linguistic Inquiry and Word Count
(LIWC; Pennebaker, Francis, & Booth, 2001) to conduct an
automatic analysis of emotional words included in 422,502
messages sent to text pagers during September 11, 2001 (Back,
Küfner, & Egloff, 2010). Our results showed that people
(a) did not react primarily with sadness, (b) experienced a
number of event-related outbursts of anxiety, and (c) steadily
became angrier. Because the data contained many technical
codes, we computed for each of 216 time blocks the percentage of sadness-, anxiety-, and anger-related words in the messages, by dividing the number of such words by the total
number of words in the messages that were included in the
LIWC dictionary.
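
The per-block percentage described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the word lists below are a hypothetical toy lexicon (the LIWC 2001 dictionary is far larger), the function name is invented for illustration, and treating "critical" as an anger word is an assumption made only to show how an automatically generated message can inflate the anger score.

```python
from collections import defaultdict

# Hypothetical toy lexicon, NOT the LIWC 2001 dictionary.
DICTIONARY = {"hate", "angry", "sad", "cry", "afraid", "call", "home", "critical"}
# Assumption for illustration: "critical" is scored as an anger word.
ANGER_WORDS = {"hate", "angry", "critical"}


def anger_percentage_by_block(messages):
    """Return {block: percentage of anger words among dictionary words}.

    `messages` is an iterable of (block_index, text) pairs. Words that are
    not in the dictionary (e.g., technical codes) do not enter the
    denominator, mirroring the control routine described in the text.
    """
    anger = defaultdict(int)
    known = defaultdict(int)
    for block, text in messages:
        for token in text.lower().split():
            word = token.strip(".,!?:;\"'()")
            if word in DICTIONARY:
                known[block] += 1
                if word in ANGER_WORDS:
                    anger[block] += 1
    # Blocks without any dictionary word are omitted to avoid dividing by zero.
    return {block: 100.0 * anger[block] / known[block] for block in known}


toy_messages = [
    (0, "Please call home"),
    (0, "I hate this"),
    (1, "CRITICAL: server down"),  # automatic message, no emotional content
]
percentages = anger_percentage_by_block(toy_messages)
```

In this toy input, the second block consists only of an automatic server alert, yet its single dictionary word is counted as anger related, which is the kind of distortion discussed in the next paragraph.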
As Pury (2011) intelligibly shows, this control routine was
clearly insufficient. In particular, we did not anticipate that
emotionally irrelevant, automatically generated messages
(i.e., messages that described a critical server problem)
would be incorrectly classified by LIWC as anger related and
at the same time show a nonrandom time course (i.e., a dramatic increase over time). Although this unexpected confound
did not affect our findings for sadness or anxiety, it did distort
our findings for anger. As did our original analysis, Pury's
analysis with the automatically generated messages removed
showed a strong increase in anger after the first attack. However, this rise in anger did not continue throughout the day;
thus, Pury found a substantially lower overall correlation
between anger-related words and time than we found in our
original analysis.
What can be learned from this scientific exchange? In a
nutshell, automated text analysis of large digital data sets can
lead to unforeseen confounds. Therefore, careful control
mechanisms need to be implemented for both the preparation
and the analysis of data. In an analysis of the September 11
pager data, it is necessary to (a) eliminate automatically generated messages, retaining meaningful social messages, and
(b) correctly determine the level of emotion in these social
messages. In an extensive reanalysis of the time course of
anger as reflected in the pager data, we tried to tackle these
issues systematically (for details, see the Supplemental Material available online). As it turned out, there seemed to be no
automatic way to unequivocally distinguish between automatic and social messages or to identify anger-related messages (both problems also apply to the procedures outlined by
Pury, 2011). Therefore, in addition to automatic algorithms, we
used human judgment to generate a final data set that contained only social messages (two student assistants and the
three authors classified 201,347 messages as automatic or
social) and to determine the level of anger expressed in each of
the 37,606 social messages identified (three independent student assistants rated anger on a scale from 0, no anger, to 2,
strong anger).
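
As a rough illustration of the human-judgment step, the following Python sketch keeps only messages classified as social, averages the three raters' scores (0, 1, or 2) for each message, and then averages those message scores within each time block. The data layout and the function name are assumptions made for this example; the actual procedure is documented in the Supplemental Material.

```python
from statistics import mean


def mean_anger_by_block(rated_messages):
    """Return {block: mean anger rating of the social messages in that block}.

    `rated_messages` is an iterable of (block_index, is_social, ratings),
    where `ratings` holds one 0-2 score per rater. Automatic messages are
    skipped, so their ratings entry may be None.
    """
    per_block = {}
    for block, is_social, ratings in rated_messages:
        if not is_social:
            continue  # drop automatically generated messages
        # First average across raters for this message ...
        per_block.setdefault(block, []).append(mean(ratings))
    # ... then average the message-level scores within each time block.
    return {block: mean(scores) for block, scores in per_block.items()}


toy_ratings = [
    (0, True, (0, 0, 0)),
    (0, True, (2, 1, 0)),
    (0, False, None),      # automatic message, excluded
    (1, True, (1, 1, 1)),
]
block_means = mean_anger_by_block(toy_ratings)
```

Averaging within messages before averaging within blocks gives every social message equal weight, regardless of how the raters disagreed on it.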
Figure 1 depicts the timelines of (a) the mean percentage of
automatically counted LIWC anger words and (b) the mean
anger rating for the 37,606 social messages. The two analytic
strategies revealed distinct, albeit positively correlated, timelines, r = .50, p < .001. The level of anger was greater after the
attacks than before the attacks both in the LIWC analysis,
t(214) = 3.14, p < .01, and in the anger-rating analysis,
t(214) = 2.07, p < .05. As Pury's (2011) analyses suggest, however, the timeline of anger was not as straightforward as indicated in our original analyses. Within the first 3 hr analyzed
(up through 1 hr after the attacks), the LIWC analysis and the
anger-rating analysis showed an increase in anger, r = .69, p <
.001, and r = .42, p < .05, respectively, but the level of anger
decreased to the baseline level by about 5 hr after the attacks
(1:45 p.m.) and then increased again throughout the rest of the
day (LIWC analysis: r = .23, p < .01; anger-rating analysis:
r = .22, p < .05). The overall correlation between time and
anger was nonsignificant, r = .01, in the LIWC analysis and
Corresponding Author:
Mitja D. Back, Department of Psychology, Johannes Gutenberg University
Mainz, Binger Straße 14-16, 55099 Mainz, Germany
E-mail: back@uni-mainz.de

Downloaded from pss.sagepub.com at Stanford University Libraries on January 4, 2015

[Figure 1. Panel (a): LIWC Anger (y-axis: Percentage of Words). Panel (b): Anger Rating (y-axis: Mean Anger Rating). x-axis: time of day.]
Fig. 1. A revised timeline of anger as expressed in 37,606 social messages sent to text pagers on September 11, 2001.
The graphs show (a) the mean percentage of words related to anger (as classified by Linguistic Inquiry and Word Count;
Pennebaker, Francis, & Booth, 2001) and (b) the mean anger rating (0 = no anger, 1 = some anger, 2 = strong anger; averaged
across three raters for each message) across time slots starting at 6:45 a.m. to 7:14 a.m. on September 11, 2001, and ending
at 12:15 a.m. to 12:44 a.m. on September 12, 2001.

significant, r = .19, p < .001, in the anger-rating analysis.


When we included only the 173 time blocks with a mean anger
rating above 0, the overall correlation between time and rated
anger was again significant, r = .40, p < .001. Additional analyses and sources of data will be needed for a thorough evaluation of the course of anger on September 11, 2001.
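
The correlations between time and anger reported above can be sketched in Python as a Pearson correlation between the time-block index and the block-level anger score, optionally restricted to blocks whose mean rating is above 0. Using the block index as the time variable is an assumption made for illustration, and the helper names are hypothetical; significance testing is omitted.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def time_anger_correlation(block_means, only_nonzero=False):
    """Correlate block index with mean anger.

    `block_means` maps a time-block index to that block's mean anger score.
    With only_nonzero=True, blocks whose mean is 0 are left out, as in the
    restricted analysis described in the text.
    """
    pairs = [
        (block, score)
        for block, score in sorted(block_means.items())
        if score > 0 or not only_nonzero
    ]
    blocks, scores = zip(*pairs)
    return pearson_r(blocks, scores)


toy_block_means = {0: 0.0, 1: 0.1, 2: 0.0, 3: 0.2, 4: 0.3}
r_all = time_anger_correlation(toy_block_means)
r_nonzero = time_anger_correlation(toy_block_means, only_nonzero=True)
```

Comparing the two values on real data shows how much the overall trend depends on blocks in which no anger was rated at all.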
In conclusion, although the growing availability of large and
highly informative digital data sets has dramatically increased
the potential for progress in analyzing, understanding, and
addressing many major societal problems (King, 2011, p. 719),
analysis of such data sets requires the implementation of
extremely careful control routines. In some cases, in the absence
of more intelligent technical solutions, automatic data preparation and analysis will probably need to be augmented by an
additional (burdensome) source of data: the human observer.
Acknowledgments
We would like to thank Anna Auth, Jasmina Eskic, David Kolar,
Stefan Mayer, and Carmen Mller for their help with data collection.

Declaration of Conflicting Interests


The authors declared that they had no conflicts of interest with
respect to their authorship or the publication of this article.

Supplemental Material
Additional supporting information may be found at
http://pss.sagepub.com/content/by/supplemental-data

References
Back, M. D., Küfner, A. C. P., & Egloff, B. (2010). The emotional
timeline of September 11, 2001. Psychological Science, 21, 1417–1419.
King, G. (2011). Ensuring the data-rich future of the social sciences.
Science, 331, 719.
Pennebaker, J. W., Francis, M. E., & Booth, R. J. (2001). Linguistic Inquiry
and Word Count (LIWC): LIWC 2001. Mahwah, NJ: Erlbaum.
Pury, C. L. S. (2011). Automation can lead to confounds in text
analysis: Back, Küfner, and Egloff (2010) and the not-so-angry
Americans. Psychological Science, 22, XXXXXX.
