www.elsevier.com/locate/schres
Abstract
Studies of neurocognitive function in patients with schizophrenia use widely variable assessment techniques. Clinical trials
assessing the cognitive enhancing effect of new medications have used neurocognitive assessment batteries that differed in
content, length and administration procedures. The Brief Assessment of Cognition in Schizophrenia (BACS) is a newly
developed instrument that assesses the aspects of cognition found to be most impaired and most strongly correlated with
outcome in patients with schizophrenia. The BACS requires less than 35 min to complete in patients with schizophrenia, yields
a high completion rate in these patients, and has high reliability. The BACS was found to be as sensitive to cognitive
impairment in patients with schizophrenia as a standard battery of tests that required over 2 h to administer. Compared to
healthy controls matched for age and parental education, patients with schizophrenia performed 1.49 standard deviations lower
on a composite score calculated from the BACS and 1.61 standard deviations lower on a composite score calculated from the
standard battery. The BACS composite scores were highly correlated with the standard battery composite scores in patients
(r = 0.76) and healthy controls (r = 0.90). These psychometric properties make the BACS a promising tool for assessing
cognition repeatedly in patients with schizophrenia, especially in clinical trials of cognitive enhancement.
© 2003 Elsevier B.V. All rights reserved.
0920-9964/$ - see front matter © 2003 Elsevier B.V. All rights reserved.
doi:10.1016/j.schres.2003.09.011
284 R.S.E. Keefe et al. / Schizophrenia Research 68 (2004) 283–297
1996; Green et al., 2000) as they are all correlated with poor functional abilities.

Since the second generation of antipsychotic medications became widely available, dozens of studies of the neurocognitive impact of these medications have been completed. While most of these studies have concluded that second generation antipsychotics improve neurocognitive function, the interpretation of these results has been challenged, in part, by the variable test batteries used in each study (Harvey and Keefe, 2001). There is no standard, easily administered test battery that specifically and efficiently assesses the important cognitive deficits in patients with schizophrenia. Instead, current test batteries differ widely in content, duration, procedures, and implementation. The limited generalizability of neurocognitive findings across studies has reduced the ability of clinicians and researchers to draw clear conclusions about the relative impact of the various antipsychotic medications on cognition in schizophrenia. The absence of a standard measure has also challenged studies of adjunctive treatments for cognition in patients with schizophrenia (Friedman et al., 2001, 2002; Evins et al., 2000; Heresco-Levy et al., 1996).

Another drawback of current assessments of cognition in treatment studies of patients with schizophrenia is that many of the neurocognitive assessment batteries used are long and complex. Most neuropsychological assessment batteries used in schizophrenia studies have been adapted from clinical neuropsychology, which assesses the entire profile of neuropsychological strengths and weaknesses in individuals. These batteries of tests may require several hours to administer. Their adaptation for schizophrenia research has kept some of the length that was originally necessary for individual assessment, but may no longer be necessary for research such as clinical trials that compare groups of patients. In fact, the length of some assessment batteries may be a limiting factor in assessing patients repeatedly throughout a clinical trial (Harvey and Keefe, 2001).

In contrast to these standard assessment techniques in schizophrenia research and clinical practice, cognitive function in patients with dementia is usually assessed with a widely used cognitive assessment tool, such as the Mini Mental Status Examination (Folstein et al., 1975), Dementia Rating Scale (Gardner et al., 1981), or Alzheimer’s Disease Assessment Scale (Rosen et al., 1984). These instruments can be easily administered in typical treatment settings such as a physician’s office, and are routinely employed as indicators of treatment change in clinical trials of anti-dementia drugs. The administration of these tests sheds light on the global severity of patients’ cognitive deficits, tracks progression, and measures cognitive changes.

The availability of a quick and efficient tool for measuring cognition in patients with schizophrenia could be an extremely useful guide for clinicians making decisions about potential rehabilitation and for researchers implementing clinical trials to assess cognitive improvement. There are several batteries of tests that are available or in development for the purposes of brief cognitive assessment.

Several computerized batteries, such as the Cambridge Neuropsychological Test Automated Battery (CANTAB) (Robbins et al., 1996), the CDR Cognitive Assessment System (Hunter et al., 1997), and the CogTest Battery (Cogtest, 2002) have been applied to schizophrenia samples, and have the option of reduced length. However, portability, regular software and hardware version changes, and increased patient and tester burden present implementation challenges. Other batteries, such as the Cognitive Screening Instrument for Schizophrenia (CSIS), a very short (10 min) set of abbreviated tests (Scot Purdon, personal communication), are currently in development.

The work completed with the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) (Randolph, 1998; Gold et al., 1999; Hobart et al., 1999) has clearly demonstrated the utility of brief assessment approaches. The RBANS is capable of providing reliable and valid assessments of patients with schizophrenia in various cognitive domains (Gold et al., 1999; Hobart et al., 1999; Wilk et al., 2002). As noted in the RBANS manual, however, the test was originally developed as a screening measure primarily for elderly subjects, and the test is heavily weighted by memory, language, and visual–perceptual subtests. In addition, the item difficulties are most relevant to the types of impairment likely to be observed in patients with dementing illnesses such as Alzheimer’s Disease, with ceiling effects in some domains. Although the
RBANS is clearly sensitive to some of the impairments observed in schizophrenia, and it has utility as a screening tool for patients with schizophrenia, it lacks measures of motor, executive, and working memory performance that may be particularly important targets for cognitive enhancement in schizophrenia. These limitations suggest the need for a measure specifically designed for use in schizophrenia clinical trials that preserves the desirable features of the RBANS: brief administration and scoring time, portability, repeatability, and availability of alternate forms.

The Brief Assessment of Cognition in Schizophrenia (BACS) has been developed for clinical trials with these key features. In addition, the domains of cognitive function that are assessed by the BACS are those found to be consistently impaired, and consistently related to outcome, in schizophrenia: verbal memory, working memory, motor speed, attention, executive functions and verbal fluency. The BACS is fully portable, and is designed to be easily administered by a variety of testers, including nurse clinicians, psychiatrists, neurologists, social workers, and other mental health workers. It is designed to require about 30 min of testing time with minimal extra time for scoring and minimal training demands.

The development of a psychological assessment instrument should determine the instrument’s test–retest reliability, sensitivity, criterion-referenced validity (i.e. comparison to a standard measure), and comparability of alternative forms. This study aims to determine the quality of the BACS in these domains of reliability and validity.

Patients were recruited from the inpatient and outpatient facilities at Duke University, John Umstead Hospital, the University of North Carolina Neurosciences Hospital, and Dorothea Dix Hospital. Patients were required to meet DSM-IV criteria for schizophrenia, schizoaffective illness, or schizophreniform disorder, to have no history of brain trauma, and not to be suffering from a current substance use disorder. There were no specific medication criteria for inclusion in the patient group. Ninety-four of the 150 patients were being treated with a single atypical antipsychotic medication (31 with risperidone, 28 with olanzapine, 15 with clozapine, 10 with aripiprazole, 8 with quetiapine, and 2 with ziprasidone), 13 were being treated with a single typical antipsychotic (6 with haloperidol, 4 with haldol decanoate, 1 with prolixin, 1 with loxitane, 1 with navane), 9 with a combination of antipsychotics, and 8 with an antipsychotic that was not currently known because the patient was in a double-blind study. Ten were receiving pharmacologic treatment but not with antipsychotics (e.g. ativan, lithium), and for 16 patients, medication information was not available.

The demographic characteristics of the patients with schizophrenia and the healthy controls are described in Table 1. Study investigators made a concerted effort to recruit healthy controls who would match the patients on parental education, age, ethnic background, and sex. As indicated in Table 1, groups did not differ significantly on any of these measures except sex. Males were more highly represented in the patient group (79%) than in the control group (62%) (Fisher’s exact, P = 0.023).
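The sex comparison above can be checked directly from the counts in Table 1. The following is a minimal sketch of a two-sided Fisher's exact test in plain Python; the function name `fisher_exact_two_sided` is illustrative, and the two-sided rule used here (summing all tables no more probable than the observed one) is an assumption, since the paper does not say which variant was applied.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one
    # (small tolerance guards against floating-point ties)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Counts from Table 1: rows = patients, controls; columns = male, female
p = fisher_exact_two_sided(119, 31, 31, 19)
print(f"two-sided p = {p:.3f}")
```

Working with integer binomial coefficients keeps the arithmetic exact until the final division, which matters for a sample of 200 where the coefficients are very large.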
Table 1
Demographics of the sample
Measure Schizophrenia Controls t P
N Mean S.D. N Mean S.D.
Age 150 34.7 11.3 50 34.8 11.5 0.058 0.953
Education (years) 145 12.0 2.3 50 13.3 2.3 3.469 0.001
Paternal education 105 12.3 3.5 46 12.2 3.8 0.345 0.731
Maternal education 107 12.8 3.3 50 12.1 3.8 1.066 0.288
WRAT-3, Reading 148 41.5 7.4 50 44.4 8.8 2.336 0.035
Sex N (%) 0.023a
Male 119 (79) 31 (62)
Female 31 (21) 19 (38)
Race N (%) 0.352a
African 70 (50) 28 (56)
Caucasian 65 (46) 22 (44)
Other 5 (4) 0 (0)
WRAT-3 = Wide Range Achievement Test, Third edition.
a By Fisher’s exact test.
generate as many words as possible that begin with a given letter. Version A: F, S; Version B: P, R. Measure: number of words generated per trial.

2.2.5. Attention and speed of information processing

2.2.5.1. Symbol coding. As quickly as possible, patients wrote numerals 1–9 as matches to symbols on a response sheet for 90 s. Measure: number of correct numerals (range: 0–110).

2.2.6. Executive functions

2.2.6.1. Tower of London. Patients were shown two pictures simultaneously. Each picture showed three balls of different colors arranged on three pegs, with the balls in a unique arrangement in each picture. Patients were asked to give the total number of times the balls in one picture need to be moved in order to make the arrangement of balls identical to that of the other, opposing picture. There were 20 trials. The items were of variable difficulty, with a general tendency for later items to be more difficult. The test was discontinued if patients made five consecutive incorrect responses. If patients responded correctly to all 20 trials, two additional trials of greater difficulty were administered. There were two alternate forms. Measure: number of correct responses (range: 0–22).

2.3. Data analyses

For all subtests and composite scores, test–retest reliability was measured with intra-class correlations (ICC) in the patient and control groups separately. The ICC is a conservative estimate of test–retest reliability, as it is sensitive to group mean changes over time in addition to intra-subject variability. Practice effects were measured by comparing data collected at test session 1 to those collected at test session 2 with within-group t-tests. These were determined in the patient and control groups separately. Sensitivity to between-group impairment on all measures was determined with independent t-tests.

The primary measure from each test of the BACS was standardized by creating z-scores whereby the test session 1 healthy control mean was set to zero and the standard deviation set to one. Tests with alternate forms were standardized separately for each version. A composite score was calculated by averaging all of the six standardized primary measures from the BACS, and then calculating a z-score of the composite. Composite scores were calculated for all subjects using the a priori criterion that they must have successfully completed all or all but one of the measures that comprised the composite score. This criterion included the entire sample, as no subjects were missing more than one BACS measure.

The relationship among the BACS measures was determined by calculating Pearson correlations among the scores. The factor structure of the scores was determined by performing a principal components analysis with oblique rotation.

For the standard battery, a composite score was calculated by adding together the primary measures from each test after they had been standardized. Standardized scores for each construct (e.g. verbal memory) were determined by calculating the mean of the standardized scores for each of the measures that comprised the construct. A z-score for each measure was calculated based upon the healthy control mean and standard deviation, and the average of the z-scores was calculated to determine the construct score. The Dot Test working memory deficit score, Trailmaking A and B scores, and WCST perseverative error scores were multiplied by −1 so that good performance was associated with scores in a positive direction. The following constructs were calculated using the measures listed in parentheses: Verbal memory (RVLT total scores, WMS-III logical memory score), Attention (WAIS-III digit symbol, Mean CPT d-prime), Working Memory (Dot test working memory error, number of correct items on Letter–Number Sequencing), Motor Speed (Mean of dominant and nondominant hand Pegboard performance, Trailmaking A), Verbal Fluency (COWAT, Category instances), Executive functions (Mean of WCST categories and perseverative error z-scores, Trailmaking B).

The relative sensitivity of the composite scores derived from the standard battery and the BACS was determined by comparing each of their between-group differences with t-tests. In addition, multiple logistic regression analyses were performed to determine the amount of unique between-group variance that could be accounted for by each battery.
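The standardization and composite steps described in this section can be sketched as follows. This is an illustration only, not the authors' code: the function names and toy numbers are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption because the paper does not state which estimator was used.

```python
from statistics import mean, stdev

def standardize(scores, control_scores):
    """z-scores relative to the healthy control mean and SD (sample SD assumed)."""
    m, s = mean(control_scores), stdev(control_scores)
    return [(x - m) / s for x in scores]

def composite(z_by_measure, max_missing=1):
    """Average one subject's z-scores; None marks a missing measure.

    Mirrors the a priori rule that a subject must have completed
    all or all but one of the measures.
    """
    present = [z for z in z_by_measure if z is not None]
    if len(z_by_measure) - len(present) > max_missing:
        return None
    return mean(present)

def composite_z(composites, control_composites):
    """Re-standardize the averaged composite against the control composites."""
    return standardize(composites, control_composites)

# Hypothetical raw scores on one measure: three controls, two patients
controls = [10, 12, 14]
patients = [8, 12]
print(standardize(patients, controls))   # patient z-scores on this measure
print(composite([-2.0, 0.0, None]))      # one missing measure is tolerated
print(composite([None, None, 1.0]))      # two missing measures -> None
```

For measures where a lower raw score means better performance, the z-score would be multiplied by −1 before averaging, as described for the standard battery.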
Finally, to compare the relation between the BACS measures and the measures from the standard battery, the neurocognitive construct summary scores from the standard battery were correlated with the BACS measures using Pearson correlations.

3. Results

3.1. Description of BACS data

3.1.1. Test–retest reliability, sensitivity and practice effects of tests without alternate forms
Table 2 lists the means and standard deviations for all of the measures from the BACS that do not have alternate forms. These measures were repeated with the same form regardless of which version of the BACS was administered. All tests demonstrated significant differences (P < 0.001) between controls and patients. The intra-class correlations (ICC) between performance from test session 1 and test session 2 for each measure are also included. Each test had one measure that produced an ICC of 0.79 or greater for the patient group and healthy control group: total number of correct responses from the digit sequencing task; correct responses from the symbol coding task; and total number of tokens for 60 s from the token motor task. The longest correct sequence measure for the digit sequencing tests and the 30-s measures for the token motor task produced ICCs that were inferior to the primary measures, and were thus not used in subsequent analyses. The effect of practice on the primary measures was small. Only the symbol coding test showed significant practice effects in patients (0.25 S.D.) and controls (0.34 S.D.).

3.1.2. Comparability of versions: test–retest reliability, sensitivity, and practice effects of tests with alternate forms
Table 3 lists the means and standard deviations for the BACS measures derived from tests with alternate forms. The data are organized by the sequence of the BACS versions subjects received in test sessions 1 and 2. The ICCs between performance for test session 1 and test session 2 for each measure are also included. Subjects who received the same version twice (AA and BB) allow for an assessment of maximal potential practice effects and test–retest reliability. Subjects who received a different version on consecutive test sessions (AB and BA) allow an assessment of general practice effects that are not dependent upon test version, as well as the comparability of the test forms.

3.1.2.1. Verbal memory. The word lists from each version were very similar in difficulty in patients and controls. The final versions of the lists were the result of a reorganization of the original word lists accomplished by switching six of the words between the A list and the B list, keeping the words in the same or similar ordinal position as the original list. Since the samples were smaller for the comparison of the final lists, we collected data on an additional 30 controls from Mount Sinai Medical Center not included in the
Table 2
Mean performance and reliability coefficients of BACS tests with no alternate forms in patients with schizophrenia and healthy controls
Measure Schizophrenia Controls P1 P2
Test session 1 Test session 2 ICC Test session 1 Test session 2 ICC
N Mean S.D. N Mean S.D. N Mean S.D. N Mean S.D.
Digit sequencing, total correct 140 15.01 4.65 140 15.71 4.60 0.79 50 19.08 5.27 50 19.26 4.88 0.81 0.007 0.686
Digit sequencing, longest 140 5.76 1.40 140 5.89 1.36 0.66 50 6.58 1.34 50 6.72 1.28 0.61 0.183 0.398
correct sequence
Token motor task (1st 30 s) 142 28.89 8.66 142 29.96 8.23 0.76 50 36.86 8.55 50 38.40 8.10 0.81 0.030 0.038
Token motor task (2nd 30 s) 142 29.37 8.73 142 29.18 8.59 0.71 50 37.40 7.75 50 38.52 7.61 0.54 0.723 0.287
Token motor task total 142 58.27 16.57 142 59.13 16.10 0.79 50 74.26 14.50 50 76.92 14.91 0.80 0.333 0.050
Symbol coding 143 36.63 14.23 143 40.48 16.15 0.90 50 54.86 16.55 50 60.80 18.25 0.83 0.000 0.000
P1 = Significance value for patients day 1 vs. patients day 2, by t-test; P2 = Significance value for controls day 1 vs. controls day 2, by t-test.
Significance value for all patients day 1 vs. controls day 1 comparisons, P < 0.001, by t-test; Significance value for all patients day 2 vs. controls
day 2 comparisons, P < 0.001, by t-test.
original sample. In this separate sample, the word list difficulty was almost identical, and not significantly different (A list: mean = 50.33, S.D. = 9.77; B list: mean = 51.00, S.D. = 9.48; t = 0.19, df = 28, n.s.).

The ICCs for this measure ranged between 0.78 and 0.93 in patients and 0.40 and 0.90 in controls. Due to a randomization procedure that resulted in fewer than expected controls with an AB or BA sequence of tests using the final word list, we administered the word lists only to an additional 11 controls not otherwise included in the study to increase the sample size for assessing the test–retest reliability in controls receiving different versions. The practice effect was significant in patients and controls who were administered the same version, but not in subjects who received different versions on consecutive test sessions. Controls performed significantly better than patients for each version of the test. These results support the need for alternate forms for this test.

3.1.2.2. Verbal fluency. Letter Fluency: In subjects receiving the same version on consecutive test sessions, the ICCs were good for the patient group, ranging between 0.69 and 0.78, and lower for the controls, ranging between 0.53 and 0.65. In subjects
Table 3
Mean performance and reliability coefficients of BACS tests with alternate forms in patients with schizophrenia and healthy controls
Condition Schizophrenia Controls Significance
Test session 1 Test session 2 ICC Test session 1 Test session 2 ICC P1 P2 P3 P4
N Mean S.D. N Mean S.D. N Mean S.D. N Mean S.D.
Verbal memory
A/A 18 33.28 9.38 17 38.65 9.68 0.78 7 51.14 8.11 7 57.71 9.21 0.80 0.003 0.019 0.000 0.000
B/B 16 33.13 12.84 15 40.67 16.35 0.93 9 50.89 11.49 9 64.44 9.14 0.68 0.000 0.001 0.002 0.001
A/B 6 32.50 10.48 6 33.83 14.43 0.92 8 58.40 2.61 8 56.20 2.28 0.40 0.549 0.149 0.074 0.252
B/A 11 33.91 11.87 11 33.00 10.83 0.86 5 51.40 9.02 5 52.40 9.94 0.90 0.584 0.655 0.634 0.550
Verbal fluency
A/A
F 50 10.62 4.02 48 12.00 4.58 0.69 18 14.22 3.90 18 16.17 3.85 0.56 0.004 0.037 0.002 0.001
S 50 11.46 4.36 48 12.00 4.86 0.69 18 16.61 4.73 18 17.72 3.77 0.53 0.292 0.271 0.000 0.000
Supermarket 50 20.26 7.44 48 20.44 6.73 0.71 18 26.06 6.45 18 26.83 5.88 0.90 0.597 0.240 0.005 0.001
B/B
P 51 10.49 5.08 48 11.56 5.23 0.77 16 14.00 5.94 16 15.69 6.18 0.65 0.031 0.206 0.024 0.011
R 51 8.37 4.40 48 9.31 4.67 0.78 16 14.38 6.53 16 14.00 6.08 0.58 0.035 0.798 0.000 0.002
Tools 51 9.43 3.92 48 9.98 3.83 0.82 16 12.38 6.59 16 15.13 9.13 0.82 0.175 0.038 0.032 0.002
A/B
F/P 24 9.58 4.58 24 10.13 4.41 0.57 8 12.38 3.66 8 15.88 4.52 0.15 0.530 0.108 0.129 0.003
S/R 24 9.50 3.84 24 8.21 4.04 0.60 8 14.63 2.97 8 10.13 4.58 0.13 0.084 0.041 0.002 0.270
Sup/Tools 24 18.54 5.77 24 9.63 3.89 0.46 8 28.75 6.34 8 13.13 5.36 0.81 0.000 0.000 0.000 0.054
B/A
P/F 25 9.76 2.92 24 11.42 4.51 0.12 8 14.00 4.28 8 12.63 3.29 0.61 0.164 0.287 0.003 0.492
R/S 25 8.40 3.32 24 10.88 4.40 0.35 8 10.75 5.04 8 13.38 4.14 0.53 0.017 0.141 0.135 0.168
Tools/Sup 25 9.20 4.88 24 17.75 5.82 0.51 8 11.5 2.73 8 26.50 7.05 0.38 0.000 0.000 0.216 0.001
Tower of London
A/A 50 12.46 4.72 48 13.75 4.80 0.66 18 15.39 3.18 18 16.83 2.75 0.83 0.017 0.002 0.018 0.013
B/B 51 10.90 5.52 46 11.78 5.72 0.77 16 17.19 2.93 16 18.69 2.50 0.73 0.385 0.009 0.000 0.000
A/B 23 14.00 3.30 23 14.00 4.34 0.81 8 16.38 4.31 8 17.13 3.72 0.97 1.00 0.080 0.116 0.080
B/A 25 13.76 5.54 24 14.50 6.41 0.89 8 14.75 3.99 8 14.38 4.21 0.93 0.330 0.504 0.644 0.959
P1 = Significance value for patients test session 1 vs. patients test session 2, by t-test; P2 = significance value for controls test session 1 vs.
controls test session 2, by t-test; P3 = significance value for patients test session 1 vs. controls test session 1, by t-test; P4 = significance value for
patients test session 2 vs. controls test session 2, by t-test.
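The ICC columns in these tables summarize agreement between two test sessions. The paper does not state which ICC form was computed, so the sketch below assumes the two-way random effects, absolute agreement, single-measure form, often written ICC(2,1). That choice is an assumption made here because it penalizes shifts in the group mean between sessions, which matches the description of the ICC as sensitive to group mean changes over time. The function name is illustrative.

```python
def icc_absolute_agreement(session1, session2):
    """ICC(2,1) for paired scores from two sessions (assumed form, see lead-in)."""
    n, k = len(session1), 2
    rows = list(zip(session1, session2))
    grand = (sum(session1) + sum(session2)) / (n * k)
    row_means = [(a + b) / k for a, b in rows]
    col_means = [sum(session1) / n, sum(session2) / n]

    # Two-way ANOVA sums of squares: subjects (rows), sessions (columns), residual
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical scores: a uniform one-point gain at retest lowers the ICC
# even though every subject keeps the same rank.
baseline = [1, 2, 3, 4, 5]
print(icc_absolute_agreement(baseline, baseline))
print(icc_absolute_agreement(baseline, [x + 1 for x in baseline]))
```

A consistency-type ICC or a Pearson correlation would return 1 in both cases, which is why an agreement-type ICC is the more conservative estimate when practice effects are present.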
Table 4
Mean performance and reliability coefficients of BACS composite scores in patients with schizophrenia and healthy controls
Condition Schizophrenia Controls Significance
Test session 1 ICC Test session 2 Test session 1 ICC Test session 2 P1 P2 P3 P4
N Mean S.D. N Mean S.D. N Mean S.D. N Mean S.D.
A/A 50 1.70 1.14 0.88 48 1.20 1.25 18 0.08 1.04 0.94 18 0.43 1.19 0.000 0.000 0.000 0.000
B/B 51 1.36 1.01 0.86 48 1.00 0.90 16 0.16 1.08 0.95 16 0.71 1.23 0.000 0.000 0.000 0.000
A/B 24 1.80 1.15 0.92 24 2.03 1.31 8 0.17 0.94 0.87 8 0.23 1.24 0.036 0.080 0.000 0.002
B/A 25 1.06 0.94 0.88 24 0.60 0.81 8 0.32 0.80 0.87 8 0.00 0.90 0.000 0.077 0.054 0.091
P1 = Significance value for patients test session 1 vs. patients test session 2, by t-test; P2 = significance value for controls test session 1 vs.
controls test session 2, by t-test; P3 = significance value for patients test session 1 vs. controls test session 1, by t-test; P4 = significance value for
patients test session 2 vs. controls test session 2, by t-test.
receiving different versions, the ICCs were variable, but poor overall, with three ICCs below 0.17 and none higher than 0.60.

The practice effect was small in those receiving the same version on consecutive test sessions (less than 0.33 S.D. for all letters), and was not statistically significant in controls and patients for F and S words. Performance for patients was significantly different from controls for each of the letters.

Category Instances: The ICCs for subjects receiving the same version on consecutive test sessions were good, ranging from 0.71 to 0.90, but poor in subjects receiving different versions on consecutive test sessions (0.46–0.51 in patients; 0.38–0.81 in controls).

Patients and controls performed about 2 S.D. better on Version A (supermarket items) than Version B (tools). The practice effect was statistically significant for the tools category, but not for supermarket items in controls. The pattern of between-group results suggested that the category of supermarket items was more sensitive to the differences between patients and controls than tools. These results suggest that the category of supermarket items is a superior measure.

3.1.2.3. Tower of London. Versions A and B were very similar in patients and controls, with no significant differences between versions. The ICCs ranged between 0.66 and 0.89 in patients and between 0.73 and 0.96 in controls. The practice effect was small in patients (< 0.25 S.D.) and medium in controls (0.4–0.6 S.D.). The sensitivity of the test to the differences between patients and controls was weaker than the other measures, although they were statistically significant in the larger AA and BB samples.

3.1.3. BACS composite scores and BACS profile
Table 4 lists the composite scores for controls and patients receiving the various combinations of the two versions of the BACS. Note that all data were standardized using test session 1 means and standard
Fig. 1. Performance of patients with schizophrenia on the BACS subtests and composite score standardized to healthy controls. All differences
between patients and controls were statistically significant ( P < 0.001).
Fig. 2. Composite scores for the BACS and standard battery in schizophrenia patients standardized to healthy controls.
ences between males (0.27 ± 1.05) and females (−0.45 ± 0.71) (t = 2.65, df = 48, P = 0.011) in the healthy control group were consistent with the significantly higher education and WRAT-Reading scores for the males in this sample.

3.1.4. Correlations among BACS measures and factor analysis
Table 5 presents the correlations among the primary BACS measures for patients (below the diagonal) and controls (above the diagonal). All but 3 of the 42 correlations were significant at the P < 0.05 level. The correlations were of slightly greater magnitude in the controls, yet the pattern of correlations between the groups was very similar. The pattern of correlations between each of the measures and the composite scores, although greater in the controls, was strikingly similar between groups.

Principal components analysis with oblique rotation was completed to determine the factor structure of the BACS. The factor loadings are presented in Table 6. The factor structure suggests a three-factor solution. Measures that emphasize motor speed and general cognitive functions load on the first factor; the memory and working memory measures load on the second factor, and executive function loads on the third factor.

3.2. Description of standard battery data

Table 7 presents the performance of schizophrenic patients and controls on the standard battery of tests. The table includes the group mean and standard deviation for the primary measure of each test and the z-scores for the patients. The least sensitive measures in the battery were letter–number sequencing and WCST categories, coinciding with the two least sensitive measures in the BACS, digit sequencing (measuring verbal working memory) and the Tower of London test (measuring executive function).

3.3. Relationship of standard battery and BACS

3.3.1. Testing duration
The BACS required a mean of 34.2 min for patients (S.D. = 8.95) and 30.4 (S.D. = 3.88) min for

Table 8
Pearson correlations for standard battery domains and BACS measures in healthy controls
Long battery domains    BACS measures
                        VM    DS    VF    SC    TM    TL    Comp
Verbal memory           0.57  0.77  0.53  0.66  0.61  0.35  0.79
Attention               0.33  0.72  0.56  0.70  0.60  0.63  0.79
Working memory          0.41  0.61  0.48  0.71  0.58  0.52  0.76
Motor speed             0.20  0.58  0.32  0.52  0.38  0.40  0.56
Verbal fluency          0.45  0.65  0.75  0.63  0.60  0.40  0.80
Executive function      0.35  0.67  0.57  0.67  0.60  0.45  0.75
Composite score         0.48  0.63  0.65  0.78  0.68  0.55  0.90
BACS measures: VM = Verbal memory; DS = Digit sequencing; VF = Verbal fluency; SC = Symbol coding; TM = Token motor task; TL = Tower of London; Comp = Composite score.
Correlations greater than or equal to 0.48, P < 0.001; 0.40–0.47, P < 0.01; 0.32–0.39, P < 0.05.
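The significance bands in the Table 8 footnote follow from converting a correlation to a t statistic with n − 2 degrees of freedom. The sketch below assumes n = 50 controls (the control sample size in Table 1) and uses illustrative function names; the two-tailed critical values mentioned in the comments (about 2.01, 2.68 and 3.51 for 48 degrees of freedom) are quoted from standard t tables rather than from the paper.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def r_to_t(r, n):
    """t statistic for testing r against zero (df = n - 2); requires -1 < r < 1."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)

# Lower edge of each band in the footnote, assuming n = 50 controls.
# Compare with two-tailed critical t(48) of roughly 2.01 (P < 0.05),
# 2.68 (P < 0.01) and 3.51 (P < 0.001).
for r in (0.32, 0.40, 0.48):
    print(r, round(r_to_t(r, 50), 2))
```

Each lower edge clears its corresponding critical value, so the bands are consistent with a two-tailed test on 50 subjects.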
Fig. 3. Scatterplot of BACS and standard battery composite scores for patients and controls.
When only those subjects who received version A at test session 1 were included in the analysis, BACS composite scores accounted for 32.6% of the between-group variance in diagnosis (F = 37.38; df = 1,98; P < 0.001). The standard battery composite score accounted for only 0.1% additional variance, which was not statistically significant. When the order of entry was reversed, BACS version A composite scores accounted for 5.3% additional variance (F = 36.95; df = 1,98; P < 0.01) beyond the standard battery composite scores.

When only subjects who received version B at test session 1 were included in the analysis, BACS composite scores accounted for 23.0% of the between-group variance in diagnosis (F = 29.22; df = 1,98; P < 0.001). The standard battery composite score accounted for 7.6% additional variance (F = 10.65; df = 1,97; P < 0.01). When the order of entry was reversed, BACS version B composite scores accounted for only 0.1% additional variance, which was not statistically significant.

3.3.4. Correlations between BACS measures and standard battery constructs
The correlations between the standard battery constructs and the BACS measures (Test session 1 data only) are presented in Table 8 for controls and Table 9 for patients. The correlations were slightly higher in the controls, yet the pattern of correlation magnitude was similar between groups. The highest correlation in each matrix was between the standard battery and BACS composite scores, which was 0.76 for patients and 0.90 for controls. The individual data points from this correlation are presented in Fig. 3.

4. Discussion

The Brief Assessment of Cognition in Schizophrenia, or BACS, an easily administered pen-and-paper battery of neurocognitive tests, demonstrated high reliability and concurrent validity with a standard battery of tests in schizophrenia patients and healthy controls with similar ages, racial backgrounds and parental education. The BACS was as sensitive to the cognitive deficits of schizophrenia as a standard battery of neuropsychological tests that took nearly four times as long to administer. Further, a greater percentage of patients were able to complete the tests in the BACS compared to the standard measures, especially those administered with computerized methods. The BACS offers promise as a tool for assessing cognition repeatedly over the course of treatment in patients with schizophrenia.

4.1. Psychometrics

The intraclass correlation coefficients calculated from the data in this study suggested that the composite scores derived from each subtest had very good test–retest reliability in patients with schizophrenia and healthy controls. These reliability coefficients were high not only when the same version of the test was administered on consecutive test sessions, but also when different versions (alternate forms) of the test were administered. The benefit of high reliability on the BACS composite score is the increased likelihood that change over time on this measure, such as with treatment, can be interpreted as due to a nonrandom effect.

The primary measures from the subtests without alternate versions, including tests of motor speed, working memory and information processing speed, were highly reliable, with ICCs equal to or greater than 0.79. The practice effects of these tests were minimal, with none of the improvements exceeding 0.25 standard deviations, despite the potential effect of practice being maximized by testing twice within 5 days.

The comparisons of the subtests from the alternate forms suggested that alternate forms for the verbal memory and Tower of London tests are necessary, but a single version measuring verbal fluency is sufficient.

Regarding verbal memory, the lists proved to be very similar in difficulty and highly reliable. The practice effect across these versions was very small compared to the practice effect using the same versions. Thus, the use of alternate forms for the verbal memory test is necessary, and will facilitate the assessment of changes in verbal memory abilities that are independent from learning the words in a previous administration.

The reliability of the verbal fluency measures was found to be much higher when the same versions were used on consecutive test sessions. The effect of practice on these measures was very small, even when the same version was administered on consecutive test sessions. In addition, there was variable

important for assessing change over time, such as in
sensitivity of the different versions, particularly in clinical trials.
the domain of category fluency, with supermarket
items being the most sensitive. This pattern of results 4.2. Validity
suggests that with regard to category instances,
supermarket items will be the most reliable and The composite scores from the BACS and the
sensitive category to use for both versions of the standard battery were highly correlated in patients
BACS. There was little differentiation in reliability, and in controls, and neither battery was more sensitive
practice effects, and sensitivity to group differences to the overall deficits found in patients with schizo-
among the various letter categories for the Controlled phrenia. The magnitude of these deficits, which was
Oral Word Association Test. It appears as though any about 1.5 standard deviations below the healthy con-
combination of letters is satisfactory as long as it is trols in this study, were consistent with those reported in
consistent on consecutive testing periods. meta-analyses of studies of neurocognitive impairment
The reliability of the Tower of London was high, in schizophrenia (Heinrichs and Zakzanis, 1998), yet
even when different versions were administered on not as severe as estimated from a more selective review
consecutive test sessions. The practice effects were of the literature (Harvey and Keefe, 1997). Studies such
small in patients and medium in controls when the as this one that pay special attention to matching groups
same versions were used on consecutive test sessions. on age and parental education may yield differences
However, these practice effects were diminished when that are less robust than those that do not attend to
a different version was used on consecutive test these factors (Heinrichs and Zakzanis, 1998).
sessions. These data suggest that using an alternate The differences between patients and controls may
form for this test is helpful to reduce practice effects, appear to be smaller than expected on the Tower of
and recommended. London test, measuring executive functions, and the
The pattern of results regarding the reliability, digit sequencing test, which measures a verbal form of
practice effects, and between-group sensitivity of tests working memory. While these measures may initially
that had alternate forms in this study suggest that the appear to be less sensitive to the neurocognitive deficits
final BACS should include alternate forms for verbal of schizophrenia, it is possible that the between-group
memory and Tower of London tests only. The tests differences on these measures were relatively small due
without alternate forms—digit sequencing, symbol to the particular cohorts tested, as the measures from
coding, and the token-motor task—had minimal prac- the standard battery used to assess these cognitive
tice effects. Due to the reduced reliability that results domains were similarly small in their differences be-
from alternate forms of verbal fluency measures, and tween patients and controls. Furthermore, the magni-
due to the minimal practice effects that result from tude of these deficits are consistent with the effect sizes
administering the same version consecutively, the reported in the meta-analysis by Heinrichs and Zakza-
verbal fluency tests should not have alternate forms. nis (1998).
Based upon the slightly higher reliability, smaller The factor analyses and correlations among the
practice effect, and increased sensitivity to between- BACS measures suggests that while there is a single
group differences, the best verbal fluency categories general factor of cognitive performance that can be
to use in the BACS are F and S for letter fluency and derived from the BACS raw scores, there are two
supermarket items for category fluency. Thus, the other relatively dissociable domains of performance
final version of the BACS that is recommended is measured by the BACS. The pattern of correlations
one in which alternate forms are used for verbal among the BACS measures was very similar in
memory and Tower of London subtests only, and a patients and controls. In both groups, each of the
single version of verbal fluency is used that includes individual measures demonstrated high correlations
supermarket items, F-words and S-words. It should with the composite scores. Factor analysis conducted
also be noted that all of the final measures of the on the data collected only from patients suggests that
BACS have distributional properties suggesting min- the first factor, which accounted for the largest
imal ceiling- and floor-effects. These properties are amount of variance, is a factor of general cognitive
performance with an emphasis on speed and the generation of action. The other two factors appeared to reflect more discrete functions, with the second factor including measures of memory and the third including executive functions. The BACS thus enables an assessment of overall cognitive function as well as scores on individual cognitive domains.

The definitive validation of the BACS will require further study. First, the comparisons of the final word lists for the verbal memory test had sample sizes that were reduced, which resulted in less statistical power available for these analyses. However, the final lists appear to have remarkable similarity in their sensitivity to group differences. This similarity was also found in a separate group of controls that was not a part of this study sample.

Second, we chose to test subjects twice within 1 week to minimize the impact of changes in clinical state and medication, reduce dropout, and maximize practice effects. However, treatment effects in research studies or clinical practice are likely to be measured with longer intervals between assessments. Thus, the reliability of these measures in actual practice may be lower and the practice effects of the tests in the BACS are likely to be smaller.

Third, one of the drawbacks of a focus on composite scores in the evaluation of cognition in schizophrenia is that isolated yet important cognitive effects may be missed. It is possible that a new medication may improve an important aspect of cognition that is not assessed by the BACS. However, the importance of general cognitive effects can be seen in the relative size of the correlations between functional outcome and different aspects of cognition. The Pearson correlations between outcome and individual aspects of cognition such as memory, attention, and executive functions are relatively small, ranging between 0.20 and 0.34 (Green, 2002). The few studies that have enabled the determination of the correlation of outcome with overall summary scores of cognition have demonstrated far larger correlations, exceeding 0.60 (Green, 2002). These data suggest that the assessment of cognition with summary data, such as the BACS composite scores reported here, may be the most powerful indicator of functional outcome, and has the most promise as a target for cognitive enhancement that will produce real-world changes in patients' lives outside the laboratory.

Finally, a complete validation of the BACS will require assessments in various populations of patients with schizophrenia, such as first-episode populations, treatment-refractory schizophrenia patients, and geriatric patients. It will also be important to determine whether BACS scores can predict changes in functional outcome, such as activities of daily life, and whether the BACS is sensitive to cognitive changes during clinical trials. Studies are underway to determine the validity of the BACS in these areas of inquiry.

In sum, the BACS assesses the major constructs of cognition that have been found to be most impaired and most strongly correlated with outcome in patients with schizophrenia. The BACS takes less than 35 min to complete in patients with schizophrenia, yields a high completion rate in these patients, and has high test-retest reliability over a period of days. It is as sensitive to cognitive impairment in patients with schizophrenia as a standard battery of tests that required over 2 h to complete. While the sensitivity of the BACS to change over time can only be determined through longitudinal treatment studies, 10 of which are currently underway, its psychometric properties make it a promising tool for assessing cognition repeatedly in patients with schizophrenia.

Acknowledgements

This work was supported by a grant from Eli Lilly. Mark Appelbaum provided statistical consultation. Neurocognitive data were collected, in part, by Adam Vaughn, Susan Shortel, Matthew Dukes, Trina Walker and Joseph Kang. Healthy controls at Mount Sinai were assessed by Adam Brickman and Julie Kim.

References

Cogtest plc. Cogtest(tm): Computerised Cognitive Battery for Clinical Trials. (2002). Retrieved November 8, 2002, from http://www.cogtest.com.
Cornblatt, B.A., Keilp, J.G., 1994. Impaired attention, genetics, and the pathophysiology of schizophrenia. Schizophr. Bull. 20, 31–46.
Evins, A.E., Fitzgerald, S.M., Wine, L., Rosselli, R., Goff, D.C., 2000. Placebo-controlled trial of glycine added to clozapine in schizophrenia. Am. J. Psychiatry 157 (5), 826–828.
Folstein, M.F., Folstein, S.E., McHugh, P.R., 1975. ‘‘Mini-mental
state’’. A practical method for grading the cognitive state of patients for the clinician. J. Psychiatr. Res. 12 (3), 189–198.
Friedman, J.I., Adler, D.N., Temporini, H.D., Kemether, E., Harvey, P.D., White, L., Parrella, M., Davis, K.L., 2001. Guanfacine treatment of cognitive impairment in schizophrenia. Neuropsychopharmacology 25 (3), 402–409.
Friedman, J.I., Adler, D.N., Evelyn, H., Harvey, P.D., Brenner, G., Temporini, H., White, L., Parrella, M., Davis, K.L., 2002. A double-blind placebo controlled trial of donepezil adjunctive treatment to risperidone for the cognitive impairment of schizophrenia. Biol. Psychiatry 51 (5), 349–357.
Gardner Jr., R., Oliver-Munoz, S., Fisher, L., Empting, L., 1981. Mattis Dementia Rating Scale: internal reliability study using a diffusely impaired population. J. Clin. Neuropsychol. 3 (3), 271–275.
Geffen, G.M., Butterworth, P., Geffen, L.B., 1994. Test-retest reliability of a new form of the Auditory Verbal Learning Test (AVLT). Arch. Clin. Neuropsychol. 9, 303–316.
Gold, J.M., Carpenter, C., Randolph, C., Goldberg, T.E., Weinberger, D.R., 1997. Auditory working memory and Wisconsin Card Sorting Test performance in schizophrenia. Arch. Gen. Psychiatry 54 (2), 159–165.
Gold, J.M., Queern, C., Iannone, V.N., Buchanan, R.W., 1999. Repeatable battery for the assessment of neuropsychological status as a screening test in schizophrenia: I. Sensitivity, reliability, and validity. Am. J. Psychiatry 156 (12), 1944–1950.
Green, M.F., 1996. What are the functional consequences of neurocognitive deficits in schizophrenia? Am. J. Psychiatry 153, 321–330.
Green, M.F., 2002. Mediators between neurocognitive deficits and community outcome in schizophrenia. J. Adv. Schizophr. Brain Res. 4, 3–7.
Green, M.F., Kern, R.S., Braff, D.L., Mintz, J., 2000. Neurocognitive deficits and functional outcome in schizophrenia: are we measuring the ‘‘right stuff’’. Schizophr. Bull. 26 (1), 119–136.
Harvey, P.D., Keefe, R.S.E., 1997. Cognitive impairment in schizophrenia and implications of atypical neuroleptic treatment. CNS Spectr. 2, 1–11.
Harvey, P.D., Keefe, R.S.E., 2001. Studies of cognitive change with treatment in schizophrenia. Am. J. Psychiatry 158, 176–184.
Heaton, R.K., 1993. Wisconsin Card Sorting Test Manual. Psychological Assessment Resources, Odessa, FL.
Heinrichs, R.W., Zakzanis, K.K., 1998. Neurocognitive deficit in schizophrenia: a quantitative review of the evidence. Neuropsychology 12 (3), 426–445.
Heresco-Levy, U., Silipo, G., Javitt, D.C., 1996. Glycinergic augmentation of NMDA receptor-mediated neurotransmission in the treatment of schizophrenia. Psychopharmacol. Bull. 32 (4), 731–740.
Hobart, M.P., Goldberg, R., Bartko, J.J., Gold, J.M., 1999. Repeatable battery for the assessment of neuropsychological status as a screening test in schizophrenia: II. Convergent/discriminant validity and diagnostic group comparisons. Am. J. Psychiatry 156 (12), 1951–1957.
Hunter, R., Cameron, S., Perks, S., Wesnes, K., 1997. The cognitive profile of unmedicated schizophrenic patients in relation to controls. J. Psychopharmacol. 11, A74 (Suppl.).
Keefe, R.S.E., 1995. The contribution of neuropsychology to psychiatry. Am. J. Psychiatry 152, 6–15.
Keefe, R.S.E., Lees-Roitman, S., Dupre, R.L., 1997. Performance of patients with schizophrenia on a pen and paper visuospatial working memory task with short delay. Schizophr. Res. 26 (1), 9–14.
Lezak, M., 1995. Neuropsychological Assessment, 3rd ed. Oxford Univ. Press, New York, NY.
Randolph, C., 1998. Repeatable Battery for the Assessment of Neuropsychological Status. Psychological, San Antonio.
Reitan, R.M., 1979. Manual for the Administration of Neuropsychological Test Batteries for Adults and Children. Ralph M. Reitan, Tucson, AZ.
Robbins, T.W., James, M., Owen, A.M., Sahakian, B.J., McInnes, L., Rabbitt, P.M., 1996. A neural systems approach to the cognitive psychology of aging: studies with CANTAB on a large sample of the normal elderly population. In: Rabbitt, P.M. (Ed.), Methodology of Frontal and Executive Function. Lawrence Erlbaum Associates, Hove, pp. 215–238.
Rosen, W.G., Mohs, R.C., Davis, K.L., 1984. A new rating scale for Alzheimer’s disease. Am. J. Psychiatry 141 (11), 1356–1364.
Ruff, R.M., Parker, S.B., 1993. Gender- and age-specific changes in motor speed and eye-hand coordination in adults: normative values for the Finger Tapping and Grooved Pegboard Tests. Percept. Mot. Skills 76, 1219–1230.
Saykin, A.J., Gur, R.C., Gur, R.E., Mozley, P.D., Mozley, L.H., Resnick, S.M., Kester, D.B., Stafiniak, P., 1991. Neuropsychological function in schizophrenia. Selective impairment in memory and learning. Arch. Gen. Psychiatry 48 (7), 618–624.
Wechsler, D., 1997a. Wechsler Adult Intelligence Scale, 3rd ed. Psychological, San Antonio, TX. WAIS-III.
Wechsler, D., 1997b. Wechsler Memory Scale, 3rd ed. The Psychological, San Antonio, TX. WMS-III.
Wilk, C.M., Gold, J.M., Bartko, J.J., Dickerson, F., Fenton, W.S., Knable, M., Randolph, C., Buchanan, R.W., 2002. Test-retest stability of the repeatable battery for the assessment of neuropsychological status in schizophrenia. Am. J. Psychiatry 159 (5), 838–844.
Wilkinson, G.S., 1993. The Wide Range Achievement Test 3 Administration Manual. Wide Range, Wilmington, DE.