2009 DOF Manual

Manual for the ASEBA
Direct Observation Form

Stephanie H. McConaughy
& Thomas M. Achenbach
Library of Congress Control Number: xxxxxxx
ISBN 978-1-932975-12-3
Manual for the ASEBA
Direct Observation Form
Stephanie H. McConaughy, University of Vermont
& Thomas M. Achenbach, University of Vermont
Ordering Information
This Manual and other ASEBA materials can be ordered from:
ASEBA
1 South Prospect Street
Burlington, VT 05401-3456
Fax: 802-656-5131
E-mail: mail@ASEBA.org
Web: www.ASEBA.org
Proper bibliographic citation for this Manual:
McConaughy, S. H., & Achenbach, T. M. (2009). Manual for the ASEBA Direct Observation Form. Burlington,
VT: University of Vermont, Research Center for Children, Youth, & Families.
Related Books
Achenbach, T.M. (2009). The Achenbach System of Empirically Based Assessment (ASEBA):
Development, Findings, Theory, and Applications. Burlington, VT: University of Vermont,
Research Center for Children, Youth, & Families.
Achenbach, T.M., & McConaughy, S.H. (2009). School-Based Practitioners' Guide for the Achenbach
System of Empirically Based Assessment (ASEBA) (6th ed.). Burlington, VT: University of Vermont,
Research Center for Children, Youth, & Families.
Achenbach, T.M., Pecora, P.J., & Wetherbee, K.M. (2009). Child and Family Service Workers' Guide
for the Achenbach System of Empirically Based Assessment (ASEBA) (6th ed.). Burlington, VT:
University of Vermont, Research Center for Children, Youth, & Families.
Achenbach, T.M., & Rescorla, L.A. (2001). Manual for the ASEBA School-Age Forms & Profiles.
Burlington, VT: University of Vermont, Research Center for Children, Youth, & Families.
Achenbach, T.M., & Rescorla, L.A. (2009). Mental Health Practitioners' Guide for the Achenbach
System of Empirically Based Assessment (ASEBA) (6th ed.). Burlington, VT: University of Vermont,
Research Center for Children, Youth, & Families.
Achenbach, T.M., & Rescorla, L.A. (2007). Multicultural Guide for the ASEBA School-Age Forms &
Profiles. Burlington, VT: University of Vermont, Research Center for Children, Youth, & Families.
Achenbach, T.M., & Rescorla, L.A. (2007). Multicultural Understanding of Child and Adolescent
Psychopathology: Implications for Mental Health Assessment. New York: Guilford Press.
Achenbach, T.M., & Ruffle, T.M. (2007). Medical Practitioners' Guide for the Achenbach System of
Empirically Based Assessment (ASEBA) (5th ed.). Burlington, VT: University of Vermont,
Research Center for Children, Youth, & Families.
McConaughy, S.H. (2005). Clinical Interviews for Children and Adolescents: Assessment to
Intervention. New York: Guilford Press.
McConaughy, S.H., & Achenbach, T.M. (2001). Manual for the Semistructured Clinical Interview for
Children and Adolescents (2nd ed.). Burlington, VT: University of Vermont, Research Center for
Children, Youth, & Families.
McConaughy, S.H., & Achenbach, T.M. (2004). Manual for the Test Observation Form for Ages 2-18.
Burlington, VT: University of Vermont, Research Center for Children, Youth, & Families.
Copyright 2009 S.H. McConaughy & T.M. Achenbach. All rights reserved.
Unauthorized reproduction prohibited by law.
ISBN 978-1-932975-12-3 Library of Congress xxxxxxxxxxx
Printed in the United States of America 14 13 12 11 10 9 8 7 6 5 4 3 2 1
405
iii
User Qualifications
The Direct Observation Form (DOF) is designed
for rating observations of 6-11-year-old children in
school classrooms, at recess, and in other group set-
tings. Observers should have some knowledge of
child behavior and development and of the method-
ology of behavioral assessment. Observers may be
paraprofessionals, such as teachers' aides, under-
graduate or graduate students, and research assistants,
as well as professionals in education, school psychol-
ogy, clinical psychology, and related disciplines.
Paraprofessionals and students should use the DOF
under the supervision of a qualified professional who
has knowledge of the theory and methodology of
standardized assessment.
To make proper interpretations of the DOF, the
data should be scored on the DOF Profile. The
ASEBA ADM software provides instructions for
computer-scoring the DOF Profile. Interpretation of
the DOF Profile usually requires training in standard-
ized assessment commensurate with at least a
Master's degree in psychology, school psychology,
social work, special education, counseling, or a com-
parable field. Trainees, observers, and data process-
ing personnel may also use the computer software to
score the DOF Profile under the supervision of a
qualified professional. No amount of prior training,
however, can substitute for professional maturity and
a thorough knowledge of the procedures and cau-
tions presented in this Manual.
Our standards for use are consistent with the Stan-
dards for Educational and Psychological Testing
(1999) prepared and endorsed by the American Edu-
cational Research Association (AERA), American
Psychological Association (APA), and National
Council on Measurement in Education (NCME) and
with the Code of Fair Testing Practices in Educa-
tion (2004) prepared by the Joint Committee on Test-
ing Practices. Users are expected to adhere to the
ethical principles of their professional organizations,
such as the American Psychological Association and
National Association of School Psychologists.
The DOF is part of the Achenbach System of Em-
pirically Based Assessment (ASEBA). Users should
understand that ASEBA instruments are designed to
provide standardized descriptions of an individual's
functioning. The DOF should not be the sole basis
for making diagnoses or other important decisions
about children and adolescents. No scores on the
DOF scales should be automatically equated with a
particular diagnosis or disorder. Instead, the respon-
sible user will compare data obtained from the DOF
with data from other sources, such as parent reports,
teacher reports, child interviews, and observations
during test sessions.
Preface
The Direct Observation Form (DOF) is part of
the Achenbach System of Empirically Based As-
sessment (ASEBA). This Manual provides basic
information needed for understanding and using
the DOF. It also provides instructions for complet-
ing and scoring the DOF and guidelines for train-
ing DOF observers, plus information on develop-
ment of the DOF, research on reliability and valid-
ity, and practical applications with case illustra-
tions of how to integrate DOF results with other
assessment information. The DOF can be used to
rate and score multiple 10-minute observations of
children's behavioral and emotional problems in
school classrooms, at recess, and in other group
settings. The DOF includes 89 problem items to
be rated on a 4-point scale, plus on-task ratings for
each 10-minute observation session. The DOF Pro-
file comprises empirically based scales and DSM-
oriented scales normed separately for classroom
and recess observations for boys and girls ages 6
to 11.
In developing the DOF over more than 20 years,
we have benefited from the help and advice of
many colleagues. For their assistance with this
Manual, we are particularly grateful to Janet
Arnold, Rachel Bérubé, Sarah Cochran, Levent
Dumenci, Anne Ellis, Patricia Fletcher, Masha
Ivanova, David Jacobowitz, Ramani Sunderaju,
and Dan Walter. We are also grateful to the many
people who assisted in our data collection and data
management, including Lori Turner at the Univer-
sity of Vermont Research Center for Children,
Youth, and Families (RCCYF); Ricardo Eiraldi,
Thomas Power, and the staff of the Children's
Hospital of Philadelphia; Kevin Antshel, Michael
Gordon, and the staff of the Department of Psy-
chiatry at SUNY Upstate Medical University; and
Robert Volpe of Northeastern University, who
served as a Postdoctoral Fellow at the RCCYF. We
are also grateful to the many psychology and school
psychology graduate students who acted as observ-
ers, as well as the children, families, and school
staff who cooperated in our research. We have ap-
preciated the advice of our colleagues James
Hudziak, Cynthia LaRiviere, Leslie Rescorla,
James Tallmadge, and Robert Volpe regarding our
observational procedures. We are also grateful to
the University of Vermont Research Center for
Children, Youth, and Families, Spencer Founda-
tion, W. T. Grant Foundation, National Institute of
Child Health and Human Development, National
Institute on Disability and Rehabilitation Research
(U.S. Department of Education), and National In-
stitute of Mental Health for support of research that
has contributed to this effort.
Reader's Guide
I. Introductory Material Needed by Most Readers
    A. Introduction and Rationale for the Direct Observation Form (DOF): Chapter 1
    B. Using the DOF and Rating the DOF Items: Chapter 2
    C. Computer-Scored DOF Profile: Chapter 3
    D. Training DOF Observers and Conducting School Observations: Chapter 4
    E. Practical Applications and Case Examples: Chapter 5
II. Constructing the DOF and DOF Profile: Chapter 6
III. Statistical Data on Reliability and Validity
    A. Reliability of the DOF: Chapter 7
    B. Validity of the DOF: Chapter 8
IV. Answers to Frequently Asked Questions: Chapter 9
V. Mean DOF Scale Scores for Normative Samples of Boys & Girls Ages 6-11: Appendix A
VI. Mean DOF Scale Scores for Matched Referred Children and Nonreferred Controls,
    Boys 6-11 and Girls 6-11: Appendix B
VII. Pearson Correlations Among Raw Scores for DOF Scales: Appendix C
VIII. Items Comprising the 2009 DOF and the 1986 DOF: Appendix D
Contents
1. Introduction and Rationale for the Direct Observation Form (DOF)
    ADVANTAGES OF DIRECT OBSERVATIONS
    MULTIAXIAL ASSESSMENT
    STRUCTURE OF THIS MANUAL
    SUMMARY
2. Using the DOF and Rating the DOF Items
    COMPLETING PAGE 1 INFORMATION
    WRITING OBSERVATION NOTES
    RATING ON-TASK BEHAVIOR
    RATING DOF PROBLEM ITEMS
    GUIDELINES FOR RATING SPECIFIC DOF PROBLEM ITEMS
    SUMMARY
3. Computer-Scored DOF Profile
    DOF PROFILE FOR CLASSROOM OBSERVATIONS
    DOF PROFILE FOR RECESS OBSERVATIONS
    SUMMARY
4. Training DOF Observers and Conducting School Observations
    TRAINING DOF OBSERVERS
    GUIDELINES FOR OBSERVATIONS IN SCHOOLS
    ASSESSING INTER-OBSERVER AGREEMENT
    ASSESSING INTER-RATER RELIABILITY
    SUMMARY
5. Practical Applications and Case Examples
    SEQUENCE FOR USING THE DOF AND OTHER ASEBA FORMS
    SCHOOL-BASED ASSESSMENTS
    ASSESSMENT OF ADHD
    ASSESSMENT OF EMOTIONAL DISTURBANCE
    ASSESSMENT OF LEARNING DISABILITIES
    CASE EXAMPLE OF ASSESSMENT OF ADHD
    CASE EXAMPLE OF A SCHOOL-BASED ASSESSMENT OF BEHAVIOR PROBLEMS
    SUMMARY
6. Constructing the DOF and DOF Profile
    EARLIER VERSIONS OF THE DOF
    PSYCHOMETRIC APPROACH TO THE 2009 DOF
    STATISTICAL DERIVATION OF DOF SYNDROMES FOR CLASSROOM OBSERVATIONS
    LOW FREQUENCY ITEMS RETAINED ON THE DOF
    AGGRESSIVE BEHAVIOR SYNDROME FOR RECESS OBSERVATIONS
    DSM-ORIENTED ATTENTION DEFICIT/HYPERACTIVITY PROBLEMS AND INATTENTION
        AND HYPERACTIVITY-IMPULSIVITY SUBSCALES
    NORMATIVE SAMPLE
    ASSIGNING NORMALIZED T SCORES TO RAW SCORES
    MEAN T SCORES
    NORMAL, BORDERLINE, AND CLINICAL RANGES
    SUMMARY
7. Reliability of the DOF
    INTER-RATER RELIABILITY
    TEST-RETEST RELIABILITY
    INTERNAL CONSISTENCY
    SUMMARY
8. Validity of the DOF
    CONTENT VALIDITY OF DOF ITEMS
    CRITERION-RELATED VALIDITY
    SUMMARY
9. Answers to Frequently Asked Questions
    FEATURES OF THE DOF
    APPLICATIONS OF THE DOF
    RELATIONS TO OTHER ASSESSMENT PROCEDURES
    RELATIONS TO DSM AND SPECIAL EDUCATION CLASSIFICATIONS
References
APPENDIX A: Mean DOF Scale Scores for Normative Samples
APPENDIX B: Mean DOF Scale Scores for Matched Referred Children and Nonreferred Controls, Boys 6-11
APPENDIX B: Mean DOF Scale Scores for Matched Referred Children and Nonreferred Controls, Girls 6-11
APPENDIX C: Pearson Correlations among Raw Scores for DOF Scales for Classroom Observations
APPENDIX D: Items Comprising the 2009 DOF and the 1986 DOF
Index
Chapter 1
Introduction and Rationale for the
Direct Observation Form (DOF)
The Direct Observation Form (DOF) is a stan-
dardized form for rating observations of children's
behavior in classrooms, at recess, and in other
group settings. During a 10-minute period, the
observer writes a narrative description of the child's
behavior, affect, and interactions in the space provided
on the DOF. The observer also rates the child for
being on-task or off-task for 5 seconds at the end
of each 1-minute interval. At the end of the 10-
minute observation, the observer rates the child on
88 specific problem items using a 0-1-2-3 scale.
Item 89 is open-ended for rating other problems
not covered by items 1 through 88.
Because children's behavior may vary consid-
erably from one occasion to another, the DOF com-
puter-scoring program requires at least two obser-
vations of the target or identified child. When-
ever possible, we recommend 3 to 6 separate ob-
servations of behavior on at least two different days.
We also recommend obtaining separate observa-
tions in the morning and afternoon. Observers
should complete one DOF for each 10-minute ob-
servation. The DOF computer-scoring program will
then average ratings across observation sessions.
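The averaging step described above can be illustrated with a minimal Python sketch. It is not the ASEBA scoring program: the function name `average_item_ratings`, the dictionary layout, the treatment of unrecorded items as 0, and the use of a plain arithmetic mean are all assumptions made for illustration. Only the 0-1-2-3 scale, the 89 items, and the two-observation minimum come from the text.

```python
from statistics import mean

MIN_SESSIONS = 2            # the text requires at least two observations
VALID_RATINGS = {0, 1, 2, 3}
NUM_ITEMS = 89              # items 1-88 plus open-ended item 89


def average_item_ratings(sessions):
    """Average each item's rating across 10-minute observation sessions.

    `sessions` is a list of dicts, one per completed DOF, mapping an item
    number (1-89) to its 0-3 rating. Items absent from a dict are treated
    as rated 0 (an assumption for this sketch).
    """
    if len(sessions) < MIN_SESSIONS:
        raise ValueError("at least two observation sessions are required")
    for ratings in sessions:
        for item, rating in ratings.items():
            if not 1 <= item <= NUM_ITEMS or rating not in VALID_RATINGS:
                raise ValueError(f"invalid rating {rating!r} for item {item!r}")
    # One mean per item, taken over all sessions.
    return {
        item: mean(s.get(item, 0) for s in sessions)
        for item in range(1, NUM_ITEMS + 1)
    }
```

For example, three sessions in which a hypothetical item 1 was rated 2, 1, and 0 would yield a mean of 1 for that item, and a single session would be rejected.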
Because the significance of a child's behavior
depends partly on how it may deviate from the be-
havior of other children, we recommend observ-
ing one or two control children in the same set-
ting as the identified child. The control children
should be the same age and gender as the identi-
fied child, but should be located as far as possible
from the identified child in the group setting. Ob-
servers do not need to know the names of the con-
trol children. Chapter 2 provides more detailed in-
structions for observing identified and control chil-
dren.
The DOF Profile provides raw scores, T scores
and percentiles for five syndrome scales derived
from factor analyses of classroom observations,
plus a DSM-oriented Attention Deficit/Hyperac-
tivity Problems scale with Inattention and Hyper-
activity-Impulsivity subscales, and a Total Prob-
lems score. The DSM-oriented scale and subscales
include DOF problem items consistent with
symptom criteria for Attention Deficit/Hyperac-
tivity Disorder (ADHD), as defined in the Diag-
nostic and Statistical Manual of Mental Disorders-
Fourth Edition and Fourth Edition-Text Revision
(DSM-IV; DSM-IV-TR; American Psychiatric As-
sociation, 1994, 2000). The DOF also has an Ag-
gressive Behavior syndrome scale for scoring ob-
servations during recess and in other non-classroom
settings. The DOF Profile has separate norms for
boys and girls ages 6 to 11. Because of the com-
plexity of averaging scores across multiple obser-
vation sessions, the DOF scales can only be scored
by computer. The DOF computer-scoring program
also provides raw scores for each of the 89 prob-
lem items and a narrative report that summarizes a
child's scores on each of the DOF scales.
ADVANTAGES OF DIRECT
OBSERVATIONS
Direct observation of children's behavior is a
classic assessment method used by clinical and
school psychologists (Sattler & Hoge, 2006;
Shapiro & Heick, 2004; Wilson & Reschly, 1996).
Numerous coding systems have been developed
for scoring direct observations of children's behav-
ior in classrooms (Volpe, DiPerna, Hintze, &
Shapiro, 2005) and playground settings (Leff &
Lakin, 2005). Systematic direct observations share
the following characteristics: (a) their goal is to
measure specific target behaviors; (b) the target be-
haviors are defined in a manner that makes them
readily observable with a minimum of inference;
(c) the observations are conducted according to
standardized procedures; (d) the times and places
for observations are specified; and (e) the obser-
vations are quantified and/or summarized in a standardized
manner that does not vary from one observer to
another (Volpe et al., 2005).
Many systems for coding observations focus on
a limited set of target behaviors (e.g., academic
engaged time, out-of-seat, physical aggression,
verbal aggression) and rely on continuous record-
ing or time sampling methods. Continuous record-
ing methods count the number of times a behavior
(or event) occurs within a given period or record
the duration of time in which the behavior (or
event) was observed. Continuous recording is most
effective when behaviors have discrete beginnings
and ends, low to moderate rates of occurrence, and
are present only briefly. Time sampling records the
presence or absence of target behaviors within short
specified time intervals. Time sampling is useful
when multiple simultaneous target behaviors
hinder continuous recording or when samples of
behavior are observed across different settings.
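The difference between the two recording methods can be made concrete with a short Python sketch. The function names, the representation of each occurrence as a (start, end) pair in seconds, and the choice of a partial-interval rule (an interval is marked if the behavior occurs at any point within it) are assumptions for illustration; published coding systems differ in these details.

```python
def continuous_recording(events):
    """Summarize events by frequency and total duration.

    `events` is a list of (start, end) times in seconds, all assumed to
    fall inside the observation period.
    """
    count = len(events)
    total_duration = sum(end - start for start, end in events)
    return count, total_duration


def time_sampling(events, period_seconds, interval_seconds):
    """Mark presence (True) or absence (False) of the behavior per interval.

    Uses a partial-interval rule: an interval is marked True if any event
    overlaps it at all.
    """
    marks = []
    for start in range(0, period_seconds, interval_seconds):
        end = start + interval_seconds
        marks.append(any(s < end and e > start for s, e in events))
    return marks


# Hypothetical out-of-seat episodes during a 2-minute sample.
episodes = [(5, 12), (40, 45), (100, 130)]
print(continuous_recording(episodes))      # (3, 42)
print(time_sampling(episodes, 120, 30))    # [True, True, False, True]
```

Continuous recording keeps the count and duration of every episode, whereas time sampling reduces the same episodes to one presence/absence mark per interval.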
The DOF, by contrast, is designed for rating di-
rect observations of multiple specific behaviors
over a specific interval (10 minutes). The observer
writes a narrative running log of observations over
the 10-minute period, while also rating the child
as being on-task or off-task during the last 5 sec-
onds of each 1-minute interval. At the end of the
10-minute period, the observer rates the child on
each of 89 DOF problem items. The DOF has the
following advantages: (a) it provides a structured
and efficient method for rating observations of a
broad range of specific types of problems; (b) in-
dividual problem items are grouped into empiri-
cally based syndrome scales, a DSM-oriented
ADHP scale and subscales, and Total Problems;
(c) norms provide a standard for judging the se-
verity of problems by comparing an individual's
DOF scores to large samples of nonreferred chil-
dren of the same gender and age range; and (d)
scores from DOFs for large samples can be tested
for reliability and validity as done for other stan-
dardized rating scales.
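The on-task sampling schedule described in this section can be sketched in Python as follows. The function names are hypothetical, and summarizing a session as a simple count of on-task intervals is an assumption for illustration; this chapter does not specify how the scoring program tallies on-task ratings.

```python
SESSION_MINUTES = 10   # one DOF covers a 10-minute observation
WINDOW_SECONDS = 5     # on-task is rated during the last 5 seconds of each minute


def on_task_windows():
    """Return the (start, end) second marks of each on-task rating window."""
    return [
        (minute * 60 - WINDOW_SECONDS, minute * 60)
        for minute in range(1, SESSION_MINUTES + 1)
    ]


def on_task_count(ratings):
    """Count the intervals rated on-task in one session.

    `ratings` holds one boolean per 1-minute interval (True = on-task).
    """
    if len(ratings) != SESSION_MINUTES:
        raise ValueError("expected one on-task rating per 1-minute interval")
    return sum(1 for on_task in ratings if on_task)
```

Under these assumptions the first rating window runs from second 55 to second 60 and the last from second 595 to second 600, so a session always produces ten on-task ratings.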
MULTIAXIAL ASSESSMENT
No one assessment method should serve as the
sole basis for evaluating children's functioning or
for making important decisions about children. In-
stead, responsible evaluators will compare data ob-
tained from one source or method with data ob-
tained from other sources. We use the term mul-
tiaxial assessment to describe the process of gath-
ering and integrating information across multiple
data sources.
To facilitate multiaxial assessment, we designed
the DOF as a component of the Achenbach Sys-
tem of Empirically Based Assessment (ASEBA).
The ASEBA comprises an integrated set of rating
forms for assessing competencies, adaptive func-
tioning, and problems in easy and cost-effective
ways. The ASEBA forms most relevant for use with
the DOF are the Child Behavior Checklist for Ages
6 to 18 (CBCL/6-18; Achenbach & Rescorla,
2001), Teacher's Report Form (TRF; Achenbach
& Rescorla, 2001), Youth Self-Report (YSR;
Achenbach & Rescorla, 2001), Test Observation
Form (TOF; McConaughy & Achenbach, 2004),
and the Semistructured Clinical Interview for Chil-
dren and Adolescents (SCICA; McConaughy &
Achenbach, 2001). The ASEBA also includes
forms for children ages 1½ to 5, adults ages 18 to 59,
and older adults ages 60 to 90. ASEBA data for
ages 6 to 11 can be integrated with standardized
test data, medical data, developmental history, and
other information obtained from records and in-
terviews, as outlined in Table 1-1. The multiaxial
assessment model includes the following five axes:
Axis I. Parent Data. Standardized ratings of
children's competencies and problems by par-
ents, using the CBCL/6-18, plus history of the
child's development, problems, competencies, and
interests as reported by parents in interviews and
questionnaires.
Axis II. Teacher Data. Standardized ratings of
the child's school performance and problems by
teachers, using the TRF, plus history of the child's
school performance as reported by teachers on re-
port cards, comments in school records, and inter-
views.
Axis III. Cognitive Assessment. Ability tests,
such as the Cognitive Assessment System (CAS;
Naglieri & Das, 1997), Stanford-Binet Intelligence
Scales-Fifth Edition (SB5; Roid, 2003), Wechsler
Intelligence Scale for Children-Fourth Edition
(WISC-IV; Wechsler, 2003), Woodcock-Johnson III
Tests of Cognitive Abilities (WJ III COG; Wood-
cock, McGrew, & Mather, 2001), and Kaufman As-
sessment Battery for Children (KABC; Kaufman
& Kaufman, 1983); achievement tests; tests of
perceptual-motor skills; and speech and language
tests. The TOF can also be used by test examiners
to obtain standardized ratings of the child's test
session behavior.
Axis IV. Physical Assessment. Height and
weight, physical and/or neurological abnormali-
ties and disabilities, medical and medication his-
tory.
Axis V. Direct Assessment of the Child. Direct
observations in group settings, using the DOF; clini-
cal interviews, using the SCICA; standardized self-
ratings by 11-year-olds, using the YSR; self-con-
cept measures, personality tests, and other mea-
sures for assessing behavioral and emotional func-
tioning.
The model in Table 1-1 provides guidelines for
multiaxial assessment of 6- to 11-year-old children.
However, not all sources of data may be relevant
or available for every child. For example, self-rat-
ings may not be useful for children younger than
age 11 and children who cannot reflect on their
own behavior. Parents' reports are highly relevant,
but may not be available from both parents if the
child lives with only one parent or a parent surro-
gate. Teachers' reports are usually relevant for
school children if one or more teachers are avail-
able to provide them. Standardized ratings of be-
havioral and emotional characteristics observed
during testing can add important information about
a child's reactions to structured assessment and can
help examiners judge the validity of test scores.
Table 1-1
Examples of Multiaxial Assessment Procedures for Ages 6 to 11

Axis I            Axis II              Axis III                Axis IV              Axis V
Parent Report     Teacher Report       Cognitive Assessment    Physical Assessment  Direct Assessment of Child
CBCL/6-18 (a)     TRF (b)              TOF (c)                 Height, weight       DOF (d)
History           School records       Ability tests           Medical exam         SCICA (e)
Parent interview  Caregiver interview  Achievement tests       Neurological exam    YSR (f) (for age 11)
                                       Perceptual-motor tests                       Self-concept measures
                                       Language tests                               Personality tests

(a) CBCL/6-18 = Child Behavior Checklist/6-18 (Achenbach & Rescorla, 2001).
(b) TRF = Teacher's Report Form (Achenbach & Rescorla, 2001).
(c) TOF = Test Observation Form (McConaughy & Achenbach, 2004).
(d) DOF = Direct Observation Form (McConaughy & Achenbach, 2009).
(e) SCICA = Semistructured Clinical Interview for Children and Adolescents (McConaughy & Achenbach, 2001).
(f) YSR = Youth Self-Report (Achenbach & Rescorla, 2001).

Direct observations in classrooms or other group
settings can be compared with parent and teacher
reports and with test session observations. All rel-
evant information from the five axes should be in-
tegrated into cohesive formulations of children's
cognitive and behavioral/emotional functioning in
order to meet their needs.
STRUCTURE OF THIS MANUAL
This Manual provides information for using and
scoring the DOF, plus details of its development,
standardization, and psychometric properties. User
qualifications are presented on Page iii. In this
chapter, we discussed our rationale for developing
the DOF within the context of a multiaxial assess-
ment model. Chapter 2 discusses how to use the
DOF, including how to record observations and rate
the DOF problem items. Chapter 3 describes the
computer-scored DOF Profile and its narrative re-
port. Chapter 4 provides guidelines for conduct-
ing school observations and training DOF observ-
ers. Chapter 5 discusses practical applications of
the DOF for use in schools and mental health as-
sessments. Case examples illustrate how the DOF
can be used to assess children's problems and how
to integrate DOF results with data from other
sources.
The remaining chapters present technical de-
tails of our research on the DOF and the DOF Pro-
file. Chapter 6 provides background on earlier ver-
sions of the DOF, development of the DOF item
set, statistical analyses to derive the five DOF syn-
drome scales for classroom observations and the
Aggressive Behavior syndrome scale for recess
observations, development of the DOF DSM-ori-
ented Attention Deficit/Hyperactivity Problems
scale and its Inattention and Hyperactivity-Impul-
sivity subscales, assignment of T scores to raw
scores, and borderline and clinical cutpoints for
the DOF problem scales and On-task. Chapter 7
presents data on reliability, while Chapter 8 pre-
sents data on validity of the DOF. Chapter 9 an-
swers frequently asked questions about the DOF
and the general approach we have used to develop
the DOF and its scoring profile.
Appendix A presents mean T scores and raw
scores, standard deviations, and standard errors for
DOF scale scores for the normative sample. Ap-
pendix B presents mean T scores, raw scores, and
standard deviations for matched samples of referred
children and nonreferred controls. Appendix C dis-
plays correlations among the DOF scale scores. Ap-
pendix D shows the 89 items on the 2009 version
of the DOF compared to the 97 items of the 1986
DOF.
SUMMARY
We designed the DOF as a standardized form
for rating direct observations of children's behavior
in classrooms and other group settings. The
DOF Profile for classroom observations displays
five empirically based syndrome scales, a DSM-
oriented Attention Deficit/Hyperactivity Problems
scale and Inattention and Hyperactivity-Impulsiv-
ity subscales, plus Total Problems scores. The DOF
Profile for recess observations has an empirically
based Aggressive Behavior syndrome scale and
Total Problems score. The DOF scales are scored
on norms for boys and girls ages 6 to 11. As part of
the ASEBA, the DOF provides data that can be
easily compared to data obtained from parents,
teachers, youths' self-ratings, test session observations,
and observations during child clinical interviews.
Chapter 2
Using the DOF and Rating the DOF Items
The 2009 edition of the DOF is a revision of
the 1986 version, as explained in detail in Chapter
6. As shown in Figure 2-1, the first page of the
DOF includes spaces to write demographic infor-
mation about the identified child and control chil-
dren, the date and time of observations, and infor-
mation about the observer and setting. Page 1 also
provides brief instructions for writing notes, rat-
ing On-task, and rating the DOF problem items
that are discussed in detail in this chapter. Page 2
provides space for writing observation notes and
rating the child's on-task and off-task behavior at
the end of each 1-minute interval. Page 3 lists the
89 DOF problem items to be rated at the end of
each 10-minute observation. Page 4 provides
instructions for completing information at the top
of Page 1.
Observers should complete one DOF for each
10-minute observation. As indicated in Chapter 1,
the DOF computer-scoring program requires at
least two observations of the identified child.
Whenever possible, we recommend obtaining 3 to
6 separate 10-minute observations of the identi-
fied child on at least two different days. To obtain
a stable index of behavior, observations should all
be conducted within a one- to two-week time
frame. We also recommend obtaining separate ob-
servations in the morning and afternoon across dif-
ferent days to provide a broad sampling of the
child's behavior. Some observers may choose to
obtain several sets of observations over longer time
frames for purposes such as progress monitoring,
assessing the stability of observed behaviors, and
evaluating outcomes of interventions.
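The scheduling recommendations above can be checked mechanically. The following is a minimal sketch, not part of the DOF software: the function name `check_schedule`, the 14-day reading of "one- to two-week time frame," and the use of noon to separate morning from afternoon are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

def check_schedule(starts):
    """Return a list of notes for a list of observation start datetimes.

    Hypothetical helper: thresholds follow the recommendations in the text,
    with "two weeks" read as 14 days and noon as the morning/afternoon split.
    """
    notes = []
    if len(starts) < 2:
        notes.append("fewer than 2 observations: cannot be computer-scored")
    elif len(starts) < 3:
        notes.append("fewer than the recommended 3 observations")
    if len(starts) > 6:
        notes.append("more than 6 observations in one set")
    days = {s.date() for s in starts}
    if len(days) < 2:
        notes.append("observations not spread over at least two days")
    if starts and max(days) - min(days) > timedelta(days=14):
        notes.append("observations span more than two weeks")
    if len({s.hour < 12 for s in starts}) < 2:
        notes.append("no mix of morning and afternoon observations")
    return notes

# Three observations on two days, morning and afternoon: no notes.
print(check_schedule([
    datetime(2009, 10, 5, 9, 20),
    datetime(2009, 10, 5, 13, 0),
    datetime(2009, 10, 7, 10, 0),
]))  # prints []
```

An empty list means the plan meets every recommendation; observers who deliberately use longer time frames (e.g., for progress monitoring) can ignore the two-week note.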
Because the significance of a child's problem
behavior depends partly on its deviance from the
behavior of other children in similar contexts, we
recommend that separate DOFs also be completed
for one or two control children in the same set-
ting as the identified child. Observers should ran-
domly select control children who are of the same
gender and age as the identified child, but who are
located far enough away so as not to be interacting
with the identified child. Observers should com-
plete one DOF for each 10-minute observation of
each control child, as done for the identified child.
Ideally, observers should complete one DOF for a
control child observed just before the identified
child and one DOF for a second control child ob-
served just after the identified child. We recom-
mend obtaining at least two separate 10-minute ob-
servations for each control child. Page 1 of the DOF
provides boxes for indicating whether each 10-
minute observation was done for the Identified
Child, Control Child 1, or Control Child 2.
Chapter 4 discusses procedures for selecting con-
trol children.
The DOF computer-scoring program, described
in Chapter 3, automatically averages the observer's
ratings for a minimum of two and a maximum of
six separate DOFs for the identified child for
each set of observations. It also separately aver-
ages ratings for 2 to 6 DOFs for each of the two
control children. The computer-scored DOF Pro-
file displays mean raw scores and corresponding T
scores and percentiles for each DOF scale for the
identified child, along with mean raw scores, T
scores, and percentiles for ratings averaged across
two control children. Chapter 3 provides details of
the computer-scored DOF Profile.
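The averaging step described above can be illustrated with a short sketch. It is only an illustration under stated assumptions, not the scoring program's actual code: the function name `mean_item_ratings` and the dictionary layout are hypothetical, and items left unrated on a form are assumed to count as 0. Converting mean raw scores to T scores and percentiles requires the published norms and is not attempted here.

```python
from statistics import mean

def mean_item_ratings(dofs):
    """Average 0-3 item ratings across the DOFs completed for one child.

    dofs: list of dicts mapping item number -> rating (hypothetical layout).
    Items missing from a DOF are treated as rated 0 (an assumption).
    """
    if not 2 <= len(dofs) <= 6:
        raise ValueError("expected 2 to 6 DOFs for one child")
    items = sorted(set().union(*dofs))
    return {item: mean(d.get(item, 0) for d in dofs) for item in items}

# Two DOFs for the identified child: item 7 rated 2 then 3, item 13 rated 1 once.
print(mean_item_ratings([{7: 2, 13: 1}, {7: 3}]))  # prints {7: 2.5, 13: 0.5}
```

The same function would be called separately for the identified child and for each control child, since the text describes their ratings as being averaged separately.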
COMPLETING PAGE 1
INFORMATION
On Page 1 of the DOF (see Figure 2-1), observers record
demographic information about the identified child and the
particular child being observed (identified or control), the
observer, and setting. Instructions for completing each field
are provided on Page 4 of the DOF.
Figure 2-1. Page 1 of the Direct Observation Form.
Figure 2-1 (cont.) Page 2 of the Direct Observation Form.
The instructions for each field at the top of Page 1 are shown
here in smaller font and discussed in more detail.
ID#
This space is for an anonymous user-created ID number for the
identified child. The ID number is usually assigned by an
administrator or other appropriate staff member. The same ID
number should be used for control children matched to the
identified child.
The space at the top of the DOF is for a user-defined ID number
that is unique for each identified child. The same ID number
should be assigned to each control child who is linked to the
identified child (Control 1, Control 2). The ID number may be
created by an administrator or other appropriate staff member
who is coordinating the observations. In some cases, the
observer may also assign the ID number if the observer is acting
as an independent user (e.g., a school psychologist using the
DOF to assess a child). For computer-scoring, the ID number will
serve as key information for linking an identified child to
control children.
Identified Child's Name
Write the first, middle (if available), and last name of the
identified child (e.g., John Eric Smith). On the DOFs for
control children matched to the identified child, write a brief
description of the control child (e.g., boy with dark curly
hair) and/or write an abbreviation of the identified child's
name to create a link to the control child (e.g., if the
identified child is John Eric Smith, Control 1 might be labeled
JES-C1).
Whenever possible, write the full name of the identified child.
Avoid using initials and writing only the first or last name of
the identified child because more than one child may have the
same name. However, as discussed in Chapter 4, you may decide
not to record the identified child's full name on the DOF until
after you have left the observation setting so neither the
identified child nor peers will see the name of the child being
observed. On the DOF for control children, you can write a brief
description of the child in the space for the identified child's
name (e.g., boy with dark curly hair; girl with blond hair in
front row) to help you identify multiple DOFs for the same
control child. Or you can use an abbreviation to link the
control child to the identified child. The descriptive
information for control children can help to answer questions
that may arise when you are trying to identify multiple DOFs for
a particular control child linked to the identified child.
Child's Gender
Check Boy or Girl for the gender of the child being observed.
Ideally, the gender of the control child should match the gender
of the identified child.
Child's Age
On DOFs for the identified child, write age in years. On DOFs
for control children, write age of the control child if known,
or write age of the identified child as an estimate of the
control child's age, or leave blank.
Administrators, coordinators, or observers should write the age
in years of the identified child. Observers do not need to know
the names and ages of control children. On DOFs for control
children, you can write the age of the identified child as an
estimate of the control child's age or leave this space blank.
Child's Ethnic Group or Race
Write the known or apparent ethnic group or race of the child
being observed (e.g., White, African American, Asian).
In this space, write the known or apparent ethnic group or race
of the child being observed (Identified Child, Control 1,
Control 2). You can use your own terminology for ethnic group or
race or choose from a list of terms. The DOF computer-scoring
program provides the following list of
terms for data entry: African American, Asian, Latino/Latina,
Native American, Pacific Islander, White (non-Latino), Other.
You can also create your own terms for this field for data
entry.
Observer's Name
Write the observer's first and last name or initials.
Observation #
Write a separate unique number for each 10-minute observation
for the identified child (e.g., 1, 2, 3, 4, 5, 6) and each
10-minute observation for each control child.
Write a separate unique number for each separate 10-minute
observation in sequence for each individual child. For example,
if you observe the identified child six times, record
observation numbers 1, 2, 3, 4, 5, and 6 for each of the six
DOFs in sequence for the identified child. The six observations
may span the course of several days. The observation number,
Today's Date, and Time of Day should be consistent with the
sequence of observations. If you observe one control child
(Control Child 1) twice, record observation numbers 1 and 2 for
each DOF in sequence for that control child. If you observe a
second control child (Control Child 2) twice, record observation
numbers 1 and 2 for each DOF in sequence for that child.
Grade or Level
Write the grade (e.g., Kindergarten, 1st, 4th) or level in
school (e.g., 1-2) of the child being observed. Ideally, the
grade or level of the control child should match the grade or
level of the identified child.
Identified Child's Birthdate
Write the identified child's birthdate.
The DOF and the DOF computer-scoring program use month-day-year
format for birthdate. On DOFs for a control child, write the
birthdate of the identified child. In addition to the ID number
and identified child's name, the birthdate will provide another
way to link control children to the appropriate identified
child.
Observation Set
Assign a label to identify the set or group of DOFs for the
identified child and control children to be computer-scored on
the same DOF Profile. This might be a time frame for the set of
observations (e.g., Fall 2009) or a specific setting for the
observations (e.g., math class, library). The computer-scoring
program allows a minimum of 2 and maximum of 18 DOFs as an
observation set to be scored on one DOF Profile: 2 to 6 DOFs for
the Identified Child, 1 to 6 for Control 1, and 1 to 6 for
Control 2. DOFs for control children are optional.
Observation set is a required field for computer-scoring. When
you enter each DOF into the computer-scoring program, you must
assign a label to identify it as a member of a set of DOFs that
will be selected as one group to be scored on the same DOF
Profile. As explained in Chapter 3, the computer-scoring program
averages ratings on DOF items across multiple DOFs separately
for the identified child and matched control children. You can
use any label that is meaningful to you to identify which DOFs
will form a set for the averaging process in computer-scoring.
Examples are a label for a time frame for the set of
observations or a specific setting for the observations. You can
use the same observation set label for a minimum of two and
maximum of 18 DOFs for computer-scoring. DOFs for control
children are optional. There must be at least two DOFs for
control children when observations for control children are
included in an observation set, as explained in Chapter 3.
Observed Child
Check one box to indicate whether the observed child for each
DOF is the Identified Child, Control Child 1, or Control Child
2.
This is a required field for computer-scoring. Check the box,
Identified Child, to indicate that the observed child was the
selected identified child whose name is recorded on the DOF
form. Check the box, Control Child 1, for the first control
child in the same setting who is to be matched to the identified
child. Check the box, Control Child 2, for the second control
child in the same setting who is to be matched to the identified
child. Whenever possible, the gender of the control children
should be the same as the gender of the identified child.
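The limits on an observation set can be expressed as a small check. This is a sketch under stated assumptions rather than the scoring program's own logic: the function name `check_observation_set` and the message wording are hypothetical, and the rule that control DOFs, when included, must total at least two is taken from the Observation Set discussion in this chapter.

```python
from collections import Counter

# Per-child limits as stated for one observation set (2 to 18 DOFs in total).
LIMITS = {
    "Identified Child": (2, 6),
    "Control Child 1": (1, 6),
    "Control Child 2": (1, 6),
}

def check_observation_set(observed_children):
    """observed_children: the 'Observed Child' box checked on each DOF in the set."""
    counts = Counter(observed_children)
    problems = [f"unknown Observed Child value: {v}"
                for v in sorted(set(counts) - set(LIMITS))]
    for who, (low, high) in LIMITS.items():
        n = counts[who]
        if who != "Identified Child" and n == 0:
            continue  # DOFs for control children are optional
        if not low <= n <= high:
            problems.append(f"{who}: {n} DOFs, expected {low} to {high}")
    controls = counts["Control Child 1"] + counts["Control Child 2"]
    if 0 < controls < 2:
        problems.append("control DOFs included, but fewer than 2 in total")
    return problems

dofs = ["Identified Child"] * 4 + ["Control Child 1"] * 2 + ["Control Child 2"] * 2
print(check_observation_set(dofs))  # prints []
```

Because each child is capped at 6 DOFs, the overall maximum of 18 follows from the per-child limits and does not need a separate test.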
Time of Day
Write the time of the beginning of the 10-minute
observation in hours and minutes and a.m. or
p.m. (e.g., 9:20 a.m., 12:30 p.m.)
Today's Date
Write the date of the observation.
The DOF and the DOF computer-scoring pro-
gram use month-day-year format for the date of
the observation.
Setting
Check one box to indicate whether the observa-
tion was conducted in the classroom or at re-
cess. If you conduct an observation in a setting
other than class or recess, choose the setting
option that most closely approximates the activ-
ity of children in that particular setting (e.g., lunch
= recess; small group instruction = class). You
can use the space to write the type of activity for
classroom observations (e.g., math, reading,
circle group) or recess observations (e.g., inside
games, outdoor play).
This is a required field for computer-scoring.
Choose only one setting (Class or Recess) for each
DOF. The computer-scoring program uses these
two fields to determine whether the ratings from
that DOF will be scored on a DOF Profile based
on norms for classroom observations or norms for
recess observations. There is no option for Other
setting because there are no normative data for scor-
ing observations in settings other than class or re-
cess. You also have the option of recording the type
of activity for each DOF for classroom observa-
tions (e.g., math, reading, circle group) or recess
observations (e.g., inside games, outdoor play).
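For users who keep their own records before data entry, the fields described above as required for computer-scoring (Observation Set, Observed Child, Setting) can be checked with a simple record type. The sketch below is illustrative only: the class name `DofHeader`, its field names, and the hyphen-separated four-digit-year date string are assumptions, since the text specifies only that dates use month-day-year order.

```python
from dataclasses import dataclass
from datetime import datetime

OBSERVED = ("Identified Child", "Control Child 1", "Control Child 2")
SETTINGS = ("Class", "Recess")  # no "Other" option: norms exist only for these two

@dataclass
class DofHeader:
    """Hypothetical record of selected Page 1 fields for one DOF."""
    id_number: str
    observation_set: str
    observed_child: str
    setting: str
    date: str  # assumed layout: month-day-year, e.g. "10-05-2009"

    def problems(self):
        out = []
        if not self.observation_set.strip():
            out.append("Observation Set is required")
        if self.observed_child not in OBSERVED:
            out.append("Observed Child must be one of the three boxes")
        if self.setting not in SETTINGS:
            out.append("Setting must be Class or Recess")
        try:
            datetime.strptime(self.date, "%m-%d-%Y")
        except ValueError:
            out.append("date is not in month-day-year format")
        return out

header = DofHeader("1001", "Fall 2009", "Identified Child", "Class", "10-05-2009")
print(header.problems())  # prints []
```

An observation made at lunch would be entered with the setting "Recess" and a small-group lesson with "Class," following the guidance to pick the closest of the two normed settings.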
WRITING OBSERVATION NOTES
Use the spaces provided on Page 2 (see Figure
2-1) to write a narrative description of the child's
behavior, affect, and interaction style over the 10-
minute observation period. You do not have to write
complete sentences. Instead, record brief notes and
abbreviations that will help you rate the 89 DOF
problem items listed on Page 3. The numbered
boxes in the left-hand column on Page 2 demar-
cate 1-minute intervals for rating on-task, as ex-
plained in the next section.
By scanning the list of DOF problem items be-
fore each observation session, you can familiarize
yourself with the types of behaviors to describe.
When appropriate, note the frequency (e.g., by chit
marks), duration (e.g., 20 sec), or intensity of spe-
cific problems to help you choose between ratings
of 1, 2, or 3 for each problem item. Sometimes,
you may want to describe events during the 10-minute
observation that affect the child's behavior, such as
the teacher's behavior or behavior of
peers. For example, you may observe that a child
daydreams or is restless during independent seat
work in class, but does not show these problems
when the teacher works with him/her directly. Or
you may observe that a child is teased or hit by
another child, and subsequently teases back or be-
comes involved in a fight. You may consider these
interactions when rating the child's behavior on
relevant DOF problem items. However, you should
avoid making inferences about the child's motivations
when rating specific DOF items, as instructed
in a later section. Remember that DOF items are
to be rated only for behavior observed in the 10-
minute window for the observation period. The 10-
minute observation window also applies to any
events that might affect the child's behavior.
RATING ON-TASK BEHAVIOR
The left-hand side of Page 2 of the DOF (see
Figure 2-1) contains 10 boxes in 2 columns for
rating whether the child is on-task (ON TASK) or
not on-task (OFF TASK). These boxes represent
5-second intervals at the end of each minute of ob-
servation. In the last 5 seconds of each 1-minute
interval, observe the child's on-task behavior. If
the child's behavior is on-task during the 5-second
interval, draw a line through the box for ON
TASK. If the child is not on-task, draw a line
through the box for OFF TASK.
Consider a child to be on-task if he/she is doing what is
expected in that situation (e.g., listening to directions,
reading a book, working on an assigned task at his/her desk,
listening to others in circle time, etc.). The child should be
on-task for the majority of the 5-second interval. You can use a
stopwatch to indicate each 1-minute interval if you wish, but
this is not required. Another option is to watch the second hand
on a clock or your watch and start each 1-minute on-task
observation at a specified time (e.g., when the second hand is
on the 11).
If the child is not on-task for the majority of the 5-second
interval, rate the child as off-task. The following are examples
of when a child is off-task:
• The child does something that requires the teacher to redirect
him/her to get back on-task.
• The child is doodling or drawing or playing with a toy or
other object when he/she is supposed to be listening to the
teacher or working on an assignment.
• The child is looking around the room or is not looking at the
teacher or someone else who is speaking to him/her or to the
whole class.
• The child is poking another student, talking to another
student, or clowning when he/she is supposed to be listening or
working quietly.
At the end of the 10-minute observation period, count the number
of intervals you rated the child as off-task and write the sum
in the box for SUM OFF TASK. Count the number of intervals you
rated the child as on-task and write the sum in the box for SUM
ON TASK. The total number of intervals rated for SUM OFF TASK +
SUM ON TASK should not exceed 10. The computer-scoring program
averages on-task ratings across multiple DOFs separately for the
identified child and for controls. The total number of intervals
for on-task and off-task on a single DOF must be > 8 for
computer-scoring. On-task ratings are only scored for classroom
observations, not recess observations.
Figure 2-2 illustrates an observer's notes and on-task ratings
for the first 10-minute observation of 8-year-old Melinda Brandt
(not her real name), whose computer-scored DOF Profile is
presented in Chapter 3. Melinda is also discussed as a case
example in Chapter 5. The complete set of observations included
four 10-minute observations of Melinda and two 10-minute
observations of each of two control children in the same class.
Figure 2-2. Observer's notes and on-task ratings for the first
10-minute observation of Melinda Brandt.
RATING DOF PROBLEM ITEMS
Immediately after completing each 10-minute observation, rate
the child on the 89 DOF problem items listed on Page 3. Be sure
to complete your ratings of DOF problem items before you start
another 10-minute observation. Problem behaviors do not have to
attract the attention of the school staff in order to be rated
as present. Equally important, your ratings of problem items
should not depend on your ratings of whether the child was
on-task or off-task. For example, a child may be considered
on-task while working on an assignment, but still be restless,
or fidget, or look unhappy. Some problems, such as 7. Doesn't
concentrate or pay attention for long, can suggest the child is
off-task. However, it is possible that a child could have
problems concentrating during parts of the observation period,
but then be on-task during the last 5 seconds of a 1-minute
interval.
To rate the DOF problem items, choose the one item that
specifically reflects each behavior actually observed during the
10-minute observation period. Review your notes on Page 2 to
help remember your observations. As you read the DOF problem
items, you may also remember some behaviors that may not have
been described in your notes. You can rate such items even if
you did not write the specific behavior in your notes. (As you
become more familiar with the DOF problem items, your
observation notes should become more closely aligned to your
item ratings.) You may also consider interactions with teachers
and peers during the 10-minute observation period to rate
specific problem items (e.g., 17. Tries to get attention of
staff or 31. Gets teased).
Rate the child on each DOF problem item according to the
following instructions written at the top of Page 3:
For each item that describes the child during the 10-minute
observation period, circle:
0 = no occurrence;
1 = very slight or ambiguous occurrence;
2 = definite occurrence with mild to moderate
intensity/frequency and less than 3 minutes total duration;
3 = definite occurrence with severe intensity, high frequency,
or 3 or more minutes total duration.
The intensity of the observed problem and the 3-minute duration
are guidelines for choosing ratings of 1, 2, or 3. If it is
unclear whether a particular problem occurred or if there was
only a slight occurrence, rate the relevant item 1. If a
particular problem definitely occurred with mild to moderate
intensity or frequency and less than 3 minutes total duration
over the course of the 10-minute observation period, rate the
relevant item 2. Rate an item 3 if a particular problem occurred
with severe intensity, or occurred for 3 or more minutes over
the 10-minute observation period, or occurred intermittently for
a total of 3 or more minutes throughout the 10-minute
observation period. It is not necessary to actually time your
observations of each problem. However, it is helpful to have a
clock in view so that you can judge whether a problem occurred
for at least 3 minutes versus less than 3 minutes. For certain
easily observed discrete behaviors (e.g., fidgets, restless,
makes odd noises, interrupts), you can make a note each time you
observe the behavior to help you judge the frequency of that
behavior. Or you can write chit marks next to the initial note
of the behavior for each time it occurred during the 10-minute
period (e.g., fidgets ////). For certain other discrete
behaviors, you can record the amount of time for each instance
of their occurrence (e.g., out of seat, 30 sec). These notes
will help you judge the frequency or intensity of the behavior
for rating an item 1, 2, or 3. Other problems (e.g., 11.
Confused or seems to be in a fog; 16. Difficulty following
directions) will require your judgment for rating frequency or
intensity.
Be sure to rate only the one DOF item that most specifically
describes a particular observation. For example, several items
describe attention problems or hyperactivity, such as 7. Doesn't
concentrate or doesn't pay attention for long; 9. Doesn't sit
still, restless, or hyperactive; 13. Fidgets, including with
objects; 56. Easily distracted by external stimuli; and 57.
Stares blankly. If a child exhibits any such problems during the
10-minute observation period, rate the one item that best fits
the actual behavior observed. You may rate more than one item
only if the child exhibits more than one different kind of
problem, such as difficulty concentrating at certain times,
being easily distracted at other times, and being restless.
Avoid rating more than one item for the same observation. Figure
2-3 shows the observer's ratings of Melinda Brandt based on
notes for the same 10-minute observation period depicted in
Figure 2-2.
Figure 2-3. Observer's ratings based on notes for the same
observation of Melinda Brandt as in Figure 2-2.
GUIDELINES FOR RATING SPECIFIC DOF PROBLEM ITEMS
This section provides guidelines to help you choose and rate
specific DOF problem items based on our research to develop the
DOF. (We have not found it necessary to give guidelines for
every item.) You can refer to these guidelines when questions
arise during rating. Several guidelines are intended to help you
differentiate between similar items. It is not necessary to
memorize the guidelines for rating the DOF items. However, you
should have the guidelines available when you do your ratings.
1. Acts too young for age. Rate for a child who acts too young
or seems immature for his/her chronological age or has
mannerisms of a younger child,
such as baby talk or acting like a baby, or making gestures
typical of a younger child. Rate item 52 for showing off,
clowning, or acting silly.
2. Makes odd noises. Rate for humming, clicking, grunting,
whistling, muttering, or singing to self, when these noises are
not part of specified activity, such as a song or imitation of
animals. This item can be rated even if the noises seem to
indicate a happy state in the child. Rate this item for vocal
tics.
3. Argues. Rate when a child argues with an adult or peer about
something, such as requirements for an assignment, or rules of a
task. Rate item 5 if the child sasses, talks back, or is defiant
toward a teacher or staff member.
4. Cheats. Rate for cheating in academic tasks or games.
Examples are copying another child's answers on assignments or
tests when this is not part of a cooperative group activity or
breaking rules of a game in order to win or get ahead.
5. Defiant or talks back to staff. Rate for sassing or talking
back to teacher or other school staff (e.g., saying "This is
stupid," "I don't want to do," "Try and make me"). If the child
sasses or talks back, then refuses to do something that the
teacher has asked him/her to do, you can also rate item 20 for
being disobedient.
6. Brags, boasts. Rate for bragging or boasting about
accomplishments, skills, appearance, or possessions. An example
is a child who says he/she is the smartest kid in the school or
the toughest kid on the playground or a child who says he/she is
better than anyone else in a skill or in appearance. Do not rate
if the child is giving a self appraisal in response to a
specific question about his/her performance on a task, activity
or skill.
7. Doesn't concentrate or doesn't pay attention for long. Rate
for problems with concentration or short attention span, or
intermittent lapses in attention. Item 7 should be used to rate
behavior that does not involve responses to particular
distracting stimuli, whereas item 56 should be used to rate a
child's distraction by specific observable stimuli, such as
noises or activity in the environment. The same child could be
rated for both items if he/she fails to concentrate or pay
attention at certain times and is also easily distracted by
specific stimuli at other times. Also rate item 7 when a child
has difficulty returning to a task or when there is no recovery
of attention back to the original task once attention has
wandered.
8. Difficulty waiting turn in activities or tasks. Rate when a
child has trouble waiting for his/her turn in group activities
or in class discussions. Examples are talking out of turn,
cutting in line, or grabbing materials from another child when
he/she is supposed to wait for a turn. Rate item 32 for children
who interrupt the teacher or other children who are talking.
Rate item 33 for children who call out in class when they are
expected to remain quiet or raise their hand before talking.
9. Doesn't sit still, restless, or hyperactive. Rate for
behaviors such as squirming in seat, frequently changing
position, swinging feet, or draping body across seat. Rate item
13 for fidgeting and item 33 for more general impulsive
behavior. Rate only item 28 for out of seat behavior that is not
due to restlessness. If the child is restless in his/her seat
and gets out of seat to walk around the room, then both items 9
and 28 may be rated.
11. Confused or seems to be in a fog. Rate for behaviors that
suggest confused thinking or general confusion about tasks or
conversation. Rate item 77 for difficulty expressing self
clearly. Rate item 16 for difficulty following directions.
12. Cries. Rate when a child looks tearful or actually sheds
tears. Rate 1 for slight or ambiguous tearfulness or crying,
such as looking as if about to cry with watery eyes. Rate 2 or 3
for obvious crying.
13. Fidgets. Rate for non-purposeful activity with hands that
includes an object or non-purposeful finger play. Examples are
twirling hair, twirling glasses, tapping pencils, picking at
paper edges, and twisting the sleeve of a shirt, or tapping
fingers together or playing with fingers. Rate item 38
for hand wringing or other nervous movements
with hands or fingers.
15. Daydreams or gets lost in thoughts. Rate if
a child appears to be daydreaming, such as gazing
out the window at nothing or looking off into space.
Rate item 7 when the child doesn't pay attention
to instruction, lessons, or directions, or does not
concentrate on work. Rate item 57 for blank star-
ing or blankness of expression.
16. Difficulty following directions. Rate for a
child who appears to have difficulty carrying out
instructions or who needs clarification or repeti-
tion of instructions. Rate also for a child who needs
directions or instructions simplified or rephrased
in a different way or who needs demonstrations
for carrying out tasks.
17. Tries to get attention of staff. Rate for de-
liberate attempts to get attention of the teacher or
other staff in the room or area, such as raising or
waving hand a lot, going over to teacher's desk, or
asking for help. Rate also for attempts to get the
observer's attention if a child continues after an
explanation that the observer cannot interact with the child.
Rate item 52 for clowning or making faces. Both
items 17 and 52 can be rated if a child clowns or
acts silly and at times directs the clowning toward
staff for attention. Do not rate for raising hand in
response to a teacher's direction to raise hands or
raising hand to answer a teacher's question.
18. Destroys own things. Rate for destroying
own things, such as ripping paper or drawings,
breaking pencils, breaking toys, or ripping clothes.
Rate item 19 for destroying other people's things.
20. Disobedient. Rate for acts of disobedience,
such as breaking school rules, or for behaviors that
result in punishment for violations of rules, such
as getting time-outs, getting detentions, or being
sent to the principal's office. Also rate when a child
refuses to comply with a teacher's or other staff
member's request or directive, or when a child is
reprimanded but continues to do the behavior that
led to the reprimand.
21. Disturbs other children. Rate for bother-
ing or disturbing another child by talking or some
activity. Rate item 46 for disrupting the activities
of a group of children. Rate item 32 for interrupt-
ing or butting into an ongoing conversation or ac-
tivity of adults or peers.
23. Doesn't seem to listen to what is being said.
Rate for a child who appears not to be listening to
a teacher's instructions or directions or who does
not listen to other children when expected to lis-
ten, such as in circle time or class discussions. Rate
item 7 when a child doesn't concentrate or pay attention
to his/her work or other activities when attention
would be expected.
24. Eats, drinks, chews, or mouths things that
are not food. Rate only for nonfood items such as
paper, dirt, sand, crayons, string, parts of clothing,
and some body parts, such as hair. Do not rate
for chewing gum, sweets, soda, or junk foods. Rate
item 42 for picking or scratching nose, skin or other
parts of body. Rate item 76 for sucking thumb, fin-
gers, hand or arm.
25. Difficulty organizing activities or tasks.
Rate when a child seems disorganized in his/her
approach to assignments or other activities. Examples
are when a child has difficulty finding the
right page in the book, has difficulty arranging materials
for a project, or has a desk that is cluttered or
messy while working on assignments.
26. Fails to give close attention to details. Rate
for a child who overlooks details in completing
tasks. Examples are skipping parts of an assign-
ment or failing to notice plus or minus signs for
numerical operations in math problems.
27. Forgetful in activities or tasks. Rate for a
child who forgets materials or routines or who for-
gets information that he/she would be expected to
remember. Examples are forgetting to bring pencils,
papers, or books for working on assignments
or forgetting to do expected routines, such
as standing in line to go outside.
28. Out of seat. Rate when a child is out of his/
her seat during times when he/she should remain
seated. "Out of seat" is when the child's bottom is
off the seat or the child's body weight is not supported
by the chair (e.g., the child is just resting
one leg on the chair). Do not rate for getting out of
seat to change activities or to respond to a teacher's
request, such as joining circle time or moving to a
new section of the room for an activity. Do not
rate when being out of seat is required for an ac-
tivity. Do rate for getting up to sharpen pencils,
get materials, or to talk to other students, unless
the child was specifically directed to do so. Do not
rate item 28 for recess observations unless chil-
dren are expected to be seated for an activity.
30. Gets into physical fights. Rate for physically
fighting with peers or adults, including hitting,
punching, pushing, scratching, kicking, etc.
Do not rate for physical play that is part of a game
unless the physical play seems excessive. Rate item
41 if a child initiates a physical attack on another
person. Both items 41 and 30 may be rated if a
child initiates a physical attack on someone that
then progresses into an ongoing physical fight. Item
30 may also be rated when a child gets into a physi-
cal fight that was provoked, e.g., by name-calling
or teasing.
32. Interrupts. Rate for a child who interrupts
or butts into an ongoing conversation or who in-
terrupts the teacher or other children while they
are talking. An example is a child who starts talk-
ing about something when a teacher is giving in-
structions or a child who asks a question before
the teacher or peer has finished saying something.
Rate item 33 for a child who calls out in class when
expected to remain quiet or expected to raise his/
her hand before talking. Rate item 21 for a child
who physically disturbs other children's activities.
33. Impulsive or acts without thinking, includ-
ing calling out in class. Rate for immediate ac-
tions or responses that seem impulsive, such as
grabbing things or shifting from one action to an-
other, calling out answers without raising hand, or
taking a careless or hurried approach to a specific task. Rate
item 32 when a child interrupts a conversation or
interrupts while the teacher is talking or giving in-
structions.
34. Physically isolates self from others. Rate
for a child who physically isolates self from others
or a group, such as sitting alone in the corner of a
room. Rate item 75 for a child who generally ap-
pears uninvolved, distant, or does not interact with
peers or staff, or who appears uninvolved off and
on throughout the observation period.
37. Nervous, high-strung, or tense. Rate for nervous,
jumpy, overdriven, or uptight behavior or
demeanor or a general feeling of nervous tension
from a child. Rate item 9 for a child who fails to
sit still, is restless, or is overactive. Rate item 13
for a child who fidgets with objects. Rate item 38
for more specific nervous behaviors, such as
twitching, eye blinks, or facial tics.
38. Nervous movements, twitching, or tics, or
other unusual movements (describe). Rate for spe-
cific nervous behaviors, such as twitching, eye
blinks, or facial tics. Rate item 37 for more gen-
eral behaviors, such as jumpiness, overdriven or
uptight behavior or a demeanor or general feel-
ing of nervous tension from a child. Rate item 9
for a child who fails to sit still, is restless, or is
overactive and item 13 for fidgeting with objects.
Rate item 2 for vocal tics.
40. Too fearful or anxious. Rate for a child
who expresses fears during the observation period
or who appears fearful. Rate item 53 for shyness
or timid behavior.
41. Physically attacks people. Rate when a child
initiates a physical attack or initiates a fight with
peers or adults (e.g., hits a teacher or peer, pushes
or shoves a teacher or peer, throws something at
another person, tries to physically harm another
person, etc.). Item 41 can be rated even when a
child has been provoked to attack, such as having
been called a name. Do not rate for physical at-
tacks that are part of a game or play unless the play
attacks seem excessive. Rate 1 if there is a slight
or ambiguous occurrence or if it is not clear whether
a physical attack or attempt to harm someone else
was intended. Rate item 30 for ongoing physical
fights with peers or adults. Both items 41 and 30
may be rated if a child initiates a physical attack
that then progresses into a physical fight. Do not
rate item 41 for hitting in the course of a physical
fight unless the child clearly initiated the hitting.
44. Apathetic, unmotivated, or won't try. Rate
for an "I don't care" attitude or an apathetic approach
to tasks or instructions. Rate item 70 when
a child is underactive, seems tired, or is slow mov-
ing.
49. Avoids or is reluctant to do tasks that re-
quire sustained mental effort. Rate for a child who
tries to avoid doing assignments or other tasks that
require effortful thinking or prolonged concentra-
tion. Examples are procrastinating when required
to do difficult or lengthy assignments, such as math
or writing.
50. Self-conscious or easily embarrassed. Rate
for behaviors indicating self-consciousness or em-
barrassment, such as blushing, looking apologetic,
sheepishness, or unusual sensitivity.
51. Slow to respond verbally. Rate for a child
who is slow to answer questions from a teacher or
peers or pauses for an unusual length of time be-
fore saying something. Item 51 can be rated for a
child who seems to need time to think before
responding to questions.
52. Shows off, clowns, or acts silly. Rate for
clowning or silly behavior to attract attention of
peers or adults. Examples are making faces, mak-
ing silly gestures, giggling, or mimicking others to
cause laughter. Rate item 2 for making odd noises.
Rate item 66 for teasing.
53. Shy or timid. Rate for shy demeanor. Do
not rate for characteristics that are covered more
specifically by other items, such as item 50 for self-
conscious or easily embarrassed.
54. Explosive or unpredictable behavior. Rate
for behavior that seems explosive or unpredict-
able, such as emotional outbursts. Rate item 67 for
temper tantrums, hot temper, or angry appearance.
56. Easily distracted by external stimuli. Rate
when a child is distracted by a specific object, noise,
or visual stimulus that takes the child off-task. Examples
are hearing noises or voices in or outside
of the room, hearing or seeing cars, planes, etc.
outside the building, or watching other children's activities
when a child is supposed to be doing his/
her own work.
57. Stares blankly. Rate when a child's eyes are
not focusing on anything. Rate item 7 for prob-
lems concentrating or item 15 if child appears to
be daydreaming.
58. Speech problem (describe). Rate for articu-
lation problems and other speech difficulties that
make it hard to understand what a child is saying.
Examples are mispronouncing certain speech
sounds (e.g., r, l, th, w), slurred or garbled speech,
halting speech, or unusual grammatical structures.
Rate item 77 for problems in verbal fluency or
when a child has trouble expressing his/her ideas
or desires clearly. Do not rate item 58 for speech
problems due to second language issues (e.g., En-
glish as a second language).
59. Wants to quit or does quit tasks. Rate when
a child expresses a desire to quit a task (e.g., asking
"Can I stop now?"), gives up, or actually does
quit a task before completing it or quits before time
limits are up.
60. Yawns. Rate 1 for one or two definite or
ambiguous yawns. Rate 2 or 3 for persistent yawn-
ing.
61. Strange behavior. Rate for behavior that
seems very unusual or bizarre. Examples are mak-
ing strange comments about other people, rubbing,
patting or touching other people inappropriately,
or making weird faces that are not intended to be
silly or clowning. If the behavior is more specifi-
cally covered by another item, rate the more spe-
cific item instead, such as item 2 for making odd
noises or item 52 for showing off, clowning, or
acting silly.
62. Stubborn, sullen, or irritable. Rate for a
generally stubborn, sullen, or irritable demeanor.
Rate item 63 for sulking as a reaction to a request
from a teacher, other adult, or peer.
63. Sulks. Rate for sulking when it is a reaction
to something that occurs during the observation
period. Rate item 62 for a more general demeanor
of stubbornness, sullenness, or irritability.
64. Swears or uses obscene language. Rate for
words or verbal expressions that generally would
be considered swearing or obscene by teachers or
other adults, including swear words that may have
become relatively common in a given culture or in
modern music. Do not include words that approximate
swear words, such as "darn it." Do include
words referring to god when not used as part of a
religious activity and include slang words that other
people would consider offensive.
66. Teases. Rate for physical or verbal teasing.
Rate 1 for playful teasing, such as making silly
faces at someone or tickling. Rate 2 or 3 for more
deliberate teasing or harassing, such as name call-
ing or ridiculing other people.
67. Temper tantrums, hot temper, or seems
angry. Rate for overt temper tantrums or for ex-
pressions of anger or hot temper. Rate item 62 for
sullenness or irritability or grumpy mood.
68. Threatens people. Rate for verbal or physi-
cal threats to other people, including peers and
teachers. This can include when a child tells an
intended victim that he/she is seeking or plotting
revenge, or when a child verbalizes threats to a
third party. The threat can be general, such as "I
am going to get you for that," or more specific.
69. Too concerned with neatness, cleanliness,
or order. Rate for behaviors such as excessive ti-
dying of materials or expressed concerns about
getting hands or clothing dirty. Do not rate only
for erasures while drawing or writing, unless era-
sures are excessive or clearly due to overconcern
for neatness.
70. Underactive, slow moving, tired, or lacks
energy. Rate when a child's physical movements
are slowed down, such as being slow in writing,
drawing, or walking across the room. Also rate
when a child looks physically tired or sleepy, or
lacks energy. Rate item 44 for a child who is apa-
thetic or unmotivated. Rate item 60 for yawning.
71. Unhappy, sad, or depressed. Rate for a child
who has an unhappy, sad, or depressed demeanor.
Rate item 12 for a child who cries in response to a
specific event or request, but does not seem generally
unhappy. Rate item 71 if the child looks generally
unhappy or sad, which may or may not include
crying. If a child cries and looks unhappy, then both
items 12 and 71 can be rated. Rate item 62 for a
child who looks sullen or irritable and item 63 for
a child who sulks in response to a question or a
request from someone. Do not rate item 71 based
only on inferences about a child's feelings if the
child does not display a sad demeanor or appear-
ance of unhappiness.
75. Withdrawn, doesn't get involved with others.
Rate for a child who appears uninvolved, distant,
or does not interact with peers or staff, or who
withdraws off and on throughout the observation
period. Rate item 34 for physically isolating self
from others.
76. Sucks thumb, fingers, hand, or arm. Rate
for sucking or mouthing thumb, fingers, hand, or
arm. Include chewing on thumb, fingers, hand, or
arm. Rate item 42 for picking or scratching nose,
skin, or body parts. Rate item 24 for mouthing
things that are not food and not body parts.
77. Fails to express self clearly. Rate for prob-
lems in verbal fluency or communicating meaning
or using actions or gestures in place of verbal de-
scriptions. Rate item 58 for specific speech defects
or articulation problems that make speech unclear.
Include problems communicating meaning due to
second language issues (e.g., English as a second
language).
78. Impatient. Rate when a child's comments
or behaviors imply time pressure, such as when a
child wants to know when he/she can move on to
another task or asks when a desired activity will
happen, such as recess or lunch. Rate item 59 when
a child expresses a desire to quit a task or activity.
79. Tattles. Rate for a child who spontaneously
tells teachers or authority figures about rule-break-
ing or wrongful behavior of other children. An
example is telling the teacher that another child
hit him/her or hit someone else. Do not rate when
a child reports wrongful behavior in direct response
to an adult's questions about what happened.
80. Repeats behavior over & over; compulsions
(describe). Rate for repetitive, purposeless behav-
iors, such as touching things over and over, rub-
bing hands or arms on a table, making circles on a
table with fingers, or repetitively straightening
things on a desk. Do not rate item 80 for repetition
of acts that are more specifically covered by other
items, such as item 52 for clowning or acting silly
or item 2 for making odd noises. If it is unclear
whether the intensity or nature of the behavior
qualifies as a repetitive act or compulsion, rate 1.
81. Easily led by peers. Rate for a child who
imitates or mimics other children or seems like a
follower. Rate item 66 if mimicking other chil-
dren is done as teasing. Rate also for a child who
asks other children about what to do for general
activities. Do not rate for a child who asks peers
for specific help in assignments.
82. Clumsy, poor motor control. Rate for a child
who has physical difficulty in motor tasks, such as
looking clumsy for his/her age in walking, running,
or jumping. Rate item 89 (other problems) for
fine motor problems, such as poor handwriting or
awkward pencil grasp.
83. Doesn't get along with peers. Rate for a
child who doesn't get along with certain children,
even if the child may get along with other chil-
dren. Examples are a child who is rejected by peers
when attempting to join a group or game or a child
who complains of having no friends. Rate other
items for more specific problems getting along with
peers, such as item 3 for arguing and item 30 for
getting into physical fights.
84. Runs out of class (or similar setting). Rate
for a child who runs out of the classroom or an-
other setting (e.g., library, gym, lunch hall) with-
out permission. Rate 1 if the child leaves the set-
ting without permission at a quick pace that might
not be considered running. Do rate for a child
who runs out of the classroom without permission
to go to the bathroom. Do not rate for children who
run out of the classroom for recess.
85. Behaves irresponsibly (describe). Rate for
doing physically dangerous things that are not play-
ful (e.g., poking pencils in electrical outlets) or for
getting into an adult's belongings (e.g., taking objects
off the teacher's desk or opening desk drawers
without permission). Rate item 20 for being
disobedient or noncompliant and/or breaking rules.
Rate items 18 or 19 for deliberately destroying
things (e.g., ripping up papers, breaking pencils).
Rate item 89 (other problems) for a child who
steals objects from peers or adults.
86. Bossy. Rate for a child who tells other chil-
dren what to do when not requested or a child who
tries to dominate an activity (e.g., by vehemently
stating rules or making up his/her own rules for an
activity or game).
87. Complains. Rate for complaints about tasks
or activities and complaints about somatic prob-
lems. Examples are complaining that a task is too
hard or boring, complaining that a task will take
too long, or asking "Do we have to do this?" in a
complaining tone of voice. Also rate item 87 for a
child who has somatic complaints without known
medical cause, such as dizziness, headaches, or
stomachaches, and for a child who expresses so-
matic complaints, such as complaining that his/
her hand hurts while writing. Rate item 74 for
whining tone of voice. Both items 74 and 87 may
be rated if a child whines and expresses a specific
This chapter describes and illustrates the com-
puter-scored DOF Profile. There is no hand-scored
DOF Profile because of the complexity of averag-
ing scores across multiple observation sessions.
The DOF computer-scoring program is a module
in the ASEBA Assessment Data Manager (ADM)
software. It can be purchased separately or as part
of the full ASEBA ADM package, which includes
modules for other ASEBA forms. (For users of the
DOF, the most relevant other ASEBA forms are
the CBCL/6-18, TRF, YSR, SCICA, and TOF.)
With the DOF Module, users enter an observer's
ratings of the 89 problem items and On-task for
each 10-minute observation of an identified child
and control children matched to the identified child.
For computer-scoring the DOF Profile, the DOF
Module requires a minimum of two DOFs for the
identified child. DOFs for control children are op-
tional. When observations of control children are
included, there must be at least two DOFs from
observations of one or both control children. Re-
quiring at least two DOFs for the identified child
and two DOFs across one or two control children
is intended to guard against interpretation of DOF
scores based on only one 10-minute observation
of the identified child and only one 10-minute ob-
servation of a control child. The DOF Module al-
lows up to 18 DOFs to be scored as an observa-
tion set for one DOF Profile. Each observation
set can include up to six DOFs for the identified
child, up to six DOFs for the first control child,
and up to six DOFs for the second control child.
The DOF Profile is normed separately for boys and
girls at ages 6-11. The DOF Profile is also normed
separately for classroom observations and for re-
cess observations. Profiles for each of the two set-
tings are described in detail in this chapter.
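The counting rules in the paragraph above can be summarized as a small check. The following is a minimal sketch only: the role labels, constant names, and the function `check_observation_set` are hypothetical and are not part of the ADM software. It assumes each DOF is tagged with the child it describes.

```python
# Hypothetical role labels for the children in one observation set.
IDENTIFIED = "identified"
CONTROL_1 = "control_1"
CONTROL_2 = "control_2"
ROLES = (IDENTIFIED, CONTROL_1, CONTROL_2)

MIN_IDENTIFIED = 2     # minimum DOFs for the identified child
MIN_CONTROL_TOTAL = 2  # minimum DOFs across controls, if any are included
MAX_PER_CHILD = 6      # maximum DOFs per child (3 children x 6 = 18 per set)


def check_observation_set(roles):
    """Return a list of rule violations for a list of DOF role labels."""
    problems = []
    unknown = sorted(set(roles) - set(ROLES))
    if unknown:
        problems.append(f"unknown role labels: {unknown}")
    counts = {role: roles.count(role) for role in ROLES}
    if counts[IDENTIFIED] < MIN_IDENTIFIED:
        problems.append("fewer than two DOFs for the identified child")
    control_total = counts[CONTROL_1] + counts[CONTROL_2]
    if 0 < control_total < MIN_CONTROL_TOTAL:
        problems.append("fewer than two DOFs across the control children")
    for role in ROLES:
        if counts[role] > MAX_PER_CHILD:
            problems.append(f"more than six DOFs for {role}")
    return problems


# Example set: four DOFs for the identified child, two for each control child.
example_set = [IDENTIFIED] * 4 + [CONTROL_1] * 2 + [CONTROL_2] * 2
print(check_observation_set(example_set))  # an empty list means no violations
```

Because control DOFs are optional, a set with no control DOFs passes the check; a set with exactly one control DOF does not.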
DOF PROFILE FOR CLASSROOM
OBSERVATIONS
The DOF Profile for classroom observations
consists of 4 pages. Page 1 displays bar graphs of
scores on the five syndrome scales: Sluggish Cog-
nitive Tempo, Immature/Withdrawn, Attention
Problems, Intrusive, and Oppositional. Page 2 dis-
plays bar graphs of Total Problems and On-task
scores, plus a list of items and ratings for Other
Problems not scored on the syndrome scales. Page
3 displays bar graphs of the DSM-oriented Atten-
tion Deficit/Hyperactivity Problems scale and its
Inattention and Hyperactivity-Impulsivity
subscales. Page 4 summarizes descriptive infor-
mation about each DOF that was used to create
the DOF Profile.
Figure 3-1 shows Page 1 of the DOF Profile
scored from classroom observations of 8-year-old
Melinda Brandt. The DOF Profile was based on
four 10-minute observations of Melinda as the
identified child and two 10-minute observations
of each of two control children in the same class.
Figure 2-2 in Chapter 2 showed the observer's notes
and on-task ratings for the first 10-minute observation
of Melinda, while Figure 2-3 showed the
observer's ratings of the 89 problem items for the
same 10-minute observation.
Descriptive Information
At the top of the DOF Profile in Figure 3-1,
you can see descriptive information about
Melinda and the eight DOFs used to create the
profile. The ID number assigned to Melinda
(200901), plus her name, gender, age, and birthdate
are printed on the left side of the profile. The middle
Chapter 3
Computer-Scored DOF Profile
Figure 3-1. Computer-scored DOF syndrome scales for classroom observations of 8-year-old Melinda Brandt.
column shows the observation period (02/15/07-
02/17/07), indicating the length of time between
the first 10-minute observation and the last 10-minute
observation, which in Melinda's case
spanned 3 days.
Observation Set indicates the label (Winter
2007) that the observer's supervisor assigned to
the set of eight DOFs used to create the DOF Pro-
file for Melinda. A label for the observation set is
required for computer-scoring the DOF. As ex-
plained in Chapter 2, users can choose whatever
label fits their purpose. For example, you might
choose to obtain one set of observations in the be-
ginning of the school year and another set of ob-
servations later in the year, after implementing an
intervention for the identified child. The label for
the observation set might then be the time frame
for each set of observations, as it was in Melinda's
case. Or the label could indicate a specific activity
for different sets of observations (e.g., reading class
versus math class).
The observer's name is printed to the right of
the observation period. As best practice, we rec-
ommend that a single observer conduct all obser-
vations of the identified child and control children
to be included in the same observation set. In
Melinda's case, the same observer (Valerie Stone)
did the four observations of Melinda and two ob-
servations of each of the two control children. If
two different observers had done the observations,
then this could introduce rater bias into
Melinda's DOF scores. This would be particularly
problematic if one observer (e.g., Valerie Stone)
had done all four observations of Melinda, but an-
other observer had done all four observations of
the control children. If this had been the case, you
would not know whether differences between
scores for Melinda versus the control children represented
true differences in the children's behavior
or whether the scores were influenced by how
the two different observers rated the DOFs. If two
different observers are used for observations in the
same observation set, it is important to make sure
that each observer rates both the identified child
and control children so that any potential bias is
distributed across DOFs for the identified child and
control children. The DOF Module allows up to
two observers in the same observation set, and it
prints a warning on the DOF Profile when there
are two observers.
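The observer guidance above can also be expressed as a check on who rated whom. This is a sketch under stated assumptions: the function `observer_warnings`, its input format, and the warning wording are hypothetical illustrations, not the DOF Module's actual behavior or messages.

```python
def observer_warnings(dofs):
    """Flag observer arrangements that could confound score comparisons.

    dofs: list of (observer_name, is_identified_child) tuples, one per DOF.
    """
    rated = {}  # observer name -> set of child types that observer rated
    for observer, is_identified in dofs:
        rated.setdefault(observer, set()).add(is_identified)

    warnings = []
    if len(rated) > 2:
        warnings.append("more than two observers in one observation set")
    elif len(rated) == 2:
        warnings.append("two observers in one observation set")

    has_controls = any(not is_identified for _, is_identified in dofs)
    if len(rated) > 1 and has_controls:
        for observer in sorted(rated):
            if rated[observer] != {True, False}:
                warnings.append(
                    f"{observer} did not rate both the identified child "
                    "and control children"
                )
    return warnings


# One observer for all eight DOFs, as in the Melinda example: no warnings.
single = [("Observer A", True)] * 4 + [("Observer A", False)] * 4
print(observer_warnings(single))

# One observer per child type: differences may reflect the raters.
split = [("Observer A", True)] * 4 + [("Observer B", False)] * 4
print(observer_warnings(split))
```

The second arrangement is the problematic case described in the text, where differences between the identified child and controls cannot be separated from differences between observers.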
The number of observations used for scoring
the DOF Profile is printed on the far right side of
the profile. For Melinda, you can see that there were
four observations of the identified child (Melinda),
plus two observations of Control Child 1 and two
observations of Control Child 2. Although the DOF
Module requires a minimum of two DOFs for the
identified child for computer-scoring, we recommend
obtaining three to six observations of the
identified child. Whenever possible, we recom-
mend at least two observations of one or two con-
trol children in the same setting, alternating obser-
vations of the identified child and control children.
To sample behavior over different time frames, we
recommend observing the identified child and con-
trol children in the morning and afternoon of at
least two different days. Observers should com-
plete a separate DOF for each 10-minute observa-
tion of each child.
Syndrome Scales
Beneath the descriptive information at the top
of Page 1 in Figure 3-1, you can see bar graphs for
T scores corresponding to the total raw scores for
the five syndrome scales for the identified child,
Melinda Brandt (dark bar to the left), and the two
control children (lighter bar to right). Ratings are
averaged across the two control children to create
the total raw score for controls. The range of T
scores is shown to the left of the bar graph display.
As explained in Chapter 6, we performed statisti-
cal analyses of DOF problem items to determine
which items tend to occur together to form syn-
dromes. The label for each DOF syndrome scale
summarizes the types of problems that form that
syndrome.
Total Scores, T Scores, and Percentiles for
Syndrome Scales. Beneath the bar graph for each
syndrome scale are the total raw scores, T scores,
and percentiles for the identified child and the two
control children. The averaged ratings of the prob-
lem items comprising each scale are printed be-
low the scores and percentiles, with ratings of the
identified child (ID) in the left column and ratings
for the controls (CRL) in the right column. Abbre-
viated versions of the problem items are listed in
the middle columns below each scale.
On the DOF Profile for Melinda, you can see
that she obtained a total score of 2.5 on the Sluggish
Cognitive Tempo syndrome, which corresponds
to a T score of 67. Melinda's T score of 67
falls at the 96th percentile for the DOF normative
sample of 6-11-year-old girls. This means that 96%
of the DOF normative sample received a score
equal to or lower than Melinda's score of 2.5 on
the Sluggish Cognitive Tempo syndrome. The letter
B printed next to the T score of 67 indicates
that Melinda's T score on Sluggish Cognitive
Tempo fell within the borderline clinical range for
the normative sample. The two control children
obtained an averaged total score of 1.0 on Sluggish
Cognitive Tempo, which corresponds to a T
score of 56. The control children's T score of 56
fell at the 73rd percentile, which was in the normal
range for the DOF normative sample. A later sec-
tion describes borderline, clinical, and normal
ranges for scores on the DOF syndrome scales.
In a similar fashion, you can see that Melinda's
total score of 12.0 on the Attention Problems syndrome
had a T score of 74, which was above the
97th percentile for the normative sample of 6-11-year-old
girls. This means that over 97% of the
DOF normative sample received a total score equal
to or lower than Melinda's score of 12.0 on Attention
Problems. The letter C printed next to
Melinda's T score of 74 indicates that her score on
Attention Problems fell within the clinical range
for the normative sample. The two control children
obtained an averaged total score of 5.5 on
Attention Problems, which corresponds to a T score
of 56, falling at the 73rd percentile and in the normal
range for the normative sample.
When you examine the bar graphs, total scores,
T scores, and percentiles for the remaining DOF
syndromes, you can see that Melinda obtained clinical
range T scores on the Intrusive (T = 71-C) and
Oppositional (T = 70-C) syndromes, both of which
fell above the 97th percentile. Melinda obtained a
much lower T score of 58 on the Immature/Withdrawn
syndrome, which fell at the 79th percentile
for the normative sample. Melinda's T score on
the Immature/Withdrawn syndrome was the same
as the T score for the two control children. The
two control children obtained much lower T scores
than Melinda on the Intrusive (T = 54) and Oppositional
(T = 50) syndromes, which fell at the 65th
and 50th percentiles, respectively. (The DOF Profile
shows a T score of 50 and percentile < 50
for the control children's total score of 0.0 on the
Oppositional syndrome, because T scores are truncated
at 50 on the syndrome scales.) Chapter 6 describes
our procedures for assigning T scores to
raw scores for the DOF syndrome scales.
Borderline, Clinical, and Normal Ranges for
Syndrome Scales. The two broken lines on the bar
graph display in Figure 3-1 demarcate borderline
and clinical ranges for judging the deviance of
scores on the five DOF syndromes, as compared
to the normative sample. We identified a border-
line clinical range for each scale because categori-
cal distinctions are usually less reliable for indi-
viduals who score close to the border of a category.
The addition of a borderline range enables practi-
tioners to make more differentiated decisions about
children's functioning than would be possible if
all scores were categorized as normal versus clini-
cal.
Table 3-1 summarizes the borderline clinical and
clinical ranges for all the DOF scales. The borderline
clinical range for the DOF syndrome scales
spans the 93rd to the 97th percentiles, which correspond
to T scores of 65 to 69. The clinical range is
>97th percentile, which corresponds to T scores >69.
Scores that fall in the borderline range between the
two broken lines on the DOF Profile are high
enough to be of concern, but are not as clearly de-
viant as scores that fall in the clinical range above
the top broken line. Scores above the top broken
line indicate that the observer rated enough prob-
lems as present (ratings of 1, 2 or 3) to be of clini-
cal concern. As indicated in Chapter 2, ratings of 1
are for slight or ambiguous occurrences of a prob-
As Figure 3-1 shows, Melinda's scores on the
DOF profile fell in the clinical range above the 97th
percentile on the Attention Problems, Intrusive, and
Oppositional syndrome scales, and in the borderline
clinical range between the 93rd and 97th percentiles
on the Sluggish Cognitive Tempo syndrome
scale. The borderline to clinical range scores
on these four DOF syndromes indicated that
Melinda manifested many more problems than
were typically observed in classrooms for 6-11-
year-old girls in the DOF normative sample.
Melinda's score on the Immature/Withdrawn syndrome
fell within the normal range. The scores for
the two control children fell in the normal range
on all five DOF syndrome scales. The T scores and
percentiles for Melinda provide a standard against
which you can judge deviance in her observed be-
havior relative to a large normative sample of 6-
to-11-year-old girls. The additional T scores and
percentiles for the control children provide a standard
for judging the deviance of Melinda's observed
behavior relative to other children in the
same classroom setting.
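The cutpoints summarized in Table 3-1 lend themselves to a simple lookup. The sketch below is illustrative only: the function names are hypothetical, T scores are assumed to be whole numbers, and the code is not taken from the DOF Module.

```python
def classify_problem_scale(t_score, borderline_low=65, borderline_high=69):
    """Classify a problem-scale T score, where high scores warrant concern.

    The defaults are the syndrome and DSM-oriented cutpoints (T = 65-69
    borderline, T > 69 clinical). For Total Problems, pass 60 and 63.
    """
    if t_score > borderline_high:
        return "clinical"
    if t_score >= borderline_low:
        return "borderline"
    return "normal"


def classify_on_task(t_score):
    """Classify an On-task T score, where low scores warrant concern."""
    if t_score < 31:
        return "clinical"
    if t_score <= 35:
        return "borderline"
    return "normal"


# Melinda's syndrome T scores as reported in the text.
melinda = {
    "Sluggish Cognitive Tempo": 67,
    "Immature/Withdrawn": 58,
    "Attention Problems": 74,
    "Intrusive": 71,
    "Oppositional": 70,
}
for scale, t in melinda.items():
    print(scale, t, classify_problem_scale(t))
```

Run on these values, the sketch reproduces the ranges described in the text: borderline for Sluggish Cognitive Tempo, normal for Immature/Withdrawn, and clinical for the other three syndromes.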
By examining the observer's ratings on the specific
problem items, you can see the types of problems
that Melinda showed within each syndrome.
That is, you can see that Melinda received averaged
scores of 0.5 to 1.0 for three items on the
Sluggish Cognitive Tempo syndrome, 0.5 for one
item on the Immature/Withdrawn syndrome, 0.5
to 3.0 for six items on the Attention Problems syndrome,
0.5 to 3.0 for five items on the Intrusive
syndrome, and 0.5 to 1.5 for five items on the Oppositional
syndrome. The observer's ratings of the
control children yielded average scores of 0.5 to
1.5 for 11 items across the five syndrome scales,
with all other items scored 0.0.

Table 3-1
Borderline and Clinical Ranges on DOF Scales

DOF Scale                                     Borderline              Clinical
Classroom Observations
  Empirically Based Syndromes                 T = 65-69               T > 69
    Sluggish Cognitive Tempo                  93rd-97th percentile    >97th percentile
    Immature/Withdrawn
    Attention Problems
    Intrusive
    Oppositional
  Total Problems-Classroom                    T = 60-63               T > 63
                                              84th-90th percentile    >90th percentile
  DSM-Oriented Scales
    Attention Deficit/Hyperactivity Problems  T = 65-69               T > 69
    Inattention Subscale                      93rd-97th percentile    >97th percentile
    Hyperactivity-Impulsivity Subscale
  On-task                                     T = 31-35               T < 31
                                              3rd-7th percentile      <3rd percentile
Recess Observations
  Aggressive Behavior                         T = 65-69               T > 69
                                              93rd-97th percentile    >97th percentile
  Total Problems-Recess                       T = 60-63               T > 63
                                              84th-90th percentile    >90th percentile

Note. On the problem scales, high scores warrant concern. Problem scale scores in the borderline range
are high enough to be of concern, but not so clearly deviant as scores in the clinical range. On the On-task
scale, low scores warrant concern.
Total Problems
Figure 3-2 shows Page 2 of the computer-scored
DOF Profile for Melinda Brandt, which includes
bar graphs, total raw scores, T scores, and percen-
tiles for Total Problems and On-task. The profile
also includes a list of item ratings for Other Prob-
lems that are not scored on the DOF syndrome
scales. The scores shown on this page of the DOF
Profile were derived from the same four DOFs for
Melinda and four DOFs for control children that
were used to score the syndrome scales.
Total Problems. The bar graph on the left side of
the DOF Profile in Figure 3-2 shows the Total Prob-
lems scores for Melinda and the two control chil-
dren. The Total Problems score is the sum of the
averaged 0-1-2-3 ratings for the 89 problem items
on each DOF. The DOF Module separately aver-
ages the item ratings for the identified child
(Melinda) and for the two control children. Total
raw scores from averaged item ratings, T scores,
and percentiles are printed beneath the bar graphs
for the identified child and controls. The scale of T
scores for Total Problems is printed on the left side
of the bar graph.
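The arithmetic behind the Total Problems raw score can be illustrated with a short Python sketch. This is not the DOF Module's code: the function names and the sample ratings are hypothetical, and treating unrated items as 0 is an assumption. The sketch stops at the raw score, because T scores and percentiles are looked up from the normative data described in Chapter 6.

```python
def averaged_item_ratings(dofs):
    """Average each item's 0-1-2-3 rating across a set of DOFs.

    Each DOF is a dict mapping item number -> rating. Items missing from a
    form are treated as rated 0 (an assumption of this sketch).
    """
    items = sorted({item for dof in dofs for item in dof})
    return {item: sum(dof.get(item, 0) for dof in dofs) / len(dofs)
            for item in items}


def total_problems_raw(dofs):
    """Sum the averaged item ratings to get a Total Problems raw score."""
    return sum(averaged_item_ratings(dofs).values())


# Hypothetical ratings from four 10-minute observations of one child.
identified_child_dofs = [
    {28: 1, 85: 1},
    {28: 1},
    {28: 2, 85: 1},
    {},
]
# Hypothetical ratings pooled from two control children (two DOFs each).
control_dofs = [{28: 1}, {28: 1}, {10: 1, 28: 1}, {10: 1, 28: 1}]

identified_total = total_problems_raw(identified_child_dofs)
control_total = total_problems_raw(control_dofs)
print(identified_total, control_total)
```

Keeping the identified child's DOFs and the pooled control DOFs in separate lists mirrors the separate averaging described above.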
As you can see in Figure 3-2, Melinda obtained
a total score of 27.0 for Total Problems, which corresponds
to a T score of 79, falling above the 98th
percentile for the normative sample of 6-11-year-old
girls. This means that at least 98% of the normative
sample received a score lower than Melinda's score
of 27.0 for Total Problems. The letter C printed
next to the T score of 79 indicates that Melinda's T
score fell in the clinical range for Total Problems.
By contrast, the two control children received an
averaged total score of 9.5, which corresponds to
a T score of 55, falling at the 69th percentile for the
normative sample of 6-11-year-old girls.
Total Problems. The two broken lines on the bar
graph in Figure 3-2 demarcate borderline and clini-
cal ranges for judging deviance on Total Problems,
as compared to the DOF normative sample. As
shown in Table 3-1, the borderline range for the
Total Problems score spans approximately the 84th
to the 90th percentiles, which correspond to T scores
of 60 to 63. The clinical range is >90th percentile,
which corresponds to T scores >63. The borderline
clinical and clinical ranges for Total Problems
are lower than for the DOF syndrome scales because
the Total Problems score comprises all
89 problem items (for details, see Chapter 6).
Scores that fall in the borderline clinical range
warrant concern, but are not as clearly deviant as
scores that fall in the clinical range. Scores that
fall below the borderline clinical range (<84th percentile)
are considered to be in the normal range.
As done for the syndrome scales, the letter B is
printed next to T scores that fall in the borderline
clinical range for Total Problems, while the letter
C is printed next to T scores that fall in the clinical
range. Chapter 6 describes procedures for deter-
mining T scores and borderline and clinical
cutpoints for the Total Problems score.
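The range rules for the problem scales can be written as a small lookup, sketched here in Python. The cutpoints are the T scores listed in Table 3-1; the function name, the dictionary keys, and the returned labels are hypothetical and are not taken from the DOF Module.

```python
# (lowest borderline T score, highest borderline T score), from Table 3-1.
PROBLEM_SCALE_CUTPOINTS = {
    "total_problems": (60, 63),  # Total Problems, classroom or recess
    "syndrome": (65, 69),        # syndrome and DSM-oriented scales
}


def classify_problem_t(t_score, scale_type):
    """Return 'normal', 'borderline', or 'clinical' for a problem-scale T score."""
    borderline_low, borderline_high = PROBLEM_SCALE_CUTPOINTS[scale_type]
    if t_score > borderline_high:
        return "clinical"      # flagged with the letter C on the profile
    if t_score >= borderline_low:
        return "borderline"    # flagged with the letter B on the profile
    return "normal"


print(classify_problem_t(79, "total_problems"))  # Melinda's Total Problems T score
print(classify_problem_t(55, "total_problems"))  # control children's T score
```

Note that the same T score can fall in different ranges on different scales: a T score of 64 is in the clinical range for Total Problems but in the normal range for a syndrome scale.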
Other Problems
The middle column on Page 2 of the DOF Pro-
file contains a list of items labeled Other Prob-
lems. These are abbreviated versions of the 36
specific problem items, plus open-ended item 89,
which are not included in the five DOF syndrome
scales for classroom observations. Scores for the
37 other problems are included in the Total Prob-
lems score. The averaged ratings for the identified
child (ID) are listed to the left of each item and
averaged ratings for control children (CRL) are
listed to the right of each item. Although the other
problems are not included in the syndrome scale
scores, each of them may be important in its own
right. For example, you can see in Figure 3-2 that
Melinda obtained a score of 1.0 for item 28 (Out of seat)
and 0.5 for item 85 (Behaves irresponsibly). She obtained
scores of 0.0 for all other items in the Other Problems
list. The two control children obtained a score
of 0.5 for item 10 (Clings to adults or too dependent)
and 1.0 for item 28 (Out of seat). You should examine
ratings for these Other Problems items, along with
ratings of items comprising the DOF syndrome
scales, to formulate interpretations of DOF results.

Figure 3-2. Computer-scored DOF Total Problems, Other Problems, and On-task for 8-year-old Melinda Brandt.

On-task
As explained in Chapter 2, observers rate on-
task behavior of the identified child and control
children by marking boxes for on-task or off-task
that represent the last 5 seconds of each 1-minute
interval over the 10-minute observation period.
Total On-task scores can thus range from 0 to 10
for each observation. The DOF Module averages
On-task ratings separately for the identified child
and control children across observation sessions.
Total Scores, T Scores, and Percentiles for On-
task. The bar graph on the right side of Page 2 of
the DOF Profile in Figure 3-2 shows On-task scores
for Melinda Brandt and the two control children.
Separate bars correspond to On-task T scores for
the identified child (dark bar to the left) and the
two control children (lighter bar to the right). Be-
low the bar graph are the mean scores, T scores,
and percentiles obtained by Melinda and the two
control children.
As you can see in Figure 3-2, Melinda obtained
a mean score of 5.5 for On-task. A score of 5.5
corresponds to a T score of 33, which fell at the 5th
percentile for the normative sample of 6-11-year-old
girls. This means that only 5% of the normative
sample received a score equal to or lower than
Melinda's score of 5.5 for On-task. The letter B
printed next to the T score of 33 indicates that
Melinda's T score fell in the borderline clinical
range for On-task. The two control children obtained
a mean score of 9.5 for On-task. A score of
9.5 corresponds to a T score of 51, which fell at
the 54th percentile for the normative sample of
6-11-year-old girls. This means that 54% of the normative
sample received a score equal to or lower
than the control children's score of 9.5 for On-task.
Because On-task raw scores range from 0 to 10,
the mean scores can be easily translated into per-
centages for evaluation reports. Thus, you can see
that across the four 10-minute observation sessions,
Melinda was on-task an average of only 55% of
the time. By contrast, the two control children were
on-task an average of 95% of the time.
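The conversion from On-task marks to a percentage can be sketched in Python as follows. The function names and the four sample sessions are hypothetical; only the 0-to-10 score range and the averaging across sessions come from the description above.

```python
def on_task_score(interval_marks):
    """Count the on-task marks for one 10-minute observation (0 to 10)."""
    if len(interval_marks) != 10:
        raise ValueError("expected one mark per 1-minute interval")
    return sum(1 for mark in interval_marks if mark)


def mean_on_task(sessions):
    """Average the On-task scores across observation sessions."""
    scores = [on_task_score(marks) for marks in sessions]
    return sum(scores) / len(scores)


def percent_on_task(mean_score):
    """Each of the 10 intervals is worth 10 percentage points."""
    return mean_score * 10


# Hypothetical marks (True = on-task) for four sessions scoring 6, 5, 5, and 6.
sessions = [
    [True] * 6 + [False] * 4,
    [True] * 5 + [False] * 5,
    [False] * 5 + [True] * 5,
    [True] * 3 + [False] * 4 + [True] * 3,
]
mean_score = mean_on_task(sessions)
print(mean_score, percent_on_task(mean_score))
```

With these sample sessions the mean is 5.5, the same mean score reported for Melinda, which converts to 55%.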
On-task. The two broken lines on the bar graph
for On-task in Figure 3-2 demarcate borderline
clinical and clinical ranges for judging deviance
for On-task compared to the DOF normative
sample. Low scores for On-task warrant concern,
in contrast to high scores on the problem scales.
As shown in Table 3-1, the borderline clinical range
for On-task scores spans approximately the 3rd to
7th percentiles, which correspond to T scores of 31
to 35. The clinical range is <3rd percentile, which
corresponds to T scores <31. Scores that fall above
the borderline clinical range (>7th percentile) are
considered to be in the normal range. Chapter 6
describes procedures for assigning T scores and
borderline and clinical cutpoints to On-task scores.
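Because low On-task scores are the ones that warrant concern, the comparisons run in the opposite direction from those for the problem scales. A minimal Python sketch, using the T-score cutpoints from Table 3-1 and a hypothetical function name:

```python
def classify_on_task_t(t_score):
    """Return 'normal', 'borderline', or 'clinical' for an On-task T score."""
    if t_score < 31:
        return "clinical"      # below the borderline clinical range
    if t_score <= 35:
        return "borderline"    # T = 31-35
    return "normal"


print(classify_on_task_t(33))  # Melinda's On-task T score
print(classify_on_task_t(51))  # control children's On-task T score
```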
DSM-Oriented Attention Deficit/Hyperactivity
Problems and Inattention and
Hyperactivity-Impulsivity Subscales
Children's problems can also be viewed from
the perspectives of formal diagnostic systems. The
dominant system in the United States is embodied
in the American Psychiatric Association's DSM-IV
and DSM-IV-TR. The DSM's diagnostic categories
are intended to serve many purposes.
Unlike the syndromes derived statistically from the
DOF, DSM diagnostic categories for behavioral
and emotional problems are not derived directly
from problem scores obtained from standardized
assessment. Nevertheless, assessment instruments
like the DOF, and other ASEBA forms like the
CBCL/6-18, TRF, YSR, SCICA, and TOF, are of-
ten used to obtain data on which to base diagnoses.
The CBCL/6-18, TRF, YSR, and SCICA pro-
files include several DSM-oriented scales compris-
ing problem items judged by experienced psychia-
trists and psychologists to be very consistent with
DSM-IV diagnostic categories (for details, see
Achenbach & Rescorla, 2001 and McConaughy &
Achenbach, 2001). The DOF Profile has one DSM-
oriented scale, the Attention Deficit/Hyperactivity
Problems scale, which can be scored from class-
room observations.
To create the DOF Attention Deficit/Hyperac-
tivity Problems scale, we selected DOF items that
were similar to other ASEBA items judged to be
very consistent with DSM-IV symptoms of ADHD.
We also added new DOF items that are similar to
ADHD symptoms that did not have counterparts
among other ASEBA items. Chapter 6 presents
details of how we constructed the DOF Attention
Deficit/Hyperactivity Problems scale.
Figure 3-3 shows Page 3 of the computer-scored
DOF Profile, scored for 8-year-old Melinda Brandt
and two control children in the same classroom.
Total raw scores for the Attention Deficit/Hyper-
activity Problems scale and Inattention and Hyper-
activity-Impulsivity subscales were derived from
the same four DOFs for Melinda and two DOFs
for each control child, as done for the other DOF
scales.
DSM-oriented Scales. As you can see in Figure 3-3,
the DOF Module prints separate bar graphs for T
scores on the DSM-oriented Attention Deficit/Hyperactivity
Problems scale and subscales for the
identified child (dark bar to the left) and averaged
controls (lighter bar to the right). The range of T scores
is shown to the left of the first bar graph. Beneath
the bar graphs are the total raw scores, T scores,
and percentiles for the identified child and control
children for each scale. The Attention Deficit/Hy-
peractivity Problems scale includes 23 items, of
which 10 comprise the Inattention subscale and 13
comprise the Hyperactivity-Impulsivity subscale.
The Attention Deficit/Hyperactivity Problems to-
tal score equals the sum of the Inattention and Hy-
peractivity-Impulsivity subscale total scores. The
averaged ratings of items comprising each of the
two subscales are printed below the total scores, T
scores, and percentiles for the subscales, with rat-
ings of the identified child (ID) in the left column
and ratings for the controls (CRL) in the right col-
umn.
Figure 3-3 shows that Melinda obtained a total
score of 17.5 on the Attention Deficit/Hyperactivity
Problems scale, which corresponds to a T score
of 72, falling above the 97th percentile for the DOF
normative sample of 6-11-year-old girls. The letter
C printed next to the T score of 72 indicates
that Melinda's T score on Attention Deficit/Hyperactivity
Problems fell within the clinical range for
the normative sample. The two control children
obtained a total score of 5.0 on Attention Deficit/
Hyperactivity Problems, which corresponds to a T
score of 53, falling at the 62nd percentile for the
normative sample of 6-11-year-old girls.
The two sections of bar graphs to the right of
graphs for Attention Deficit/Hyperactivity Problems
show how Melinda and the two control children
scored on the Inattention and Hyperactivity-Impulsivity
subscales. You can see that Melinda
obtained a total score of 5.5 on the Inattention
subscale, which corresponds to a T score of 68,
falling at the 97th percentile for the normative
sample of 6-11-year-old girls. The letter B printed
next to Melinda's T score of 68 indicates that her
score on Inattention was within the borderline clinical
range for the normative sample. The two control
children obtained a total score of 2.0 on Inattention,
which corresponds to a T score of 55, falling
at the 69th percentile for the normative sample.
Melinda obtained a total score of 12.0 on the
Hyperactivity-Impulsivity subscale, which corresponds
to a T score of 73, falling above the 97th
percentile. The letter C printed next to Melinda's
T score of 73 indicates that her score on
Hyperactivity-Impulsivity was in the clinical range for the
normative sample. The two control children obtained
a total score of 3.0, corresponding to a T
score of 52, falling at the 58th
percentile.
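As a quick check on how the scale and its subscales fit together, the following Python sketch adds the subscale totals reported above for Figure 3-3 and confirms that they reproduce the scale totals. The variable and function names are hypothetical; the item counts and raw scores are the ones given in the text.

```python
# Number of items on each subscale, as stated in the text.
SUBSCALE_ITEM_COUNTS = {"Inattention": 10, "Hyperactivity-Impulsivity": 13}


def adhd_problems_total(subscale_totals):
    """The scale total is the sum of the two subscale totals."""
    return sum(subscale_totals.values())


# Raw subscale totals reported for Figure 3-3.
melinda = {"Inattention": 5.5, "Hyperactivity-Impulsivity": 12.0}
controls = {"Inattention": 2.0, "Hyperactivity-Impulsivity": 3.0}

print(sum(SUBSCALE_ITEM_COUNTS.values()))  # items on the full scale
print(adhd_problems_total(melinda), adhd_problems_total(controls))
```

The T scores are not additive in the same way: each scale and subscale has its own T score assigned from the normative sample.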
Figure 3-3. Computer-scored DOF Attention Deficit/Hyperactivity Problems and Inattention and Hyperactivity-Impulsivity subscales for 8-year-old Melinda Brandt.

Attention Deficit/Hyperactivity Problems and
Subscales. The two broken lines on the bar graphs
in Figure 3-3 demarcate borderline clinical and
clinical ranges for the DSM-oriented scale and
subscales. As shown in Table 3-1, the borderline
range for the DSM-oriented Attention Deficit/
Hyperactivity Problems scale and the Inattention
and Hyperactivity-Impulsivity subscales spans the
93rd to 97th percentiles, which correspond to T
scores of 65 to 69. The clinical range is >97th percentile,
which corresponds to T scores >69. Scores
falling below a T score of 65 (below the 93rd
percentile) are considered to be in the normal range.
As indicated in the previous section, Melinda's
scores on the Attention Deficit/Hyperactivity Problems
scale and the Hyperactivity-Impulsivity
subscale were in the clinical range, while her score
on the Inattention subscale was in the borderline
clinical range. Scores for the two control children
were in the normal range.
Summary Report for Classroom
Observations
Figure 3-4 shows Page 4 for the DOF Profile
derived from classroom observations of Melinda
Brandt and the two control children. Page 4 is a
Summary Report that provides descriptive infor-
mation about each of the four DOFs for Melinda
and four DOFs for the control children that were
used to score the DOF Profile. You can see in the
Summary Report that the gender of the two con-
trol children was the same as for the identified child
(female) and that the same observer (Valerie Stone)
completed all eight DOFs. The observations were
done in the morning and afternoon of two differ-
ent days (02/15/07 and 02/17/07) and during a va-
riety of classroom activities (reading, social stud-
ies, class meeting, and math). Chapter 5 discusses
Melinda's case in more detail, including reports
from her mother and teacher about her behavior at
home and at school.
Narrative Report for Classroom
Observations
In addition to printing the DOF Profile and Sum-
mary Report, the DOF Module gives users the op-
tion of printing a Narrative Report that summa-
rizes scale scores on the DOF Profile for the iden-
tified child and control children. You can easily
import the Narrative Report into a word process-
ing document when writing evaluation reports. This
not only makes report writing more efficient, but
also guarantees the accuracy of the scores cited for
each DOF scale. Another option is to include the
DOF Narrative Report as an addendum to evaluation
reports and/or as a note in a child's case record.
Figure 3-5 shows the DOF Narrative Report for
observations of Melinda Brandt.
DOF PROFILE FOR RECESS
OBSERVATIONS
The DOF Profile for recess observations con-
sists of 2 pages. Page 1 displays bar graphs for the
Aggressive Behavior syndrome scale and Total
Problems, plus a list of items and ratings for Other
Problems not scored on the Aggressive Behavior
syndrome. Page 2 is a Summary Report with de-
scriptive information about the DOFs that were
used to score the DOF Profile.
Figure 3-6 shows Page 1 of the DOF Profile for
recess observations, scored for 9-year-old Ricky
Johnson (not his real name) and two control chil-
dren. As you can see in the column on the right
side at the top of the profile, the observer (Harry
Provo) obtained six observations of Ricky, four
observations of Control Child 1, and two observa-
tions of Control Child 2. The observations were
done over an 8-day observation period from
10/09/07 to 10/16/07. The observer labeled the observation
set for the twelve DOFs as "Playground Fall
2007."
Aggressive Behavior Syndrome Scale
The bar graph on the left side of the DOF Pro-
file in Figure 3-6 shows scores on the Aggressive
Behavior Syndrome scale for Ricky Johnson, the
identified child, (dark bar on the left) and the two
control children (lighter bar on the right). (Chap-
ter 6 describes our factor analyses to derive the
Aggressive Behavior syndrome scale.)
Figure 3-4. Summary Report of DOFs for classroom observations of 8-year-old Melinda Brandt.
Figure 3-5. Narrative Report summarizing DOF results for classroom observations of 8-year-old
Melinda Brandt.
Figure 3-6. Computer-scored DOF Profile for recess observations of 9-year-old Ricky Johnson.
Aggressive Behavior. Beneath the bar graph, you
can see the total raw scores, T scores, and percen-
tiles for the identified child and control children.
The range of T scores is shown to the left of the
bar graph. The averaged ratings of each of the items
comprising the Aggressive Behavior syndrome are
printed below the scores and percentiles, with rat-
ings of the identified child (ID) in the left column
and ratings for the controls (CRL) in the right col-
umn.
On the DOF Profile for Ricky, you can see that
he obtained a total score of 5.5 on the Aggressive
Behavior syndrome, which corresponds to a T score
of 74, which was above the 97th percentile for the
DOF normative sample of 6-11-year-old boys. This
means that at least 97% of the DOF normative
sample received a score lower than Ricky's score
of 5.5 on Aggressive Behavior. The letter C indicates
that Ricky's T score of 74 fell within the clinical
range for the normative sample. The two control
children obtained an averaged total score of
1.0 on Aggressive Behavior, which corresponds to
a T score of 58, falling at the 79th percentile for the
DOF normative sample of 6-11-year-old boys.
When you examine the averaged item scores
listed below the bar graph, you can see that the
observer rated 6 of 9 problems as present for Ricky:
14. Cruel, bullies, or mean to others; 30. Gets into
physical fights; 31. Gets teased; 47. Screams; 66.
Teases; and 86. Bossy. The labels are abbreviated
versions of the problem items. The control chil-
dren, by contrast, were rated 0.0 on all items ex-
cept 47. Screams.
Aggressive Behavior. The two broken lines on the
bar graph for Aggressive Behavior in Figure 3-6
demarcate borderline and clinical ranges for judging
the deviance of scores, as compared to the normative
sample. As shown in Table 3-1, the borderline
clinical range for the DOF Aggressive Behavior
syndrome scale spans the 93rd to the 97th percentiles,
which correspond to T scores of 65 to
69. The clinical range is >97th percentile, which
corresponds to T scores >69. Scores that fall in the
borderline clinical range between the two broken
lines on the DOF Profile are high enough to be of
concern, but are not as clearly deviant as scores
that fall in the clinical range above the top broken
line. The DOF Module prints the letter B next to T
scores that fall in the borderline clinical range and
the letter C next to T scores that fall in the clinical
range. Scores that fall below the borderline clinical
range (<93rd percentile) are considered to be in
the normal range. Chapter 6 describes procedures
for assigning borderline clinical and clinical
cutpoints to the DOF Aggressive Behavior syn-
drome scale.
Total Problems
The bar graph to the right of Aggressive Be-
havior in Figure 3-6 shows the Total Problems
scores for Ricky and the two control children.
Total Problems. The Total Problems score is the
sum of the averaged 0-1-2-3 ratings for 88 prob-
lem items on each DOF. (Item 28. Out of seat is
not included in the Total Problems score for recess
observations.) The DOF Module separately aver-
ages the item ratings for the identified child (Ricky)
and the two control children. Total raw scores from
averaged item ratings, T scores, and percentiles are
printed beneath the bar graph for the identified child
and controls.
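The exclusion of item 28 from the recess Total Problems score can be sketched in Python as follows. The names and the sample averaged ratings are hypothetical, and the sketch assumes the problem items are numbered 1 through 89; it only illustrates the rule stated above and is not the DOF Module's code.

```python
ALL_PROBLEM_ITEMS = set(range(1, 90))   # assumes items are numbered 1 through 89
RECESS_EXCLUDED_ITEMS = {28}            # 28. Out of seat
RECESS_ITEMS = ALL_PROBLEM_ITEMS - RECESS_EXCLUDED_ITEMS


def total_problems_recess(averaged_ratings):
    """Sum averaged item ratings, skipping items not scored at recess."""
    return sum(rating for item, rating in averaged_ratings.items()
               if item in RECESS_ITEMS)


# Hypothetical averaged ratings; a rating for item 28 is ignored.
averaged_ratings = {14: 1.0, 28: 2.0, 47: 0.5}
print(len(RECESS_ITEMS), total_problems_recess(averaged_ratings))
```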
As you can see in Figure 3-6, Ricky obtained a
total score of 11.0 for Total Problems, which corresponds
to a T score of 77, which was above the
98th percentile for the normative sample of
6-11-year-old boys. The letter C printed next to the T
score of 77 indicates that Ricky's T score fell in
the clinical range for Total Problems. By contrast,
the two control children received an averaged Total
Problems score of 3.5, which corresponds to a
T score of 62, falling at the 89th percentile for the
normative sample of 6-11-year-old boys. The letter
B printed next to the T score of 62 indicates
that the control children's T score fell in the borderline
clinical range for Total Problems.
Total Problems. The two broken lines on the bar
graph display in Figure 3-6 demarcate borderline
and clinical ranges for judging deviance on Total
Problems, as compared to the DOF normative
sample. As shown in Table 3-1, the borderline range
for the Total Problems score spans approximately
the 84th to the 90th percentiles, which correspond
to T scores of 60 to 63. The clinical range is >90th
percentile, which corresponds to T scores >63. You
can see on the profile that the borderline clinical
and clinical ranges for Total Problems are lower
than for the DOF Aggressive Behavior syndrome
scale. This is because the Total Problems score for
recess observations comprises 88 problem items,
in contrast to 9 items for Aggressive Behavior.
Scores that fall below the borderline clinical range
(<84th percentile) are considered to be in the normal
range. As done for Aggressive Behavior, the
letter B is printed next to T scores that fall in the
borderline clinical range for Total Problems, while
the letter C is printed next to T scores that fall in
the clinical range.
Other Problems
A list of items labeled Other Problems is
printed on the right side of the DOF Profile for
recess observations. These are abbreviated versions
of the 78 specific problem items, plus open-ended
item 89, which are not included in the Aggressive
Behavior syndrome scale. The averaged ratings for
the identified child (ID) are listed to the left of each
item and averaged ratings for control children
(CRL) are listed to the right of each item. Scores
for the Other Problems items are included in the
Total Problems score. As you can see in Figure 3-6,
Ricky obtained scores of 0.5 to 1.0 for 3. Argues;
8. Difficulty waiting turn in activities or tasks;
20. Disobedient; 22. Doesn't seem to feel guilty
after misbehaving; 67. Temper tantrums, hot temper,
or seems angry; and 83. Doesn't get along with
peers. He obtained scores of 0.0 for all other items
in the Other Problems list. The two control children
obtained scores of 0.5 to 1.0 for 8. Difficulty
waiting turn in activities or tasks; 75. Withdrawn,
doesn't get involved with others; 82. Clumsy, poor
motor control; and 87. Complains. You should examine
ratings for these Other Problems items, along
with ratings of items comprising the DOF Aggres-
sive Behavior syndrome, to formulate interpreta-
tions of DOF results.
Summary Report for Recess
Observations
Figure 3-7 shows Page 2 for the DOF Profile
derived from recess observations of Ricky Johnson
and the two control children. Page 2 is a Summary
Report that provides descriptive information about
each of the six DOFs for Ricky and six DOFs for
the control children that were used to score the DOF
Profile. You can see in the Summary Report that
the observations spanned an 8-day period from
10/09/07 to 10/16/07. The observer (Harry Provo) did
two observations of Ricky on each of three days,
four observations of Control Child 1 across
the first two days, and two observations of Control
Child 2 on the third day. All of the observations
were conducted on the playground. Chapter
5 discusses Ricky's case in more detail, including
reports about his behavior from his mother and
teacher.
Narrative Report for Recess
Observations
Figure 3-8 shows the DOF Narrative Report
summarizing scale scores on the DOF Profile for
recess observations of Ricky Johnson. As indicated
earlier, you can easily import the Narrative Report
into a word processing document for evaluation
reports. Or you can include the Narrative Report
as an addendum to evaluation reports and/or as a
note in a child's case record.
SUMMARY
The computer-scored DOF Profile provides a
visual, quantitative picture of children's problems
rated by observers in classrooms or at recess. The
DOF can be scored only by computer because av-
eraging scores across multiple observation sessions
would be too complex for hand-scoring. The DOF
identified child for computer-scoring. DOFs for
control children are optional, but recommended in
order to provide a standard for evaluating the iden-
Figure 3-7. Summary Report of DOFs for recess observations of 9-year-old Ricky Johnson.
Figure 3-8. Narrative Report summarizing DOF results for recess observations of 9-year-old
Ricky Johnson.
Chapter 4
Training DOF Observers and Conducting
School Observations
As indicated in earlier chapters, the DOF is designed
for rating observations of children's behavior
in school classrooms, at recess, and other group
settings. Observers should have some knowledge
of child behavior and development, as well as
theory and methodology of behavioral assessment.
Observers may be paraprofessionals, such as teachers'
aides, undergraduate and graduate students, research
assistants, and professionals in education,
school psychology, clinical psychology, and related
disciplines. Page iii describes user qualifications
for the DOF. For professionals with training in standardized
assessment, a thorough understanding of
the procedures described in this Manual is usually
sufficient for using the DOF and interpreting the
DOF Profile. Paraprofessionals, students, and research
assistants will require supervision and training
by a qualified professional. In this chapter, we
provide guidelines for training DOF observers and
conducting observations in school settings. We also
discuss procedures for assessing inter-observer
agreement and inter-rater reliability.
TRAINING DOF OBSERVERS
All users should read Chapters 1 through 3 of
this Manual to learn about the DOF and the
computer-scored DOF Profile. Chapter 2 provides
instructions for rating the DOF problem items and
the child's on-task behavior. Users should pay special
attention to the Guidelines for Rating Specific
DOF Problem Items in Chapter 2. Brief instructions
are also provided on the DOF.
As a first step in training DOF observers, supervisors
should meet with them to discuss the
DOF rating procedures described in Chapter 2. It
is also good to provide copies of the Guidelines
for Rating Specific DOF Problem Items for observers
to take with them to observation sites. After
initial training with the DOF Manual, trainees
should practice DOF recording and rating procedures
through paired observations of children in
school classrooms or comparable group settings.
One observer can be a trainee and the second can
be an experienced DOF observer. Or two trainees
can practice together and then meet with an experienced
DOF observer as trainer. Another option
is to have pairs of trainees view videotapes of children
in group settings and then discuss their observations
and ratings with the trainer. We used
both approaches in our research to develop the
DOF. In either approach, responsible trainers must
adhere to requirements specified by an appropriate
institutional review board or school administrative
office to obtain proper permission for direct
observations and/or videotapes of practice
cases for training purposes.
The two observers should each observe the same
practice case for the same 10-minute period. During
the 10 minutes, each observer writes a narrative
description of the child's behavior and rates
the child as being on-task or off-task at the end of
each 1-minute interval, as instructed on the DOF
and in Chapter 2. At the end of the 10-minute observation
period, without discussing their observations,
each observer then rates the child on the
89 DOF problem items. After completing their
DOFs, the two observers should compare their ratings
for on-task and their ratings for the problem
items and should discuss any discrepancies between
their observations and ratings. However,
observers should make no changes on any of their
DOF ratings based on these comparisons. The two
observers should then select a new child for another
10-minute observation, following the same
procedure as for the first child. We recommend that
pairs of observers select at least five children as
practice cases for training on the DOF. Observers
should rate each child on a separate DOF.
After completing their paired observations of
the practice cases, trainees should meet with the
trainer to discuss discrepancies in their on-task and
item ratings, referring to the instructions and guide-
lines in Chapter 2. Thereafter, pairs of trainees, or
a trainee and an experienced observer, should ob-
serve and rate additional children until good agree-
ment is reached. We recommend making paired
observations of at least 5 to 10 practice cases for
training purposes. When the DOF is used in re-
search protocols, we recommend paired observa-
tions on additional cases to assess inter-observer
agreement and inter-rater reliability, as discussed
in later sections.
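To prepare for the discussion with the trainer, a pair of observers can list exactly where their ratings differ. The Python sketch below is one hypothetical way to do this; the function names and the sample ratings are assumptions, and it is not the inter-observer agreement or reliability procedure discussed in later sections. It produces a list for discussion only and leaves both observers' ratings unchanged.

```python
def item_discrepancies(ratings_a, ratings_b):
    """Return {item: (rating_a, rating_b)} for items rated differently.

    Each argument maps item number -> 0-1-2-3 rating; items an observer
    left unrated are treated as 0 (an assumption of this sketch).
    """
    items = sorted(set(ratings_a) | set(ratings_b))
    return {
        item: (ratings_a.get(item, 0), ratings_b.get(item, 0))
        for item in items
        if ratings_a.get(item, 0) != ratings_b.get(item, 0)
    }


def on_task_discrepancies(marks_a, marks_b):
    """Return the 1-minute intervals (1-10) where the on-task marks differ."""
    return [minute
            for minute, (a, b) in enumerate(zip(marks_a, marks_b), start=1)
            if a != b]


# Hypothetical ratings of the same child by two observers.
observer_a_items = {28: 2, 47: 1, 66: 1}
observer_b_items = {28: 1, 47: 1, 86: 1}
observer_a_on_task = [True, True, False, True, True, True, False, True, True, True]
observer_b_on_task = [True, True, False, True, False, True, False, True, True, True]

print(item_discrepancies(observer_a_items, observer_b_items))
print(on_task_discrepancies(observer_a_on_task, observer_b_on_task))
```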
GUIDELINES FOR OBSERVATIONS
IN SCHOOLS
Because the DOF is designed primarily for ob-
serving children in school settings, we offer the
following guidelines for using the DOF in schools.
Trainers, supervisors, and researchers can adapt
these guidelines to fit their own procedures and
particular schools.
Obtaining Permission for Observations
All users of the DOF should comply with school
policies regarding parental permission for direct
observations of children. Observations of practice
cases and control children do not require observ-
ers to know the name of the child. For these obser-
vations, passive consent may be all that is required.
For example, a principal or other school adminis-
trator may contact parents by letter to inform them
that children will be observed anonymously at
school and explain the purpose of the observations.
Parents can then contact the school administrator
if they do not want their child observed. If an iden-
tified child is observed as part of a formal assess-
ment, such as a comprehensive special education
evaluation, then the evaluator and appropriate
school staff must follow procedures for obtaining
parental permission. For research, the principal
investigator must obtain approval from an institu-
tional review board to conduct direct observations
of children as part of a research protocol. School-
based assessments and research protocols usually
require letters and forms for obtaining informed
consent from parents.
Supervisors and researchers should create a
standard form that indicates to the child's teacher
that the parent of an identified child has given per-
mission for the child to be observed. This form
should adhere to policies of the school administra-
tive office or an appropriate institutional review
board, depending on the purpose of the observa-
tions. The observer can then present this form to
the teacher before the first observation.
Scheduling Observations
Supervisors and observers should develop a
standard procedure for scheduling observations
with teachers and other relevant school personnel.
Scheduling can be done by the supervisor, a desig-
nated assistant, or the observer. Supervisors and
observers should determine the best format for
regular communication, (e.g., e-mail, phone voice
mail, cell phone). The supervisor should provide
the necessary information for each scheduled
observation: child's name, child's teacher, name of
the school, and directions to the school. Some su-
pervisors may want to complete all of the demo-
graphic information about the child on Page
1 of the DOF. Others may depend on the observer
to complete the information after the observation.
If an observer is sick or otherwise unable to com-
plete a scheduled observation, he/she should con-
tact the person scheduling observations as soon as
possible to reschedule the observation. Observa-
tions of the identified child and control children
should all be done within 1 to 2 weeks, whenever
feasible. The next sections present guidelines for
observers in school settings. Supervisors and train-
ers may adapt these guidelines to fit their needs
and setting.
Guidelines for DOF Observers in Schools
Prior to Arrival at a School
Whenever possible, call the school the morn-
ing of your scheduled observation to make sure
the child is in school that day. If inclement
weather is likely, listen to the radio the morn-
ing of the observation to determine whether
school has been cancelled.
Regardless of the weather, plan to arrive at
least 15 minutes early on the first observation
day to allow time to visit the school office and
introduce yourself to the teacher. You may not
need to arrive as early on your second observa-
tion day.
Professional Dress
Dress professionally. Do not wear shorts, jeans,
t-shirts, shirts that expose the midriff, or other
very casual clothing. Do not wear sleeveless
shirts or tank tops without jackets because these
are sometimes against school dress codes. Wear-
ing earrings is acceptable, but remove other
visible piercings for the observation session.
Avoid clothing or accessories with slogans be-
cause some schools do not allow them.
Because some children and adults have allergies,
do not wear scented products (e.g., perfume,
cologne, lotion, or after-shave).
Beginning Observations
When you enter the school building, go to the
school's office. Introduce yourself to the secretary
and inquire about the procedure for visiting the
school. Many schools will require you
to sign in and/or wear a visitor nametag. In-
troduce yourself to the principal if he/she is
readily available.
Ask at the office for directions to the child's
classroom. You may also want to inquire at the
office if the child's teacher is aware that you
will be observing the child that day.
When you arrive at the classroom, wait for an
appropriate moment to introduce yourself privately
to the teacher. For instance, you might
say,
"I'm Jane Doe. The family of a child
in your classroom has granted permission
for me to observe him/her. My supervisor
[or I] contacted you to arrange
the time for this observation."
Show the permission form with the child's
name to the teacher. This is a good way
to identify the child without stating the
child's name aloud.
Ask the teacher to quietly point out the identified
child. Make sure that the child does not see
that he/she is being singled out. Tell the teacher
that you are not supposed to know anything
about the child to ensure that the teacher does
not provide background or other information
that may influence your observations.
When you begin your observation, fold the DOF
so that Page 1 with the identified child's name
is not visible. If the identified child's name is
not on the DOF, you can wait to write in the
child's full name until after you have completed
your observation and left the room so that no
one will see the name of the child being ob-
served.
If the teacher would like to introduce you in
the classroom, ask the teacher to avoid indicat-
ing that you will observe a specific child. For
instance, the teacher could say:
"This is Jane. She is here to learn about
what we do in second grade. So she is
just going to watch our class for a little
while."
Some children may be curious and might ask
what you are doing. If this happens, just give a
very general comment about observing the
class. For example, you might say:
"I haven't been in second grade for a
long time, so I am here to see what
children do in second grade."
If a child asks what you are writing, explain
that you are making notes to remind yourself
of what you saw. Do not show children the DOF,
but don't try to hide it in ways that might raise
suspicions. Make sure the identified child's
name cannot be seen. Also make sure that the
permission form with the identified child's
name is not visible.
Some children might want to show you things,
such as their schoolwork, drawings or stories.
You can briefly acknowledge these with a nod
or smile, but don't be overly encouraging. You
won't be able to properly observe if you are
interacting with children. If necessary, move to
a place farther away from a child who wants to
interact. If a child is insistent on showing some-
thing, you might say:
"Your drawing is great. Thank you
for showing it to me. I can't talk to
you about it now because my job is
to watch what everyone is doing, so
that I can really learn about what goes
on in the whole class."
Observing the Identified Child
Find a place in the classroom where you can
observe the identified child unobtrusively, but
can clearly see what the child is doing, including
seeing the child's face. Do not make it obvious
which child you are watching. If the child
moves to another part of the room, you can
move to a new spot to see better. When you do
move, try to do so without calling attention to
yourself.
For classroom observations, observe the iden-
tified child during an academic activity. Aca-
demic activities include math, reading, social
studies, and science, but may also include in-
dependent seatwork, circle discussions for
young children, and other learning activities.
If the class activity is not an academic activity
(such as snack, free time, an assembly, a special
[e.g., gym, music], or a birthday party), wait for
normal classroom academic activities. You may
continue an on-going observation during a very
brief non-academic activity, such as snack time or
lining up to get materials or changing activities
(e.g., moving from a reading group back to a
student's desk to do math). You can write the
activity in the spaces on Page 2 where you make
notes of your observations. If there is more than
one activity during a 10-minute observation, write
the predominant activity in the space next to
"Setting" on Page 1 of the DOF.
Observe the identified child for a full 10 min-
utes. If the child leaves the classroom for any
reason (e.g., bathroom, drink of water) during
this time, stop observing, and mark the break
on the DOF where you are writing your notes.
Begin observing again 30 seconds after the
child returns to the classroom to allow the child
time to settle in. You do not have to start an
observation session over again if breaks like
this occur after the first 2 minutes of observa-
tion. For example, you might obtain 6 minutes
of observations before a child leaves the class-
room, then wait 2 minutes for the child to re-
turn to class, allow the child to settle in for 30
seconds, and then finish the additional 4 min-
utes of the 10-minute observation period.
If you have observed the child for 2 minutes or
less and then cannot complete the remaining 8
minutes of observation, discontinue that obser-
vation session and begin a new 10-minute ob-
servation with a new DOF. For example, you
may begin an observation in the last minute of
an academic period and then the child leaves
the classroom for a one-hour break for lunch
and recess. In such a case, you should begin a
new 10-minute observation when the child re-
turns to the classroom after the break.
After each 10-minute observation, stop observ-
ing and complete your ratings of the 89 DOF
items before beginning a new observation.
As indicated above, try to get a full 10-minute
observation for each DOF. If you obtain at least
8 to 9 minutes of an observation, but then can-
not complete the remaining 1 to 2 minutes in
the same time period (i.e., the morning or
afternoon of the same day), you may count that
8-to-9-minute session as a complete observation.
However, an 8-to-9-minute observation, in place
of 10 minutes, should be very rare.
If the identified child shows extremely unusual
behavior, proceed with the observation anyway.
After all your observations and DOF ratings
are completed, you may ask the teacher privately
if the child's behavior was very unusual
and note the teacher's answer on the DOF.
However, consultation with teachers is gener-
ally not encouraged because it may lead to in-
formation that will compromise your status as
an independent observer. Consult your super-
visor after the observation if you have concerns
about the child.
Repeat the observation procedures using a new
DOF for the next 10-minute observation of the
identified child.
On Page 1 of each DOF, write the identified
child's full name. As indicated earlier, you can
wait to write the identified child's full name
until after you leave the room.
Write your first and last name in the box for
"Observer's Name."
Write the number of each observation in the
box for "Observation #" in the right-hand corner
of Page 1 of the DOF. Number observation
sessions for each identified child consecutively
over the observation sessions (e.g., 1, 2, 3, 4,
5, 6).
In the section for "Observed Child," check the
box for "Identified child" to indicate each DOF
that was completed for the identified child.
Write the time you begin each observation in
the box for "Time of Day."
Write the date of the observation in the box for
"Today's Date."
Check one box for "Setting": "Class" or "Recess."
You can also write in the predominant activity
during the observation session.
If not done already, record other information
on Page 1 of each DOF for the identified child:
child's gender, age, ethnic group or race,
grade or level in school, and birthdate
(if known). If you do not know some of this
information, have your supervisor complete
those sections. You or your supervisor can enter
an ID # for the identified child on each DOF,
if not done already. Each identified child should
have a unique ID # that is the same for all DOFs
for that child and any matched controls. You
or your supervisor can also assign a label for
"Observation Set." See Page 4 of the DOF for
instructions for completing information on
Page 1.
Observing Control Children
For each identified child, select one or two con-
trol children of the same gender and age in the
same classroom. You do not have to know the
names of the control children. Choose control
children who do not sit near the identified child
and do not interact with the identified child.
For example, you can choose a child who sits
at a diagonal across the room from the identi-
fied child, or who sits in a group of other chil-
dren across the room. Try to use the same strat-
egy each time for choosing control children. If
a control child does interact with the identified
child, indicate this in your notes on Page 2 of
the DOF, but continue your observation of the
control child for the full 10 minutes.
Do not select the control child on the basis of
particular behaviors that the child displays, be-
cause the control child should be an anonymous
random selection. The only exclusions for
control children are obvious physical disabilities
(e.g., in a wheelchair or has a broken arm)
or mental disabilities or mental retardation
(e.g., Down syndrome, seizure disorder).
Write observations for each control child in the
same way that you wrote observations of the
identified child. Use a separate DOF for each
10-minute observation of each control child.
We recommend observing two control children
for each identified child, if possible.
Try to alternate observations of the control child
with observations of the identified child. That
is, if you are observing two control children on
the same day as the identified child: observe
Control Child 1, then observe the Identified
Child, then observe Control Child 2; then ob-
serve Control Child 1 again, then observe the
Identified Child again, then observe Control
Child 2 again. This is the ideal sequence for
observations, but it may not always be possible
on the same day. For example, for easier sched-
uling, you and your supervisor may decide to
obtain several DOFs for one control child on
the same day and then several DOFs for a sec-
ond control child on a different day. If you do
this, you should still alternate observations of
the control child with observations of the iden-
tified child.
In the section for "Identified Child's Name"
on Page 1 of the DOF, write a brief description
of each control child (e.g., girl with short blond
hair, boy with dark curly hair). Or write an
abbreviation to show the link between the identified
child's name and the control child (e.g., if
the identified child is John Eric Smith, Control
Child 1 might be labeled "JES-C1"). This
will provide an additional check to be sure
which DOFs belong to which control child.
Write your first and last name in the box for
"Observer's Name."
Write the number of each observation of each
control child (e.g., 1, 2, 3, 4) in the box for
"Observation #" in the right-hand corner of
Page 1 of the DOF. You can add the letter "C"
to each observation number (1C, 2C, 3C, 4C)
as a double check for indicating controls versus
identified children.
In the section for "Observed Child," check the
box for "Control Child 1" to indicate each DOF
that was completed for the first control child.
Check the box for "Control Child 2" to indicate
each DOF that was completed for the second
control child.
Write the time you begin each observation in
the box for "Time of Day."
Write the date of the observation in the box for
"Today's Date."
Check one box for "Setting": "Class" or "Recess."
You can also write in the predominant activity
during the observation session.
Record other information on Page 1 of each
DOF for the control child: child's gender, age,
ethnic group or race, and grade or level in
school. If you do not know the age of control
children, write the age of the identified child
as an estimate of age or leave blank. You will
not know the control child's birthdate, so leave
that blank. You or your supervisor can enter an
ID # on each DOF, if not done already, to indi-
cate which identified child is linked to each
control child. Each identified child should have
a unique ID # that is the same for all DOFs for
that child and any controls matched to the iden-
tified child. You or your supervisor can also
assign a label for "Observation Set." See Page
4 of the DOF for instructions for completing
information on Page 1.
Completing Observations
Always thank the teacher after you complete
your observations. You do not need to inter-
rupt class to do this. For example, when you
are ready to leave, you can stand by the door
until you make eye contact with the teacher and
then mouth the words "thank you."
Check Page 1 to make sure all information has
been completed on each DOF.
Return the completed DOFs to your supervisor
as soon as you have completed
all observations for a particular identified child
and any matched control children.
Return completed DOFs for control children
to your supervisor at the same time as you re-
turn DOFs for the identified child.
ASSESSING INTER-OBSERVER
AGREEMENT
As part of their training of DOF observers, su-
pervisors and researchers may want to assess in-
ter-observer agreement (IOA) on practice cases.
IOA refers to the extent to which two observers
agree on the occurrence and nonoccurrence of the
same behavior over the same observation period.
We recommend using the Percent Agreement In-
dex to calculate IOA (Hintze, 2005). This method
involves counting the total number of agreements
and dividing that by the total of agreements plus
disagreements. When computing the Percent
Agreement Index, it is important to consider IOA
separately for occurrences of a behavior and for
nonoccurrences of the same behavior in order not
to inflate the level of agreement. For example, sup-
pose over ten 1-minute intervals, two observers
recorded six occurrences of out-of-seat behavior
for the same intervals and two non-occurrences for
the same intervals, but they disagreed on occurrence
versus nonoccurrence for two intervals. IOA
for occurrence agreements would be 6/(6+2) = 6/8 =
.75 x 100 = 75%. IOA for nonoccurrences would
be 2/(2+2) = 2/4 = .50 x 100 = 50%. However, if
both occurrences and nonoccurrences were included
in the same calculation, IOA would be
(6+2)/(6+2+2) = 8/10 = .80 x 100 = 80%.
To avoid inflating IOA, we compute IOA sepa-
rately for occurrences and nonoccurrences and then
compute the mean IOA to obtain a single index of
IOA. In the above example, mean IOA would be
(.75 + .50)/2 = .625 x 100 = 62.5%. The generally
accepted level of IOA for good agreement on discrete
behaviors is 80 to 90% (Hintze, 2005). So in
this example, the 62.5% IOA would indicate a need
for more training, and/or perhaps more refinement
of the definition of out-of-seat behavior.
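The Percent Agreement Index calculations above can be sketched in Python. This is a minimal illustration, not part of the DOF materials: the function name percent_agreement and its return keys are hypothetical, and raising an error when a denominator is zero is an assumption, because the text does not say how to handle that case.

```python
def percent_agreement(o, n, d):
    """Compute IOA percentages from interval counts.

    o: occurrence agreements (both observers recorded the behavior)
    n: nonoccurrence agreements (neither observer recorded it)
    d: disagreements (only one observer recorded it)
    """
    if o + d == 0 or n + d == 0:
        # Assumption: the index is treated as undefined here.
        raise ValueError("IOA is undefined when a denominator is zero")
    ioa_o = 100 * o / (o + d)               # occurrences only
    ioa_n = 100 * n / (n + d)               # nonoccurrences only
    return {
        "occurrence": ioa_o,
        "nonoccurrence": ioa_n,
        "mean": (ioa_o + ioa_n) / 2,        # single index recommended in the text
        "combined": 100 * (o + n) / (o + n + d),  # the inflated alternative
    }


# Out-of-seat example from the text: 6 occurrence agreements,
# 2 nonoccurrence agreements, 2 disagreements.
result = percent_agreement(6, 2, 2)
print(result)
```

For the out-of-seat example, the sketch returns 75% for occurrences, 50% for nonoccurrences, a mean of 62.5%, and 80% for the combined calculation, matching the figures worked out above.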
IOA for DOF On-Task
On Page 2 of the DOF, observers rate the ob-
served child as being on-task or off-task dur-
ing the last 5 seconds of each 1-minute interval
over a 10-minute observation period. Figure 4-1
provides a worksheet that you can copy and use to
compute inter-observer agreement (IOA) for DOF
On-task ratings across five practice cases. Each ob-
server should complete one DOF for the same 10-
minute observation period for each practice case.
When the two observers both rate the child as on-
task in the same 1-minute interval, consider this
an Occurrence Agreement. When the two observ-
ers both rate the child as off-task in the same 1-
minute interval, consider this a Nonoccurrence
Agreement. When one observer rates the child as
on-task and the other observer rates the child as
off-task in the same 1-minute interval, consider
this a Disagreement. The five columns for prac-
tice cases in the worksheet provide spaces for re-
cording the number of Occurrence Agreements (O),
Nonoccurrence Agreements (N), and Disagree-
ments (D) across each 10-minute observation pe-
riod for each case. To do this, follow the instruc-
tions at the top of the worksheet:
Occurrence Agreements (O): Record the num-
ber of 1-minute intervals when both observers
rated the child as on-task for each case.
Nonoccurrence Agreements (N): Record the
number of 1-minute intervals when both observ-
ers rated the child as off-task for each case.
Disagreements (D): Record the number of 1-
minute intervals when one observer rated the
child as on-task and the other observer rated
the child as off-task for each case.
To compute IOA for On-task, follow the steps
shown in Figure 4-2.
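The tallying of O, N, and D described above can be sketched in Python for users who keep their ratings electronically. This is only an illustration under stated assumptions: the function name tally_on_task is hypothetical, each observer's ratings are assumed to be stored as a list of ten True (on-task) or False (off-task) values, and the ratings shown are invented for the example.

```python
def tally_on_task(observer_a, observer_b):
    """Count Occurrence Agreements (O), Nonoccurrence Agreements (N),
    and Disagreements (D) across paired 1-minute interval ratings.

    Each argument is a list of booleans: True = on-task, False = off-task.
    """
    if len(observer_a) != len(observer_b):
        raise ValueError("Both observers must rate the same intervals")
    o = n = d = 0
    for a, b in zip(observer_a, observer_b):
        if a and b:
            o += 1          # both rated on-task
        elif not a and not b:
            n += 1          # both rated off-task
        else:
            d += 1          # one on-task, one off-task
    return o, n, d


# Invented ratings for one 10-minute practice case.
ratings_a = [True] * 6 + [False] * 4
ratings_b = [True] * 5 + [False] * 4 + [True]
print(tally_on_task(ratings_a, ratings_b))
```

The three counts for each case would then be entered in the corresponding column of the worksheet and summed across cases.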
Figure 4-3 shows an example of a completed
worksheet of IOA for DOF On-task based on five
practice cases observed by a pair of observers
(Nancy Jones and Valerie Stone). Each observer
completed one DOF for the same 10-minute
observation of each practice case. The entries in the
columns under each practice case show the number
of Occurrence Agreements (O), Nonoccurrence
Agreements (N), and Disagreements (D) for each
case. The last column shows the total O, N, and D
across all five cases: O = 28, N = 18, and D = 4. The
bottom of the worksheet shows the computations
of IOA using these data. IOA_O for Occurrence
Agreements was 87.5%; IOA_N for Nonoccurrence
Agreements was 81.8%; and Mean IOA was
84.5%, which is within generally accepted criteria
for agreement. This suggests that no further training
would be necessary for On-task ratings by these
two observers.
Figure 4-1. Worksheet for computing IOA for On-task.
Figure 4-2. Steps for computing IOA for On-task.
Figure 4-3. Example of IOA for On-task.
IOA for DOF Problem Items
With some modification, the same procedures
can be used to determine IOA for the DOF prob-
lem items. This is a bit more complicated than for
On-task because the DOF contains multiple target
behaviors rated over a 10-minute observation pe-
riod. However, over such a relatively short period,
most observers are likely to rate only a few DOF
items as present. To compute IOA for the DOF
problem items, you must first dichotomize the 0-
1-2-3 DOF ratings into occurrences versus
nonoccurrences. To do this, consider ratings of 1,
2 or 3 as an occurrence of the problem and a rating
of 0 as a nonoccurrence. Then, determine the num-
ber of occurrences and nonoccurrences for each
problem for the same child rated by the two ob-
servers. Because many items are likely to be rated
0 by both observers, we recommend counting oc-
currences and nonoccurrences only for those items
that are scored present (i.e., rated 1, 2, or 3) by at
least one observer for at least one case. This will
avoid inflating IOA by including many
nonoccurrence agreements.
Figure 4-4 provides a worksheet that you can
copy and use to compute IOA for the DOF prob-
lem items rated present by two paired observers
for five practice cases. Each observer should com-
plete a separate DOF for the same 10-minute ob-
servation of each practice case. In the first column
of the worksheet, list the DOF problem items rated
1, 2, or 3 by at least one observer for each set of 2
DOFs per practice case. Then following the instruc-
tions at the top of the worksheet, use the letters O,
N, or D to note occurrences, nonoccurrences, and
disagreements between the two observers for each
item listed:
Record "O" when the two observers agreed on
the occurrence of an item for each case (i.e.,
both observers rated the item 1, 2, or 3); the
observers do not have to agree on their numerical
rating.
Record "N" when the two observers agreed on
the nonoccurrence of an item for each case (i.e.,
both observers rated the item 0).
Record "D" when the two observers disagreed
on the occurrence or nonoccurrence of an item
for each case (i.e., one observer rated the item
0, and the other observer rated the item 1, 2, or
3).
To compute IOA for the DOF problem items,
follow the steps shown in Figure 4-5. In the last
three columns of the worksheet, enter the sum of
the O, N, D across all cases for each of the items
listed in the first column. At the bottom of the last
three columns, enter the totals for O, N, and D.
Figure 4-6 shows an example of a completed
worksheet for IOA for the DOF problem items for
the same five practice cases observed by the same
pair of observers in Figure 4-3 (Nancy Jones and
Valerie Stone). The entries in the columns under
each practice case show the Occurrence Agree-
ments (O), Nonoccurrence Agreements (N), and
Disagreements (D) on each item for that case. The
last three columns in the worksheet show the sums
of O, N, and D for each item across all five cases.
Totals for O, N, and D across items and cases are
entered at the bottom of each of the three columns:
O = 29, N = 15, and D = 6. The bottom of the
worksheet shows the computations of IOA using
these data. IOA_O for Occurrence Agreements was
82.9%; IOA_N for Nonoccurrence Agreements was
71.5%; and Mean IOA was 77.2%. The IOA_O was
within generally accepted criteria for agreement,
but IOA_N and Mean IOA were slightly below
acceptable levels. This suggests that additional
training is necessary to improve agreement between
the two observers, especially for nonoccurrences. For
example, the trainer could meet with the two
observers to review the Guidelines for Rating DOF
Problem Items in Chapter 2 and discuss disagreements
on specific items, such as items 7, 33, 100,
and 101.
Figure 4-4. Worksheet for computing IOA for problem items.
Figure 4-5. Steps for computing IOA for problem items.
Figure 4-6. Example of IOA for problem items.
We recommend computing IOA on paired
observations of practice cases as a way to assess the
adequacy of training for DOF observers. Some
trainers and researchers may also want to periodically
check IOA with additional paired observations
to determine whether observers continue to
follow the rating guidelines and thereby guard
against observer drift.
ASSESSING INTER-RATER
RELIABILITY
Chapter 7 presents our research on the reliability
of the DOF. These data are useful for evaluating
the psychometric properties of the DOF. At the
same time, supervisors and researchers may also
want to assess the inter-rater reliability of their DOF
observers, which requires paired observations of
multiple cases after initial training on practice
cases. To do this, we recommend that paired
observations be obtained on at least 15 cases, and
preferably 20 or more, if time allows. (These cases
can be randomly selected anonymous children in
classrooms or other group settings or participants
in research projects.)
Each observer should complete one DOF for
each case. Or observers can each complete two
DOFs per case based on two separate 10-minute
observations. If two DOFs are completed for each
case, then On-task and DOF problem item ratings
should be averaged across the two observation
sessions. The latter approach would require twice as
much time, but would have the advantage of
obtaining more stable measures of On-task and
problem behavior for each case.
To assess the consistency of DOF ratings across
cases, you would first obtain total On-task scores
and raw scores for each of the relevant DOF problem
scales (for classroom observations: five syndromes,
DSM-oriented Attention Deficit/Hyperactivity
Problems scale and Inattention and
Hyperactivity-Impulsivity subscales, and Total
Problems-Classroom; for recess observations: Aggressive
Behavior syndrome, and Total Problems-Recess).
Pearson correlations (r) can then be computed for
On-task scores and the DOF problem scale scores
across all non-practice cases. That is, instead of
examining IOA for occurrences or nonoccurrences
for On-task and problem item ratings, Pearson r
assesses the consistency of paired quantitative scale
scores obtained by two observers for the same set
of multiple cases.
Pearson r ranges from -1.00 to +1.00. A correlation
of .00 means that the two observers' DOF
scores are unrelated. That is, the scores do not go
up or down together at all. A high positive correlation
means that the observers' scores are consistent
with each other. That is, the two observers
tend to score the same cases high and the same
cases low. A high negative correlation means that
the two observers tend to score cases in the opposite
direction. That is, if one observer scores certain
cases high, the other observer scores the same
cases low. For assessing inter-rater reliability of
assessment instruments and behavioral observations,
Pearson rs in the .80s and .90s are generally
considered high, rs from .60 to .79 are considered
moderate, and rs below .60 are considered low
(Hintze, 2005; Sattler, 2008).
For rating scales like the DOF, it is important
to consider the nature of the phenomenon being
measured. Certain types of behaviors may produce
higher reliability than other types of behavior.
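A Pearson correlation between two observers' paired scale scores can be computed with a short Python sketch such as the following. It is offered only as an illustration: the function names pearson_r and describe_reliability are hypothetical, the scores are invented, and the cutoffs of .80 and .60 are one way of applying the high, moderate, and low ranges cited from Hintze (2005) and Sattler (2008).

```python
from math import sqrt


def pearson_r(scores_a, scores_b):
    """Pearson correlation between paired scores from two observers."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("Need at least two paired scores")
    mean_a = sum(scores_a) / len(scores_a)
    mean_b = sum(scores_b) / len(scores_b)
    # Sums of cross-products and squared deviations from each mean.
    sxy = sum((a - mean_a) * (b - mean_b) for a, b in zip(scores_a, scores_b))
    sxx = sum((a - mean_a) ** 2 for a in scores_a)
    syy = sum((b - mean_b) ** 2 for b in scores_b)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one observer's scores do not vary")
    return sxy / sqrt(sxx * syy)


def describe_reliability(r):
    """Label r as high, moderate, or low (assumed cutoffs of .80 and .60)."""
    if r >= 0.80:
        return "high"
    if r >= 0.60:
        return "moderate"
    return "low"


# Invented Total Problems scores from two observers for six cases.
observer_1 = [4, 10, 7, 15, 2, 9]
observer_2 = [5, 11, 6, 13, 3, 10]
r = pearson_r(observer_1, observer_2)
print(round(r, 2), describe_reliability(r))
```

In practice, at least 15 paired cases would be entered, and the correlation would be computed separately for On-task and for each relevant DOF problem scale.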
Chapter 5
Practical Applications and Case Examples
This chapter discusses practical applications of
the DOF, in conjunction with other ASEBA forms.
In dealing with particular cases, skilled practitioners
apply their knowledge and procedures derived
from other cases to obtain a clear picture of a
particular case. The ASEBA forms are designed to help
practitioners obtain a well-differentiated picture of
each case and to relate the findings to other cases.
Responsible practice requires practitioners to
continually test their judgments against various kinds
of evidence. The ASEBA scoring profiles facilitate
this process by enabling practitioners to compare
data for a particular child with data obtained
from normative samples of children of the same
gender and age range. The similar structure of the
various ASEBA forms and scoring profiles also
makes it easy to compare data from multiple
perspectives.
The DOF provides a systematic way for observers
to record and rate observations of children in
school classrooms, at recess, and in other group
settings. The DOF scoring profile provides a
quantitative picture of observations, which practitioners
can examine to determine whether a child manifested
more problems in that setting than other
children of the same gender and age range. Because
the DOF Profile is similar to other ASEBA
profiles, practitioners can easily compare DOF
scores with scores obtained from parents and teachers,
as well as scores based on observations of children
in test sessions and clinical interviews.
In this chapter, we discuss practical applications
of the DOF for use in school settings and mental
health services. We also discuss use of the DOF
for assessments of children with ADHD, emotional
and behavioral disorders, and learning problems.
Case examples illustrate how DOF data can be
integrated with other ASEBA data to evaluate a
child's functioning and to plan interventions.
SEQUENCE FOR USING THE DOF AND
OTHER ASEBA FORMS
When adults seek services, they usually express
their reasons for wanting help. Children seldom
refer themselves for help. Instead they are referred
by concerned adults, such as parents, teachers,
guidance counselors, school psychologists, and
pediatricians. It is therefore important to obtain
information from other sources, as well as from
direct assessment of the child. Figure 5-1 illustrates
a typical sequence for using the DOF along with
other ASEBA forms for referral and intake,
gathering data, and interpreting data, as well as
managing cases and evaluating outcomes. The next
sections discuss each of these components of the
sequence.
Referral & Intake
Once a referral has been initiated, parents and
teachers can be asked to complete ASEBA forms
as part of initial data gathering at intake. For
example, parents or guardians can complete the
CBCL/6-18, and with parental consent, one or more
teachers can complete the TRF (for descriptions
of the ASEBA school-age forms, see Achenbach
& Rescorla, 2001). Whenever possible, relevant
medical, educational, and background information
should also be obtained during initial data
gathering. Scoring the ASEBA forms prior to observing
and testing the child and before interviewing
parents can help to identify areas of possible deviance
that can be explored further in subsequent data
gathering.
Direct Data Gathering
Children can be directly assessed via direct ob-
servations, tests, and/or interviews, as appropriate.
Many experts in child assessment emphasize the
importance of directly observing children's behavior
in natural settings as part of the assessment process
(Barkley, 2006; Sattler & Hoge, 2006; Shapiro
& Kratochwill, 2000; Volpe & McConaughy,
2005). The DOF provides a standardized format
for doing this. School-based practitioners, such as
school psychologists, special educators, and guid-
ance counselors, can use the DOF to observe chil-
dren in their classrooms, at recess, or in other group
settings. Teacher aides and other school staff can
also be trained as independent DOF observers (see
Chapter 4). In addition, school-based practitioners
may use the DOF as part of a functional behav-
ioral assessment for problem-solving consultations
with teachers, as discussed in a later section.
Observing children in natural settings can be a
challenge for practitioners who are not based in
school settings. However, most child assessments
require information from school personnel. If a
mental health practitioner has an on-going relation-
ship with particular schools, he/she might train cer-
tain school personnel, such as teacher aides, to use
the DOF. Or the practitioner can collaborate with
the school psychologist or a special educator to
have DOFs completed. The practitioner can also
ask school staff to provide DOF observations along
with other referral data.
Many mental health and school-based evalua-
tions include interviewing the child and testing the
child's ability and/or academic achievement. The
SCICA (McConaughy & Achenbach, 2001) provides
a standardized protocol for interviewing children
and has rating forms for scoring interviewers'
observations and children's self-reports during
the interview. When children are administered
ability and/or achievement tests, the TOF
(McConaughy & Achenbach, 2004) can be used
to rate observations of children's behavior during
testing. To obtain unbiased observations of a child's
behavior, practitioners may want to use the DOF
to observe the child in the classroom or at recess
before interviewing or testing the child. Or they
can ask a trained independent observer to make
observations with the DOF.
Parents, teachers, and other relevant school staff
should be interviewed to obtain information that
is not accessible via rating scales and question-
naires. McConaughy (2005) discusses interview-
ing procedures in detail and provides reproducible
protocols for parent and teacher interviews.
In many mental health settings, it is customary
to make psychiatric diagnoses. The term "diagnosis"
has a variety of meanings. In its narrow sense,
"diagnosis is the medical term for classification"
(Guze, 1978, p. 53). With respect to children's
behavioral and emotional problems, diagnosis in this
narrow sense refers to matching a child's problems
to diagnostic categories. The scoring profiles for
the DOF and other ASEBA forms provide DSM-oriented
scales that include problem items consistent
with diagnostic categories of the American
Psychiatric Association's DSM-IV and DSM-IV-TR.
High scores on the DSM-oriented scales can
alert practitioners to possible DSM diagnoses.
Structured diagnostic interviews with parents can
also provide diagnostic information (see
McConaughy, 2005).
Data Interpretation
The DOF Profile is modeled on similar profiles
for other ASEBA school-age forms, as described
in their respective manuals (Achenbach &
Rescorla, 2001; McConaughy & Achenbach, 2001,
2004). Chapter 3 provides instructions for com-
puter-scoring the DOF Profile. For classroom ob-
servations, the DOF Profile displays raw scores, T
scores, and percentiles for five DOF syndrome
scales, a DSM-oriented Attention Deficit/Hyper-
activity Problems scale and Inattention and Hyper-
activity/Impulsivity subscales, Total Problems, and
On-task. For recess observations, the DOF Profile
displays raw scores, T scores, and percentiles for
an Aggressive Behavior syndrome scale and Total
Problems.
To interpret the DOF and other ASEBA pro-
files, practitioners should examine scale scores to
identify a child's strengths and problems according
to each data source. The ASEBA profiles provide
visual displays of scores to aid interpretation.
As discussed in Chapter 3, borderline and clinical
ranges indicate whether a child's scores on the DOF
scales are deviant relative to normative samples of
6-11-year-old boys and girls (see Table 3-1 in
Chapter 3).
Choosing Cutpoints. Practitioners can decide
whether to use the borderline clinical or clinical
cutpoints on the ASEBA scales to classify children
as "deviant" versus "nondeviant." Using cutpoints
at the bottom of the borderline range increases
sensitivity (i.e., the number of children needing help,
or "true positives," classified as deviant), while
reducing the number of "false negatives" (the number
of children needing help classified as
nondeviant). Using cutpoints at the bottom of the
clinical range, by contrast, increases specificity (i.e.,
the number of children not needing help, or "true
negatives," classified as nondeviant), while reducing
the number of "false positives" (the number of
children not needing help classified as deviant).
The borderline and clinical cutpoints provide more
flexibility than does a single cutpoint. For example,
when administering rating scales for screening pur-
poses, practitioners may want to maximize sensi-
tivity by using borderline cutpoints that will iden-
tify more children at risk for problems. These chil-
dren can then be referred for further in-depth as-
sessment. When assessing eligibility for special-
ized treatment programs or special education ser-
vices, practitioners may want to maximize speci-
ficity. They can then use the clinical cutpoints to
identify only the most clearly deviant children as
eligible for specialized treatment programs or spe-
cial educational placements.
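The trade-off between the two cutpoints can be illustrated with a small calculation. The Python sketch below is illustrative only: the T scores and the "needs help" labels are invented, and the cutpoint values (T of 65 for borderline, T of 70 for clinical) are assumptions made for the example rather than values taken from the DOF norms discussed in Chapter 3. Sensitivity and specificity are computed here as proportions.

```python
def classify(t_scores, cutpoint):
    """Return True (deviant) for each T score at or above the cutpoint."""
    return [t >= cutpoint for t in t_scores]


def sensitivity_specificity(needs_help, deviant):
    """Compute (sensitivity, specificity) from paired boolean lists."""
    pairs = list(zip(needs_help, deviant))
    tp = sum(n and d for n, d in pairs)          # needing help, classified deviant
    fn = sum(n and not d for n, d in pairs)      # needing help, classified nondeviant
    tn = sum(not n and not d for n, d in pairs)  # not needing help, nondeviant
    fp = sum(not n and d for n, d in pairs)      # not needing help, deviant
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: T scores and whether each child truly needs help.
T_SCORES = [72, 68, 66, 71, 58, 66, 61, 55, 69, 50]
NEEDS_HELP = [True, True, True, True, False, False, False, False, True, False]

BORDERLINE_CUTPOINT = 65  # assumed value, for illustration only
CLINICAL_CUTPOINT = 70    # assumed value, for illustration only

borderline = sensitivity_specificity(NEEDS_HELP, classify(T_SCORES, BORDERLINE_CUTPOINT))
clinical = sensitivity_specificity(NEEDS_HELP, classify(T_SCORES, CLINICAL_CUTPOINT))

print("borderline cutpoint: sensitivity=%.2f specificity=%.2f" % borderline)
print("clinical cutpoint:   sensitivity=%.2f specificity=%.2f" % clinical)
```

With these invented data, the lower (borderline) cutpoint identifies every child who needs help but also flags one child who does not, whereas the higher (clinical) cutpoint flags no child unnecessarily but misses three of the five children who need help.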
Integrating Multisource Data. After interpret-
ing data from each source, practitioners must inte-
grate data across sources to identify consistencies
and inconsistencies in problem patterns across dif-
ferent situations and relationships. They can then
use such data to form hypotheses about consistent
versus inconsistent patterns of problems across
settings.
Because the DOF is part of the ASEBA, practi-
tioners can easily compare DOF item scores, scale
scores, and profile patterns with data from other
ASEBA forms relevant for 6-11-year-old children.
Comparing DOF scores with TOF scores may be
especially informative, since some children behave
differently in a structured test session versus the
more natural setting of the classroom or play-
ground. After interpreting and integrating data,
practitioners can consult with parents, teachers, and
other relevant persons to decide whether interven-
tions are warranted and, if so, what sorts of inter-
ventions would be appropriate and feasible.
Case Management & Outcome
Evaluation
Several different professionals may be involved
in designing and implementing interventions for
children. For example, in school settings, a school
psychologist may consult with a classroom teacher
to develop interventions for specific problems in
the classroom. Or a school multidisciplinary team
(MDT) may consult with teachers and parents to
develop an Individualized Education Program
(IEP) for a child who is eligible for special educa-
tion services. A mental health practitioner may of-
fer recommendations for interventions to parents,
teachers, or a treatment team. Or the same mental
health practitioner may first evaluate the child and
then provide treatment, such as individual therapy.
During the course of treatment, the child should
be reassessed to monitor progress and evaluate
changes in behavior.
Over short intervals (e.g., every week or 2
weeks), the DOF can be used to monitor progress
toward behavioral goals. Over longer intervals
(e.g., 2, 6, or 12 months), ASEBA forms, such as
the CBCL/6-18 and TRF, as well as the DOF, can
be used to monitor progress and evaluate outcomes.
If outcome evaluations include standardized test-
ing, practitioners can also complete the TOF. Time
1 and Time 2 scores on ASEBA scales can be com-
pared to help practitioners decide whether to
modify or terminate interventions.
SCHOOL-BASED ASSESSMENTS
Schools are especially important settings for
evaluating children's cognitive, behavioral, and
emotional functioning. Some children show more
problems in the school setting than in other con-
texts, such as home. For example, some children
with problems, such as ADHD, show co-occurring
behavioral, emotional, and learning problems at
school, but may show fewer problems at home.
Other children may show more problems or dif-
ferent patterns of problems at home or in other
contexts than school. The DOF can be especially
useful for documenting systematic observations of
children's behavior in school and other group
settings.
Three-tiered Model and Response-to-Interven-
tion. The DOF can be especially useful for behav-
ioral assessment in three-tiered and Response-to-
Intervention (RTI) models for services in schools.
The three-tiered model moves from universal con-
ditions for all children (Tier 1), to targeted inter-
ventions of varying degrees of intensity for indi-
vidual children or groups of children (Tier 2), to
very intensive interventions for individual children
(Tier 3), as discussed by McConaughy and Ritter
(2008) and Tilly (2008). As a first step after a child
is referred, a school practitioner, such as a school
psychologist, would interview the referring teacher
to learn the teacher's specific concerns about the
child's learning and behavior. During this interview,
the school psychologist would also learn what universal
conditions (Tier 1) were already in place to
address behavioral and academic problems in the
child's classroom. (For example, are there clear
classroom rules and expectations for the behavior
of all children?)
If appropriate Tier 1 conditions are in place but
have proven ineffective with the referred child, then
the school psychologist would gather more specific
data on the child's problems to develop appropriate
Tier 2 interventions. The DOF provides
a standardized format for obtaining direct observations
of a child's behavior for Tier 2 assessment.
For example, the school psychologist could
use the DOF to obtain baseline observations of the
child's classroom behavior across several occasions.
By examining the DOF profile, the school
psychologist can learn which types of behavioral
problems are most severe in comparison to norms
for the child's peers. From DOF scales on which a
child obtains scores in the clinical or borderline
ranges, the school psychologist can choose specific
problem behaviors (e.g., doesn't sit still, restless;
doesn't concentrate; disturbs other children)
to target for additional direct observations for a
functional behavioral assessment of the target
behaviors.
The functional behavioral assessment would
require gathering more baseline data and identify-
ing antecedents and consequences of the specific
target behaviors. Parents and teachers can also be
interviewed and asked to complete the CBCL/6-18
and TRF to provide baseline data on a child's
problem behaviors. The school psychologist and
teacher can use the baseline data to develop hypotheses
about the functions of the child's problem
behaviors (e.g., to gain attention or avoid aversive
tasks). They can then design interventions to
reduce selected target behaviors. For example, the
teacher might implement classroom accommoda-
tions and a positive behavioral support system
to reduce disruptive behavior and improve the
child's academic productivity. After interventions
are in place, the school psychologist (or other appropriate
school staff) would routinely monitor the
child's progress toward specific goals over short
intervals to determine whether the interventions are
producing desired results.
When Tier 1 and Tier 2 interventions are effec-
tively planned, delivered, and assessed for out-
come, and still prove to be ineffective for a child,
then a move to Tier 3 assessment is warranted. Tier
3 involves further assessment of the child and the
context of the problem behaviors, as well as the
reasons that previous interventions were not effec-
tive. Tier 3 also involves more intensive interven-
tion, coupled with more frequent monitoring of a
child's progress. For children with behavioral and
emotional problems, intensive interventions might
take the form of special education services or other
special programs in the local school, regional pro-
grams at the district level, referral for mental health
services, or placement in an intensive hospital-
based or residential program.
Eligibility for Special Education. Eligibility for
special education services must be determined ac-
cording to the rules and regulations of the Indi-
viduals with Disabilities Education Improvement
Act of 2004 (IDEA 2004; Public Law 108-446,
2004) and its subsequent reauthorizations. IDEA
2004 requires comprehensive assessment of the
nature, duration, severity, and patterning of a child's
problems, as well as assessment of environmental
circumstances and other factors that may precipi-
tate or maintain the problems. Regulations for com-
prehensive evaluations are stipulated by IDEA
2004, although each state has its own standards
and regulations for interpreting the federal law.
Special education evaluations typically include
collecting data from parents and teachers, along
with cognitive and achievement testing of the child.
The evaluation information is used to determine
whether the child meets criteria for one or more
disabilities defined by IDEA 2004, including spe-
cific learning disability; emotional disturbance;
speech or language impairment; autism (which can
include the spectrum of pervasive developmental
disorders); other chronic health impairment (which
can include ADHD), as well as other disabilities
involving sensory and orthopedic impairments and
traumatic brain injury. Children who qualify for
special education services under any of the above
categories may also exhibit behavioral and emo-
tional problems that can be assessed with the
ASEBA forms.
Section 504 Accommodations. In addition to
IDEA 2004, children with disabilities are protected
under Section 504 of the Rehabilitation Act of 1973
(Rehabilitation Act, 1973) and the Americans with
Disabilities Act of 1990 (ADA), which are civil
rights statutes. Section 504 and the ADA cover all
the disabilities defined by the IDEA, as well as
other disabilities that affect children's functioning
in essential life areas. Children who are not eli-
gible for special education services may still qualify
for accommodations in the general education set-
ting under Section 504 and the ADA. Children with
ADHD, in particular, often qualify for Section 504
plans when they do not meet criteria for special
education.
ASSESSMENT OF ADHD
The DOF and other ASEBA forms are especially
useful for assessing ADHD in school settings and
mental health services. As a routine part of ADHD
assessments, parents of school-age children can
complete the CBCL/6-18 and teachers can com-
plete the TRF, along with other appropriate rating
scales and questionnaires. Interviews should also
be conducted with parents and teachers to assess
symptoms and functional impairment at home and
at school.
Direct observations of a child's behavior are
often recommended for evaluating ADHD, along
with parent and teacher reports (Barkley, 2006;
DuPaul & Stoner, 2003). Direct observations by
independent observers can be especially important
as external validators of symptoms reported by
parents and/or teachers. Particularly when there is
disagreement between parents and teachers regard-
ing symptom criteria, direct observations can add
essential information for or against a diagnosis of
ADHD. School psychologists and other school staff
can use the DOF to record and score observations
of a child in the classroom on several occasions.
Cognitive and achievement testing may also be
done to determine whether the child has cognitive
and/or academic deficits that interfere with school
functioning. Test examiners can then complete the
TOF to assess test session behavior.
Like other ASEBA forms, the DOF and TOF
Profiles include an Attention Problems syndrome
scale as well as a DSM-oriented Attention Deficit/
Hyperactivity Problems scale with Inattention and
Hyperactivity-Impulsivity subscales. If the various
ASEBA profiles consistently yield high scores on
the Attention Problems syndrome and/or on the
DSM-oriented Attention Deficit/Hyperactivity
Problems scale, this would provide quantitative
evidence to support an ADHD diagnosis. If different
informants' ratings yield very different scores
on Attention Problems and/or the Attention Deficit/Hyperactivity
Problems scale, then practitioners
need to consider how environmental settings and
relationships may differ across informants. Infor-
mation from the ASEBA forms should also be in-
tegrated with data gathered from parent and teacher
interviews and other assessment sources to formu-
late an ADHD diagnosis. Once an ADHD diagno-
sis is confirmed, the school MDT can use the as-
sessment data to determine whether the child quali-
fies for special education services or a Section 504
plan. Medication and behavioral therapy may also
be warranted.
ASSESSMENT OF EMOTIONAL
DISTURBANCE
The IDEA 2004 definition of emotional distur-
bance is as follows:
(c) (4) (i) Emotional disturbance means a con-
dition exhibiting one or more of the follow-
ing characteristics over a long period of time
and to a marked degree that adversely affects
a child's educational performance:
(A) An inability to learn that cannot be ex-
plained by intellectual, sensory, or other
health factors;
(B) An inability to build or maintain satisfac-
tory interpersonal relationships with peers and
teachers;
(C) Inappropriate types of behavior or feel-
ings under normal circumstances;
(D) A general pervasive mood of unhappi-
ness or depression;
(E) A tendency to develop physical symptoms
or fears associated with personal or school
problems;
(ii) Emotional disturbance includes schizo-
phrenia. The term does not apply to children
who are socially maladjusted unless it is de-
termined that they have an emotional distur-
bance under paragraph (c) (4) (i) of this sec-
tion. (20 U.S.C. 1401 (3); 34 C.F.R 300.8 (c)
(4) (i)).
The above federal definition of emotional dis-
turbance includes five general characteristics, A
through E, that describe behavioral and emotional
problems. State regulations vary in their interpre-
tations of the five characteristics. Some states in-
clude externalizing problems (e.g., aggressive be-
havior, conduct disorder), along with internalizing
problems, while other states try to exclude exter-
nalizing problems. However, research has shown
that externalizing and internalizing problems of-
ten co-occur (McConaughy & Skiba, 1993).
All three qualifying conditions listed in para-
graph (c) (4) (i) must apply to at least one of the
identified five characteristics (A to E) of emo-
tional disturbance. That is, the characteristic(s)
must exist over a long period of time, to a marked
degree, and adversely affect educational perfor-
mance. A child who exhibits at least 1 of the 5 char-
acteristics, or has a diagnosis of schizophrenia, and
meets all three qualifying conditions, is judged to
have emotional disturbance. A child who does not
meet criteria for emotional disturbance is deemed
to be ineligible for special education on the basis
of that category.
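The decision rule described above can be written out as a short logical check. The Python sketch below only illustrates the structure of that rule. The class name, function name, and record layout are hypothetical, and each True or False entry stands for a judgment that an evaluation team would make under the applicable state regulations.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Team judgment for one characteristic (hypothetical record layout)."""
    present: bool
    long_period: bool = False        # exists over a long period of time
    marked_degree: bool = False      # exists to a marked degree
    affects_education: bool = False  # adversely affects educational performance

    def qualifies(self):
        # All three qualifying conditions must apply to the same characteristic.
        return (self.present and self.long_period
                and self.marked_degree and self.affects_education)


def meets_ed_definition(findings):
    """True if at least one characteristic (A to E) or a schizophrenia
    diagnosis meets all three qualifying conditions."""
    return any(f.qualifies() for f in findings.values())


# Hypothetical child 1: characteristic D meets all three conditions.
child_1 = {
    "B": Finding(present=True, long_period=True, marked_degree=True),
    "D": Finding(present=True, long_period=True, marked_degree=True,
                 affects_education=True),
}
# Hypothetical child 2: characteristic C is present, but not to a marked degree.
child_2 = {"C": Finding(present=True, long_period=True, affects_education=True)}

print(meets_ed_definition(child_1))
print(meets_ed_definition(child_2))
```

In this sketch, the first child meets the definition because one characteristic satisfies all three qualifying conditions, whereas the second child does not because no single characteristic satisfies all three.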
To facilitate special education evaluations, Table
5-1 outlines relations between the IDEA 2004 cri-
teria for emotional disturbance and the empirically
based syndromes of the DOF, along with the
CBCL/6-18, TRF, YSR, TOF, and SCICA. The
table lists the syndrome and DSM-oriented scales
of each instrument next to the characteristic(s) that
they most clearly reflect. The table also shows how
scores and scales of the various instruments can
provide evidence that characteristics have existed
for a long period of time, to a marked degree, and
adversely affect educational performance (for fur-
ther discussion of ASEBA applications to criteria
for emotional disturbance, see McConaughy &
Achenbach, 2001; McConaughy & Achenbach,
2004; McConaughy & Ritter, 2008). If a child
shows deviance on scales relevant to the criteria
a Attention Problems, DSM-oriented Attention Deficit/Hyperactivity Problems, Immature/Withdrawn, Intrusive, Oppositional, and Sluggish Cognitive Tempo are scored from classroom observations; Aggressive Behavior is scored from recess observations.
b Attention Problems, Language/Motor Problems, Withdrawn/Depressed, Self-Control Problems, and Anxious are scored from interviewers' observations; Aggressive/Rule-Breaking, Anxious/Depressed, and Somatic Complaints are scored from children's self-reports during the SCICA. The DSM-oriented scales on the SCICA are scored from interviewers' observations and children's self-reports.
Practitioners can examine the DOF results, along
with cognitive and achievement test data, to plan
appropriate interventions and classroom accommo-
dations to address co-occurring behavioral and
emotional problems along with academic deficits
of children with learning disabilities.
CASE EXAMPLE OF ASSESSMENT
OF ADHD:
Melinda Brandt, Age 8
Melinda Brandt is the 8-year-old girl whose
computer-scored DOF Profile was shown in Chap-
ter 3. Melinda was the younger of two children in
a middle class family that included her mother,
father, and older brother. Her mother brought her
to a mental health outpatient clinic for a psycho-
logical evaluation because she was concerned about
Melinda's problems paying attention and her
struggles with schoolwork. Melinda's teacher had
sent home several notes complaining about
Melinda's behavior in school and her failure to
complete work on time. Melinda's mother was especially
worried that Melinda might be retained in
third grade, which she felt would be a great blow
to her self-esteem.
Melinda was evaluated by a child psychologist
in the mental health clinic who also provided contracted
consultation services in Melinda's school
district. Melinda's evaluation followed the sequence
illustrated in Figure 5-1 and included the
five assessment axes outlined earlier in Table 1-1
in Chapter 1. Prior to Melinda's appointment at
the clinic, Ms. Brandt completed the CBCL/6-18
and a questionnaire about Melinda's developmental
and medical history. With Ms. Brandt's permission,
Melinda's third grade teacher completed the
TRF and provided copies of Melinda's school
records. Ms. Brandt also gave permission for the
clinic psychologist to interview Melinda's teacher
and to obtain observations of Melinda's behavior
in the classroom.
As part of her consultation services to the school
district, the psychologist had trained teacher aides
in procedures for using the DOF. One week before
Melinda's evaluation at the clinic, a teacher aide
(Valerie Stone) used the DOF to obtain four
10-minute observations of Melinda in her classroom.
Ms. Stone also made two 10-minute observations
of each of two of Melinda's classmates on the same
days that she observed Melinda. After she com-
pleted all the observations, Ms. Stone mailed the 8
DOFs to the clinic psychologist for computer-scor-
ing.
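The count of eight forms follows directly from the observation schedule, as the brief Python check below shows. The variable names are illustrative only.

```python
OBSERVATION_MINUTES = 10

# Number of 10-minute DOF observations per child, as described above.
observations = {
    "Melinda": 4,
    "Classmate 1": 2,
    "Classmate 2": 2,
}

total_forms = sum(observations.values())
total_minutes = total_forms * OBSERVATION_MINUTES
print(total_forms, "DOFs covering", total_minutes, "minutes of observation")
```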
For Melinda's evaluation at the clinic, the psychologist
administered the Wechsler Intelligence
Scale for Children-Fourth Edition (WISC-IV;
Wechsler, 2003) and the Wechsler Individual
Achievement Test-Second Edition (WIAT-II;
Wechsler, 2002) to assess her cognitive and academic
functioning. She also administered a computerized
continuous performance test (CPT) to
assess Melinda's impulsivity and ability to sustain
attention. After each test, the psychologist completed
the TOF to provide a standardized assessment
of Melinda's test session behavior. The psychologist
also interviewed Ms. Brandt about
Melinda's developmental and educational history
and her behavior at home, and interviewed
Melinda's teacher on the phone.
Parent and Teacher Reports. The CBCL/6-18
completed by Ms. Brandt produced scores in the
borderline clinical range for Externalizing (84th
percentile) and Attention Problems (95th percentile),
but normal range scores for all other scales.
Melinda's scores on the CBCL/6-18 competence
scales were also in the normal range, although her
mother expressed worries about her school perfor-
mance. In a structured diagnostic interview, Ms.
Brandt endorsed 6 of 9 DSM-IV-TR ADHD symp-
toms of inattention, with onset before age 7, but
no symptoms of hyperactivity-impulsivity. Al-
though Ms. Brandt acknowledged that Melinda
sometimes seemed restless (e.g., had trouble sit-
ting still at dinner and in church) and she did not
always think things through, Ms. Brandt did
not think that Melinda was unusually hyperac-
tive compared to other children in the family.
Ms. Brandt's main concern was that Melinda's
attention problems were interfering with her abil-
ity to do schoolwork and that she was falling be-
hind in class. Ms. Brandt said that Melinda needed
constant reminders to do her homework and some-
times failed to hand in work even when she did
complete it. Melinda's struggles with schoolwork
often led to arguments and temper tantrums at
home. Ms. Brandt had become especially alarmed
when Melinda's teacher suggested that Melinda
might not be ready to move on to fourth grade at
the end of the year.
The TRF completed by Melinda's third grade
teacher produced scores in the clinical range for
Externalizing and Total Problems (above 90th percentile),
plus clinical range scores on the TRF Attention
Problems syndrome scale, the DSM-oriented
Attention Deficit/Hyperactivity Problems
scale, and the Inattention and Hyperactivity-Impulsivity
subscales (all above 97th percentile).
Melinda's scores were in the borderline clinical
range on the TRF Social Problems, Thought Prob-
lems, Rule-Breaking and Aggressive Behavior syn-
drome scales, as well as the DSM-oriented Oppo-
sitional Defiant and Conduct Problems scales.
The teacher's ratings of Melinda's adaptive
functioning yielded a score in the clinical range
below the 10th percentile. The teacher rated Melinda
as behaving much less appropriately, learning much
less, and somewhat less happy than typical pupils.
The teacher also rated Melinda's academic performance
as far below grade level in mathematics,
written language, and social studies, somewhat
below grade level in reading, but at grade level in
art. The teacher noted that Melinda was a capable
and creative child, but she had great difficulty sit-
ting still, seemed to talk constantly, and frequently
disturbed other children. Her school work was of-
ten messy and incomplete. She failed to listen to
instructions and seemed unconcerned about the
quality of her work. The teacher felt it was very
challenging to have Melinda in her class. The
teacher had tried accommodations to address
Melinda's attention problems and disruptive behavior
(e.g., moving her to a quiet corner in the
class and providing stickers for completed work),
but nothing seemed to work. The teacher confirmed
that school staff were considering retention in third
grade due to Melinda's poor academic and social
functioning.
Classroom Observations with the DOF. Figure
2-2 in Chapter 2 showed the observer's notes
and on-task ratings on the DOF for the first
10-minute observation of Melinda. Figures 3-1 to 3-4
in Chapter 3 displayed Melinda's computer-scored
DOF Profile. Melinda scored in the clinical range
above the 97th percentile on the Attention Problems,
Intrusive, and Oppositional syndrome scales,
and in the borderline clinical range on the Slug-
gish Cognitive Tempo syndrome scale (see Figure
3-1). These high scores indicated that Melinda ex-
hibited many more attention problems and more
intrusive and oppositional behavior in the class-
room than was typical for the DOF normative
sample of 6-11-year-old girls. Melinda also scored
in the clinical range on the DSM-oriented Atten-
tion Deficit/Hyperactivity Problems scale and the
Hyperactivity/Impulsivity subscale, and in the bor-
derline range on the Inattention subscale (see Fig-
ure 3-3).
Test Scores and Observations with the TOF.
On the WISC-IV, Melinda obtained a full scale IQ
of 107, which was in the average range. She scored
in the average range for Verbal Comprehension
(VCI = 108), Perceptual Reasoning (PRI = 106),
and Working Memory (WMI = 107), but low aver-
age for Processing Speed (PSI = 88). On the WIAT-
II, Melinda scored in the average range for reading
and mathematics, but low average for written ex-
pression. On the math subtests, she scored much
lower for numerical operations than for math rea-
soning. Her scores on the CPT also suggested ten-
dencies toward impulsive responding and difficul-
ties sustaining attention.
The psychologist's ratings of Melinda's test session
behavior produced scores in the borderline
range on the TOF Attention Problems syndrome
and the DSM-oriented Attention Deficit/Hyperac-
tivity Problems scale, as well as borderline scores
on the Inattention and Hyperactivity/Impulsivity
subscales. Melinda also scored in the borderline
range on the TOF Oppositional syndrome during
achievement testing, but not during cognitive test-
ing.
Data Interpretation and Integration. Classroom
observations with the DOF were especially
useful in Melinda's case in light of discrepancies
between reports by her mother versus her teacher.
The DOF Profile showed that Melinda exhibited
many more attention problems than was typical for
the normative sample of 6-11-year-old girls. She
also exhibited many more attention problems than
two classmates selected as DOF controls. These
DOF findings corroborated reports of inattention
by Melinda's mother and her teacher. At the same
time, the DOF also showed high levels of hyperactivity
and impulsivity in the classroom, consistent
with reports by Melinda's teacher but not her
mother. The TOF Profile also indicated high lev-
els of inattention and hyperactivity/impulsivity dur-
ing cognitive and achievement testing, but at less
severe levels than observed in the classroom with
the DOF.
Taken together, the DOF and TOF provided im-
portant independent evidence of problems with in-
attention and hyperactivity/impulsivity, as reported
by Melinda's teacher on the TRF. The results from
the DOF, TOF, and TRF, in conjunction with de-
velopmental and educational history, supported a
DSM-IV-TR diagnosis of ADHD-Combined type.
However, if the evaluation had relied only on symptom
reports by Melinda's mother, without classroom
observations that corroborated reports by her
teacher, a diagnosis of ADHD-Combined type
would not have been appropriate.
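One simple way to lay out such cross-informant comparisons is sketched below in Python. The data structure and function names are hypothetical and are not part of the ASEBA scoring software. The ranges are those reported in this case example for two scales that appear on all four forms.

```python
RANGE_LEVEL = {"normal": 0, "borderline": 1, "clinical": 2}

# Ranges reported in the case example. The CBCL/6-18 entry for the
# DSM-oriented scale is "normal" because all CBCL/6-18 scales other than
# Externalizing and Attention Problems were in the normal range.
MELINDA = {
    "Attention Problems": {
        "CBCL/6-18": "borderline",
        "TRF": "clinical",
        "DOF": "clinical",
        "TOF": "borderline",
    },
    "Attention Deficit/Hyperactivity Problems": {
        "CBCL/6-18": "normal",
        "TRF": "clinical",
        "DOF": "clinical",
        "TOF": "borderline",
    },
}


def elevated_sources(ranges_by_form):
    """List the forms on which a scale is in the borderline or clinical range."""
    return [form for form, rng in ranges_by_form.items() if RANGE_LEVEL[rng] >= 1]


def informants_agree(ranges_by_form):
    """True if every form places the scale in the same range."""
    return len(set(ranges_by_form.values())) == 1


for scale, ranges_by_form in MELINDA.items():
    print(scale)
    print("  elevated on:", ", ".join(elevated_sources(ranges_by_form)))
    print("  same range on all forms:", informants_agree(ranges_by_form))
```

Laid out this way, Attention Problems is elevated on all four forms, whereas the DSM-oriented Attention Deficit/Hyperactivity Problems scale is elevated on the TRF, DOF, and TOF but not on the CBCL/6-18, which mirrors the discrepancy between the teacher's and the mother's reports.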
Melinda's average scores on the WISC-IV suggested
that her problems with schoolwork were not
due to low ability. However, low average WIAT-II
scores for math operations and written expression
indicated that Melinda was falling behind in these
basic academic skills. Interestingly, the TOF Pro-
file showed severe oppositional behavior during
achievement testing, but not during cognitive test-
ing. The DOF also revealed severe oppositional
and intrusive behavior in the classroom, consistent
with the TOF and the teacher's reports on the
TRF. Although Melinda's mother did not report
oppositional behavior on the CBCL/6-18, she did
say that attempts to help Melinda with her homework
often erupted into arguments and temper tantrums
at home. Taken together, these findings suggested
a strong association between Melinda's academic
skill deficits and her oppositional behavior
when confronted with academic tasks.
Case Management and Outcome Evaluation.
To address problems revealed in the evaluation,
the psychologist referred Melinda and her parents
to a child psychiatrist in the clinic for possible
medication for ADHD. The psychologist also consulted
with Melinda's teacher to develop accommodations
and behavioral interventions in the
classroom. They moved Melinda's seat near the
teacher's desk for closer monitoring of her work.
They also paired Melinda with a peer tutor to work
on math and writing assignments and created an
incentive plan to encourage on-task behavior and
academic productivity. The school staff incorpo-
rated the accommodations and behavioral interven-
tions into a Section 504 plan for Melinda as an
alternative to retention in third grade. As part of
the Section 504 plan, a teacher's aide continued to
conduct biweekly classroom observations of
Melinda with the DOF. Following an RTI model,
the school team examined DOF scores for On-task,
Attention Problems, and the Attention Deficit/Hy-
peractivity Problems scale to monitor Melindas
progress toward their behavioral goals. They also
used curriculum-based measures to monitor
Melindas academic progress in math and written
work.
CASE EXAMPLE OF A SCHOOL-BASED
ASSESSMENT OF BEHAVIOR PROBLEMS:
Ricky Johnson, Age 9
Ricky Johnson was the youngest child living
with his mother and two sisters in a low-income
inner city neighborhood. Ricky's fourth grade
teacher consulted the school MDT because Ricky had
been involved in several fights on the playground.
The teacher was not sure what started the fights or
who else was involved, but Ricky was usually the
one sent to the principal's office for in-school sus-
pensions. The teacher also reported that Ricky was
disruptive in class and seemed to have few friends.
The school psychologist (Harry Provo) contacted
Ricky's mother to express the team's concerns and
to obtain her permission for a behavioral assess-
ment. Ms. Johnson agreed to have the school psy-
chologist observe Ricky's behavior in school. She
also agreed to complete the CBCL/6-18 and to have
Ricky's teacher complete the TRF.
Recess and Classroom Observations with the
DOF. As an initial step in his evaluation, Mr. Provo
used the DOF to observe Ricky in his classroom
and during recess. On each of three days, Mr. Provo
conducted two observations of Ricky in the class-
room and two observations on the playground dur-
ing recess. Mr. Provo also observed two other boys
as control children in the same setting.
Figure 3-6 in Chapter 3 showed the DOF Pro-
file scored from Mr. Provo's six observations of
Ricky during recess. On the DOF Profile, Ricky
scored in the clinical range above the 97th percen-
tile on the Aggressive Behavior syndrome and in
the clinical range above the 90th percentile for To-
tal Problems. These results indicated that Ricky
exhibited many more problems at recess than was
typical for the DOF normative sample of 6-11-year-
old boys. On the Aggressive Behavior syndrome,
Mr. Provo rated six items as present for Ricky: 14.
Cruel, bullies, or mean to others; 30. Gets into
physical fights; 31. Gets teased; 47. Screams; 66.
Teases; and 86. Bossy. Ricky also exhibited six
other problems that contributed to his high DOF
Total Problems score: 3. Argues; 8. Difficulty wait-
ing turn in activities or tasks; 20. Disobedient; 22.
Doesn't seem to feel guilty after misbehaving; 67.
Temper tantrums, hot temper, or seems angry; and
83. Doesn't get along with peers. The two control
children, by contrast, showed little aggressive be-
havior. However, their borderline clinical score for
Total Problems indicated that they too showed other
problems on the playground, most notably scream-
ing and difficulty waiting their turn, similar to
Ricky.
The DOF Profile scored from Mr. Provo's ob-
servations of Ricky in the classroom (not shown)
produced a score at the 84th percentile for Total
Problems, which fell in the borderline range for 6-
11-year-old boys. Ricky scored in the borderline
range between the 93rd and 97th percentiles on the
DOF Intrusive and Oppositional syndrome scales,
but in the normal range on the Sluggish Cognitive
Tempo, Immature/Withdrawn, and Attention Prob-
lems syndrome scales and the DSM-oriented At-
tention Deficit/Hyperactivity Problems scale. On
the Intrusive syndrome, Mr. Provo rated five items
as present for Ricky: 8. Difficulty waiting turn in
activities or tasks; 21. Disturbs other children; 46.
Disrupts group activities; 55. Demands must be
met immediately; and 65. Talks too much. On the
Oppositional syndrome, Mr. Provo rated four items
as present: 16. Difficulty following directions; 23.
Doesn't seem to listen to what is being said; 52.
Shows off, clowns, or acts silly; and 83. Doesn't
get along with peers. Although the two control
children also showed difficulty waiting their turn
and talking too much, their scores on all DOF scales
were within the normal range.
Parent and Teacher Reports. The CBCL/6-18
completed by Ms. Johnson yielded a score in the
clinical range above the 90th percentile for Exter-
nalizing, along with a borderline score on the Rule-
Breaking Behavior syndrome (95th percentile).
Ricky scored just below the borderline range on
the Social Problems syndrome (90th percentile) and
Aggressive Behavior syndrome (92nd percentile),
but well within the normal range on all other syn-
drome scales. Ricky's total competence score on
the CBCL/6-18 was in the clinical range below the
10th percentile, with clinical range scores below the
3rd percentile on the Social and School scales.
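The percentile ranges cited for problem scales in this case example can be summarized in a short sketch. The cutpoints come from the ranges stated in the text: for syndrome scales, borderline between the 93rd and 97th percentiles and clinical above the 97th; for Total Problems, borderline at the 84th percentile and clinical above the 90th. The function name and the handling of scores that fall exactly on a cutpoint are assumptions made for illustration, and the sketch does not cover competence or adaptive scales, where low scores are the clinical ones.

```python
def classify_problem_scale(percentile: float, scale: str) -> str:
    """Return 'normal', 'borderline', or 'clinical' for a problem-scale percentile.

    scale is 'syndrome' or 'total' (Total Problems). Boundary handling is an
    assumption: a score exactly at the lower cutpoint counts as borderline,
    and only scores above the upper cutpoint count as clinical.
    """
    cutpoints = {"syndrome": (93, 97), "total": (84, 90)}
    if scale not in cutpoints:
        raise ValueError(f"unknown scale type: {scale!r}")
    borderline_from, clinical_above = cutpoints[scale]
    if percentile > clinical_above:
        return "clinical"
    if percentile >= borderline_from:
        return "borderline"
    return "normal"


# Percentiles reported for Ricky in this case example.
print(classify_problem_scale(84, "total"))     # classroom DOF Total Problems
print(classify_problem_scale(95, "syndrome"))  # CBCL/6-18 Rule-Breaking Behavior
print(classify_problem_scale(92, "syndrome"))  # CBCL/6-18 Aggressive Behavior
```

Keeping the two sets of cutpoints in one lookup table makes it harder to apply a syndrome cutpoint to a Total Problems score by mistake.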
In a phone interview with Mr. Provo, Ms.
Johnson said that she was worried that Ricky was
hanging out with older boys who had gotten into
trouble in the neighborhood. She said that the po-
lice had come to her house a month ago because
some of the boys were caught shoplifting at a local
grocery store. Ms. Johnson believed that Ricky was
innocent, but she worried that he might be led on
by the other boys. Because Ms. Johnson worked
two jobs to support her family, she was not able to
provide the supervision at home that she felt Ricky
needed. She also reported that Ricky often had
trouble with his schoolwork. She asked his older
sisters to help him with homework, but they were
often too busy with their own work or socializing
with friends.
The TRF completed by Ricky's fourth grade
teacher yielded clinical range scores above the 90th
percentile for Externalizing and Total Problems,
along with a clinical range score above the 97th
percentile on the Social Problems syndrome. Ricky
scored near the borderline range on the Rule-Break-
ing Behavior and Aggressive Behavior syndromes.
The teacher also reported several problems on the
Attention Problems syndrome (e.g., 4. Fails to fin-
ish things he/she starts; 22. Difficulty following
directions; 92. Underachieving, not working up to
potential), but Ricky's total score was in the nor-
mal range. Ricky's score on the TRF adaptive
functioning scale was in the clinical range below
the 10th percentile. The teacher rated Ricky as be-
having much less appropriately and learning much
less than typical pupils. She also rated Ricky's aca-
demic performance as far below grade level in read-
ing, mathematics, and written language, but at
grade level in social studies and science. Ricky's
teacher was especially concerned about his prob-
lems getting along with other children on the play-
ground and his disruptive behavior in class. She
was also concerned that in-school suspensions had
caused Ricky to miss instructional time to the point
that he was falling behind in his schoolwork. When
Ricky was in class, he often clowned around and
disrupted class activities. He also became easily
frustrated with his work, sometimes to the point of
ripping up his own papers.
Data Interpretation and Integration. Results
from the DOF, TRF, and CBCL/6-18 indicated that
Ricky was showing more externalizing problems
than most boys his age. All three ASEBA forms
produced scores above the 90th percentile on the
Aggressive Behavior syndrome. The CBCL/6-18
and TRF also produced scores above the 90th per-
centile on the Rule-Breaking Behavior syndrome.
Low scores on the CBCL/6-18 Social competence
scale and high scores on the TRF Social Problems
syndrome also indicated problems in social rela-
tionships.
Following the three-tiered model discussed ear-
lier, the school MDT decided to initiate Tier 2 be-
havioral interventions to address Ricky's problems
in school. As a first step, Mr. Provo examined the
profiles from the ASEBA forms to identify prob-
lems that were consistent across informants and
settings. He listed several problems from the DOF
recess observations that were similar to problems
reported on the CBCL/6-18 and TRF: 3. Argues;
20. Disobedient; 22. Doesn't seem to feel guilty
after misbehaving; 30. Gets into physical fights;
31. Gets teased; 66. Teases; 67. Temper tantrums,
hot temper, or seems angry; and 83. Doesn't get
along with peers.
Mr. Provo conducted a functional behavior as-
sessment to identify antecedents and consequences
of the problem behaviors, particularly fighting on
the playground. He learned that fights usually
erupted after other children teased Ricky and called
him names or after Ricky argued with them, teased
them, or became bossy about the rules of a game.
As the arguing and teasing escalated, Ricky would
lose his temper and start hitting and punching. The
other children also lost their tempers and began
hitting and punching, so that it was not always clear
who started the fights. On one occasion, the play-
ground supervisor broke up a fight and sent Ricky
to the principal's office. On another occasion, no
one intervened and the fight ended when the bell
rang for children to return to class.
Mr. Provo conducted an additional functional
behavior assessment to identify antecedents and
consequences of problems observed in the class-
room, particularly 21. Disturbs other children; 46.
Disrupts group activities; and 52. Shows off,
clowns, or acts silly. Mr. Provo learned that Ricky
became easily frustrated when academic tasks were
too difficult. He would then start clowning and
acting silly or would disturb other children to avoid
doing his work. When this happened, the teacher
scolded Ricky and made him sit alone in the back
of the room for a time-out.
Mr. Provo met with Ricky's teacher and Ms.
Johnson to discuss his observations and learn more
about their perspectives on Ricky's problems. Ms.
Johnson and the teacher agreed that Ricky had dif-
ficulty controlling his temper and that he lacked
social skills for getting along with other children
his age. They also worried that Ricky might be
learning delinquent behaviors from the older
boys in his neighborhood. Mr. Provo noted that
teasing and name-calling seemed to be what
sparked Ricky's fights on the playground. He also
pointed out that a desire to avoid difficult school
work might explain Ricky's disruptive behavior in
the classroom.
Mr. Provo consulted with Ms. Johnson and
Ricky's teacher to develop Tier 2 interventions to
address Ricky's problems. They delineated a clearer
set of playground rules for all children (e.g., no
hitting and fighting, no name calling) and increased
the level of supervision on the playground. They
developed a behavior contract for Ricky to encour-
age positive behaviors (e.g., working quietly, ask-
ing for help when needed, and working coopera-
tively with other children) that would replace un-
desirable behaviors in the classroom. The behav-
ior contract also included bonus points for days
when Ricky did not get into fights on the play-
ground. The teacher sent weekly reports home to
Ms. Johnson showing Ricky's progress toward the
behavioral goals. When Ricky met his weekly
goals, Ms. Johnson provided special rewards at
home (e.g., watching a DVD, getting a special din-
ner, playing a board game).
To improve his social functioning, Ricky was
enrolled in a weekly social skills group conducted
by the school guidance counselor. The teacher and
the guidance counselor also began teaching a so-
cial skills curriculum to the entire class. Ms.
Johnson enrolled Ricky in an after-school recre-
ational and sports program, which introduced him
to a new peer group and increased adult supervi-
sion in after-school hours. To improve his academic
skills, Ricky received daily small group instruc-
tion in reading and math and his teacher checked
his work regularly to ensure that he understood
directions and was staying on task. Ricky received
additional points on his behavior contract for meet-
ing academic goals (e.g., handing in assignments
on time).
Case Management and Outcome Evaluation.
Mr. Provo continued to use the DOF to monitor
Ricky's progress in meeting behavioral goals. To
do this, he trained a teacher aide in the DOF rating
procedures. The teacher aide used the DOF to make
two 10-minute observations of Ricky in the class-
room and two 10-minute observations at recess
each week over a period of 10 weeks. Mr. Provo
scored each set of weekly DOFs and created graphs
of Ricky's scores for DOF On-task and the Intru-
sive, Oppositional, and Aggressive Behavior syn-
drome scales. Mr. Provo and the teacher examined
the graphs each week to evaluate Ricky's progress.
The graphs of DOF scores showed a gradual
decline in the DOF syndrome scale scores and an
increase in On-task scores over the 10-week inter-
vention period. At the end of the 10 weeks, Ricky's
teacher completed a new TRF and Ms. Johnson
completed a new CBCL/6-18 as additional mea-
sures of outcomes for Ricky. The CBCL/6-18 and
TRF both showed lower scores on the Social Prob-
lems, Rule-Breaking and Aggressive Behavior syn-
drome scales than at baseline. These results, com-
bined with the DOF findings, suggested that the
school-based interventions were associated with
reductions in targeted problem behaviors and an
increase in on-task behavior. Although the MDT
could not attribute direct causal effects to the in-
terventions, they felt that the changes in ASEBA
scores justified continuing the Tier 2 interventions
until the end of the school year.
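The weekly monitoring described above can be sketched in a few lines. The manual describes visual inspection of graphs; the least-squares slope used here is a substitute summary chosen for illustration, not the authors' procedure. All scores below are HYPOTHETICAL values invented for the example, and the function names are assumptions.

```python
from statistics import mean


def weekly_means(observations):
    """Average the observation scores recorded within each week."""
    return [mean(week) for week in observations]


def least_squares_slope(values):
    """Slope of an ordinary least-squares line fit to values at weeks 1..n."""
    weeks = range(1, len(values) + 1)
    week_mean = mean(weeks)
    value_mean = mean(values)
    numerator = sum((w - week_mean) * (v - value_mean)
                    for w, v in zip(weeks, values))
    denominator = sum((w - week_mean) ** 2 for w in weeks)
    return numerator / denominator


# HYPOTHETICAL data: two 10-minute classroom observations per week.
on_task = [(4, 5), (5, 5), (5, 6), (6, 7), (7, 7)]
intrusive = [(6.0, 5.5), (5.0, 5.5), (4.5, 4.0), (3.5, 4.0), (3.0, 2.5)]

on_task_trend = least_squares_slope(weekly_means(on_task))
intrusive_trend = least_squares_slope(weekly_means(intrusive))
print(f"On-task change per week: {on_task_trend:+.2f}")
print(f"Intrusive change per week: {intrusive_trend:+.2f}")
```

A positive slope for On-task together with negative slopes for the syndrome scales would match the pattern reported for Ricky. As in the text, such a trend shows an association with the interventions and does not establish a causal effect.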
If the Tier 2 interventions had not been associ-
ated with reductions in Ricky's problems, then the
MDT would have proceeded with a more compre-
hensive Tier 3 evaluation to determine whether
Ricky was eligible for special education services
under the category of emotional disturbance. The
MDT could use the existing CBCL/6-18, TRF, and
In earlier chapters, we described the 2009 DOF
and the DOF Profile. In this chapter, we summa-
rize the research behind previous versions of the
DOF and its scoring profile. We then describe our
research to develop and standardize the scales of
the 2009 DOF Profile. Readers who are not inter-
ested in the details of form and scale construction
should feel free to skim or skip this chapter.
EARLIER VERSIONS OF THE DOF
The original version of the DOF (Achenbach,
1981) included 96 problem items, plus an open-
ended item for other problems, and on-task ratings
for ten 1-minute intervals. To assemble the DOF
item set, Achenbach examined early versions of
the CBCL and TRF to find items appropriate for
rating direct observations of children's behavior
in group settings. (For brevity, we use CBCL here
to refer to all versions.) Whenever possible,
Achenbach retained the original wording of the
CBCL and TRF items for DOF items. Examples
are 2009 DOF items: 5. Defiant or talks back to
staff; 19. Destroys property belonging to others;
41. Physically attacks people; and 71. Unhappy,
sad, or depressed. Other CBCL/TRF items were
worded slightly differently on the DOF to make
them more appropriate for direct observations of
children's behavior. Examples are: rewording
CBCL/TRF item 8. Can't concentrate, can't pay
attention for long to DOF item 7. Doesn't concen-
trate or doesn't pay attention for long; rewording
CBCL/TRF item 10. Can't sit still, restless, or
hyperactive to DOF item 9. Doesn't sit still, rest-
less, or hyperactive; and rewording CBCL/TRF
item 19. Demands a lot of attention to DOF item
17. Demands or tries to get attention of staff (later
abbreviated to 17. Tries to get attention of staff).
Seventy-two of the 96 original DOF items had
counterparts on the CBCL and 86 had counterparts
on the TRF. Ten of the 96 original DOF items had
no direct counterpart on the CBCL and/or TRF. Ex-
amples are 2009 DOF items: 81. Easily led by
peers; 85. Tattles; and 86. Bossy (for details
of the 1981 DOF, see Achenbach & Edelbrock,
1983).
A revised edition of the DOF (Achenbach, 1986)
included the same 96 problem items, plus an open-
ended item for other problems. Some items were
reworded slightly for clarification. An example is
1981 DOF item 34. Isolates self from others that
was reworded to 34. Physically isolates self from
others.
The 1986 DOF Profile displayed scores for six
syndrome scales derived from principal compo-
nents/varimax analyses of 212 clinically referred
5-to-14-year-old children: Withdrawn-Inattentive,
Nervous-Obsessive, Depressed, Hyperactive, At-
tention Demanding, and Aggressive. The profile
also displayed scores for Internalizing and Exter-
nalizing scales derived from second-order factor
analyses of scores on the six syndrome scales, plus
a Total Problems score (the sum of ratings on all
96 items plus other problems) and an On-task
score. The 1986 DOF problem scales were normed
on 287 children observed as controls for referred
children in regular classrooms of 45 schools in
Vermont, Nebraska, and Oregon. The 1986 DOF
computer-scoring program calculated raw scores
for the syndrome scales, plus raw scores and T
scores for Internalizing, Externalizing, and Total
Problems, averaged across observation sessions
separately for the identified child and matched con-
trols. The 1986 DOF profile also provided an On-
task score ranging from 0 to 10, averaged across
observation sessions for identified and control chil-
dren.
Constructing the DOF and DOF Profile
Chapter 6
Several studies have reported data on the reli-
ability, stability, and validity of the 1981 and 1986
versions of the DOF. Achenbach and Edelbrock
(1983) reported inter-rater reliabilities (Pearson r)
of .96 for DOF Total Problems and .71 for On-task
for 16 children observed in a residential treatment
setting by two trained research assistants. In the
same residential setting, 6-month stability Pearson
correlations for 36 children were .55 for classroom
observations, .59 for recess observations, and .51
for classroom On-task scores. Scores for each child
were averaged over six 10-minute observations at
Time 1 and Time 2. These stability coefficients for
the DOF were similar to the 6-month stability co-
efficient of .57 for CBCL ratings of the same chil-
dren by their mother or a child care worker.
For a sample of 25 public school boys referred
for special services for behavioral problems, Reed
and Edelbrock (1983) reported inter-rater
reliabilities of .91 for DOF Total Problems and .83
for On-task ratings summed across six 10-minute
observations in a 60-minute observation period.
They also reported mean inter-rater reliabilities of
.85 for Total Problems and .71 for On-task for the
same sample when rs for each of the 6 sets of 10-
minute observations were averaged across sessions.
For a sample of 62 randomly selected 6-11-year-
old boys, McConaughy, Achenbach, and Gent
(1988) reported inter-rater reliabilities of .75 for
DOF Total Problems and .88 for On-task scores
averaged across one 10-minute classroom obser-
vation and one 10-minute recess observation for
each child. As part of a school-based prevention
study, McConaughy, Kay, and Fitzgerald (1998,
1999) reported reliabilities for five pairs of trained
DOF raters. Each rater pair observed 20 randomly
selected elementary school-aged children in class-
rooms for one 10-minute period. Inter-rater
reliabilities averaged across rater pairs were .86
for DOF Total Problems and .90 for On-task.
Achenbach and Rescorla (2001) reported mean
inter-rater reliabilities of .90 for DOF Total Prob-
lems and .84 for On-task scores (after Fisher z trans-
formations), averaged across the studies of the 1981
and 1986 DOF discussed above.
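Averaging correlations after Fisher z transformation, as mentioned above, can be illustrated with a minimal sketch. Each r is converted to z = atanh(r), the z values are averaged, and the mean is converted back with tanh. The function name is hypothetical, and the choice of one On-task coefficient per study is an assumption about which values were pooled.

```python
import math


def fisher_mean(correlations):
    """Average correlations via Fisher's z transformation."""
    z_values = [math.atanh(r) for r in correlations]
    return math.tanh(sum(z_values) / len(z_values))


# On-task inter-rater reliabilities quoted above, one per study (assumed pooling).
on_task_rs = [0.71, 0.83, 0.88, 0.90]
print(round(fisher_mean(on_task_rs), 2))  # prints 0.84
```

For positive correlations the Fisher mean is never lower than the simple arithmetic mean (0.83 here), because the transformation gives more weight to values close to 1.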
As evidence for the validity of the DOF, Reed
and Edelbrock (1983) reported that DOF Total
Problems and On-task scores discriminated signifi-
cantly between referred and nonreferred boys ob-
served in the same classroom settings by observ-
ers blind to referral status. The convergent validity
of the DOF was supported by significant correla-
tions of .37 to .51 between DOF Total Problems
and TRF Total Problems (Achenbach & Edelbrock,
1986; Reed & Edelbrock, 1983) and significant
associations between DOF scores and CBCL pro-
file patterns (McConaughy et al., 1988).
McConaughy et al. (1998, 1999) also found that
DOF On-task, Internalizing, Nervous-Obsessive,
and Depressed scale scores significantly discrimi-
nated between outcomes for at-risk children who
received different school-based programs to pre-
vent serious emotional disturbance.
In another study, Skansgaard and Burns (1998)
added nine items to the 1986 DOF to create a priori
problem scales for Inattention, Hyperactivity/Im-
pulsivity, Oppositional Defiant Disorder/overt
Conduct Disorder (ODD/overt CD), and Slow
Cognitive Tempo (SCT). For a sample of 24 chil-
dren, inter-rater reliabilities were .97, .95, .99, and
.69 for each of these four scales, respectively, plus
1.00 for the DOF On-task score. To test the dis-
criminative validity of this 106-item DOF,
Skansgaard and Burns grouped the 24 children into
an ADHD-Combined subtype (ADHD-C; n = 6),
ADHD-Inattentive subtype (ADHD-IN; n = 6), and
matched controls (n = 12), based on percentile
cutpoints on teachers' ratings of DSM-IV ADHD
symptoms. Despite the small sample sizes,
Skansgaard and Burns found that the ADHD-C and
ADHD-IN groups both scored significantly higher
on the DOF Inattention scale and lower on DOF
On-task scores than matched controls. The ADHD-
C group scored significantly higher than the
ADHD-IN group and controls on the DOF Hyper-
activity/Impulsivity and ODD/overt CD scales. The
ADHD-IN group scored significantly higher than
controls on the DOF SCT scale. Similar group dif-
ferences were reported for teachers' ratings on com-
parable subsets of DSM-IV symptoms, except that
ADHD-IN scored significantly higher than both
ADHD-C and controls on SCT.
In 2003, we created an expanded version of the
1986 DOF for use in a large study of children with
behavioral and emotional problems and matched
controls (McConaughy & Achenbach, 2003). The
2003 DOF included 114 specific problem items,
plus an open-ended item for the observer to write
in any observed problems or behaviors not listed
above. We retained 95 of the 1986 DOF problem
items, plus the open-ended item for other problems.
We added 19 new items to expand the 1986 DOF
for research, as follows:
We replaced the 1986 item, 4. Behaves like op-
posite sex, with a new item, 4. Cheats. We changed
the 1986 item, 75. Underactive, slow moving, lacks
energy, or yawns, by removing the word "yawns"
and created a new item 114. Yawns. We changed
the 1986 item, 83. Fails to express self clearly, in-
cluding speech defects, by removing the words
"including speech defects" and created a new item
97. Speech problem (describe). We created 16 ad-
ditional items to correspond to TOF items and prob-
lems that observers had written in for the open-
ended item in earlier research, plus items to tap
DSM-IV (American Psychiatric Association, 1994)
symptoms for ADHD that were not covered by the
original 1986 DOF items. We also slightly re-
worded eleven 1986 DOF items.
We used data from samples rated on the 1986
DOF and the 2003 DOF to develop the 2009 DOF
with 89 problem items and to derive scales for the
2009 DOF Profile, as discussed in the next sec-
tions. (Appendix D shows the final 89 items of the
2009 DOF in comparison to the 97 items of the
1986 DOF.)
PSYCHOMETRIC APPROACH TO
THE 2009 DOF
Consistent with previous research, we designed
the 2009 DOF to obtain direct observational data
on children's problems and on-task behavior in
school classrooms, recess, and comparable group
settings. As part of the ASEBA, the DOF yields
scores for observed problems that can be meshed
with data from other sources, such as parent re-
ports, teacher reports, children's self-reports, test
session observations, and observations from child
clinical interviews. This facilitates a multiaxial
approach to assessment, as discussed in Chapter
1.
To develop the 2009 DOF and DOF Profile, we
used a psychometric approach similar to that used
for other ASEBA forms and profiles, including the
CBCL/6-18, TRF, and YSR (Achenbach &
Rescorla, 2001), SCICA (McConaughy &
Achenbach, 2001), and TOF (McConaughy &
Achenbach, 2004). We used the following proce-
dures:
1. We selected and tested a pool of items that de-
scribe observable aspects of children's behav-
ior, affect, and interactions in group settings.
2. We obtained observers' ratings of problem items
for 6-11-year-old clinically referred children and
matched control children in the same settings.
3. We factor analyzed the observers' ratings to ag-
gregate problem items into quantitative syn-
drome scales.
4. We constructed a DSM-oriented Attention Defi-
cit/Hyperactivity Problems scale comprised of
problem items that are consistent with DSM-IV-
TR criteria for ADHD.
5. We constructed a Total Problems score consist-
ing of the sum of ratings on the 89 problem
items.
6. We assigned standard scores (T scores) and per-
centiles for the DOF problem scales and the On-
task score that indicate how a child's scores com-
pare with scores for normative samples of chil-
dren.
7. We tested the problem scales and On-task score
for reliability and validity.
The next sections describe our research for steps
1 through 6. Chapter 7 reports reliability and Chap-
ter 8 reports validity for the 2009 DOF scales.
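Step 5 above can be illustrated with a minimal sketch: the Total Problems score for one DOF is the sum of the 0-1-2-3 ratings on the problem items. The averaging across a child's observation sessions follows the practice described earlier for the 1986 profile and is assumed here for illustration. Function names and the ratings are hypothetical, and items that were not rated are treated as 0.

```python
from statistics import mean


def total_problems(ratings):
    """Sum the 0-3 ratings over the problem items of one DOF.

    ratings maps item number -> rating; items not listed count as 0.
    """
    for item, rating in ratings.items():
        if rating not in (0, 1, 2, 3):
            raise ValueError(f"item {item}: rating must be 0, 1, 2, or 3")
    return sum(ratings.values())


def averaged_total(dofs):
    """Average Total Problems across a child's observation sessions."""
    return mean(total_problems(dof) for dof in dofs)


# HYPOTHETICAL ratings from two 10-minute observations of one child.
session_1 = {3: 2, 20: 1, 66: 1}   # e.g., 3. Argues rated 2
session_2 = {3: 1, 30: 3}
print(total_problems(session_1), total_problems(session_2))
print(averaged_total([session_1, session_2]))
```

Validating each rating before summing guards against data-entry errors such as a rating of 4, which would otherwise inflate the total without any warning.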
STATISTICAL DERIVATION OF DOF
SYNDROMES FOR CLASSROOM
OBSERVATIONS
Like other ASEBA forms, the DOF was devel-
oped both to document specific problems and to
identify co-occurring problems. We used various
statistical procedures to identify syndromes of co-
occurring problems. The original Greek meaning
of the word syndrome is "the act of running together."
Although syndrome is often equated with dis-
ease, its most general meaning is "a set of concur-
rent things" (Gove, 1971). Consistent with this
meaning, we performed a series of factor analyses
to identify syndromes of children's problems that
were observed in school classrooms.
Factor analysis refers to a family of statistical
methods for identifying patterns of co-occurring
items. Because different factor-analytic approaches
may produce different results, we used several ap-
proaches that included both exploratory factor
analysis (EFA) methodology (SPSS, 2007) and
confirmatory factor analysis (CFA) methodology
(Mplus; Muthén & Muthén, 2004). EFA yields fac-
tors that summarize the associations among prob-
lem items without testing specific models for the
factor structure, whereas CFA tests the fit of data
to particular measurement structures.
Samples for Factor Analyses
For the factor analyses, we assembled a sample
of 1,261 children ages 6-12 who were observed in
school classrooms. Of these, 486 were rated on the
1986 DOF and 775 were rated on the 2003 DOF.
The total sample included 649 children who were
clinically referred for evaluations of behavioral and
emotional problems and/or learning difficulties or
were identified as at risk for such problems (for
brevity, we are labeling this group "referred"). The
referred samples were drawn from outpatient clin-
ics and schools in Vermont, New York, and Penn-
sylvania. The Vermont sample included children
referred to the outpatient clinic of the University
of Vermont Department of Psychiatry. Children in
the Vermont sample were drawn from urban, semi-
urban, and rural areas. The New York sample in-
cluded children referred to the outpatient clinic of
the Department of Psychiatry at SUNY Upstate
Medical University in Syracuse, New York. The
Pennsylvania sample included children referred to
the outpatient clinic of The Children's Hospital of
Philadelphia. An additional 612 children were ran-
domly selected control children in the same class-
rooms as the referred children. The total sample of
1,261 children included 873 boys and 388 girls,
each having 2 to 4 DOFs. Each DOF covered a 10-
minute observation.
From the total sample of 1,261 children, we se-
lected two samples for EFAs. One sample included
955 children who were rated on either the 1986 or
2003 DOF. This sample included 649 referred chil-
dren plus 306 matched control children who scored
above the DOF median Total Problems score of
5.9 that we found for controls. As explained later,
this sample was used for EFAs of 41 high frequency
problem items included in both the 1986 and 2003
DOF. The second EFA sample was a subset of
the first sample of 955 children and included
613 children who were rated only on the 2003 DOF
(335 referred and 278 controls). This sample was
used for EFAs of 57 high frequency problem items
included in the 2003 DOF. We used two different
samples for initial EFAs to determine whether the
factor structure differed for the 41-item set versus
the larger 57-item set from the 2003 DOF. The
sample for subsequent CFAs included all 1,261
children (649 referred and 612 controls). We com-
bined boys and girls in all analyses to maximize
our sample sizes. Table 6-1 summarizes the
samples used for factor analyses.
Items for Factor Analyses
To select DOF items for factor analyses, we
omitted the open-ended item for other problems
and combined two new items on the 2003 DOF
with two other items with similar wording, reduc-
ing the total item set to 112 items. Of these, 95
items were included on both the 1986 DOF and
2003 DOF. To obtain item frequencies, we aver-
aged the 0, 1, 2, and 3 ratings for each of 112 DOF
items across the 2 to 4 DOFs for each of the 1,261
children in the total sample shown in Table 6-1.
From the frequency distributions of averaged item
ratings, we identified 55 low frequency items that
were rated present (>0.00) for fewer than 5% of
referred children and fewer than 3% of referred
and control children combined. Omitting these 55
low frequency items, we retained 57 items from
the 2003 DOF that were rated present (>0.00) for
>5% of referred children and for >3% of referred
and control children combined. We then identified
41 of the 57 items from the 2003 DOF that were
also rated on the 1986 DOF. We used these two
item sets for initial EFAs, as described in the next
section.
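The item screening described above can be sketched as follows. Ratings are averaged across each child's DOFs, an item counts as present when the average exceeds 0.00, and an item is treated as low frequency when it is present for fewer than 5% of referred children and fewer than 3% of referred and control children combined. The data structure, function names, item number 99, and the toy sample are all hypothetical. How rates falling exactly on a cutpoint are handled is also an assumption.

```python
from statistics import mean


def average_rating(dofs, item):
    """Mean 0-3 rating of one item across a child's DOFs (unrated = 0)."""
    return mean(dof.get(item, 0) for dof in dofs)


def present_rate(children, item):
    """Proportion of children whose averaged rating on the item exceeds 0."""
    flags = [average_rating(child["dofs"], item) > 0 for child in children]
    return sum(flags) / len(flags)


def is_low_frequency(children, item, referred_cut=0.05, combined_cut=0.03):
    """Apply the two-part low-frequency rule described in the text."""
    referred = [c for c in children if c["referred"]]
    return (present_rate(referred, item) < referred_cut
            and present_rate(children, item) < combined_cut)


def make_child(referred, rated_items):
    # Two DOFs per child; each listed item is rated 1 on the first DOF only.
    return {"referred": referred, "dofs": [{i: 1 for i in rated_items}, {}]}


# HYPOTHETICAL sample: 40 referred children and 60 controls.
children = (
    [make_child(True, [8, 99])]
    + [make_child(True, [8]) for _ in range(9)]
    + [make_child(True, []) for _ in range(30)]
    + [make_child(False, []) for _ in range(60)]
)
retained = [item for item in (8, 99) if not is_low_frequency(children, item)]
print(retained)
```

In this toy sample, item 8 is present for 25% of referred children and is retained, whereas the made-up item 99 is present for 2.5% of referred children and 1% of the combined sample and is dropped.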
Factor-Analytic Methods
EFAs for Deriving Factors. As a general strat-
egy, we performed exploratory Maximum Likeli-
hood (ML), Unweighted Least Squares (ULS), and
Principal Components Analyses (PCA) of Pearson
Table 6-1
Samples for Factor Analyses to Derive 2009 DOF Syndromes

                                                  Boys   Girls   Total
Total Sampleᵃ
  Referred children                                464     185     649
  Controls                                         409     203     612
  Total                                            873     388   1,261
Sample for EFAs of 41 high frequency items
from 1986 & 2003 DOFᵇ
  Controls                                         219      87     306
  Total                                            683     272     955
Sample for EFAs of 57 high frequency items
from 2003 DOFᶜ
  Controls                                         197      81     278
  Total                                            428     185     613
Ethnicityᵈ
  Non-Latino White                                               79.4%
  African American                                               14.5%
  Latino/Hispanic                                                 2.0%
  Mixed or Other                                                  4.1%

ᵃ This sample was used for CFAs.
ᵇ This sample was a subset of N = 1,261.
ᶜ This sample was a subset of N = 955.
ᵈ Percents for N = 1,124 referred and control children for whom ethnicity was known.
for each child. The single-factor WLS analyses
were applied to 3,533 DOFs for the entire sample
of 1,261 children. The single-factor WLS analy-
ses identified five unidimensional factors, with
29 items loading on only one factor and 16 items
loading on more than one factor.
CFA Tests of the 5-Factor Model. To assign
items to only one factor for a final 5-factor model,
we performed a correlated 5-factor WLS analysis
of the candidate factors for the 3,533 DOFs for the
total sample of 1,261 children. From these analy-
ses, we identified 43 items meeting criteria (a) and
(b) above: 10 for Factor 1; 7 for Factor 2; 12 for
Factor 3; 6 for Factor 4; and 8 for Factor 5. An
additional item loaded .18 on Factor 2. Only one
of the 45 items from the single-factor solutions
failed to load significantly on any factor in the test
of the 5-factor model.
We then examined correlations between di-
chotomous item scores (0 vs. 1-2-3) and latent vari-
ables for the five factors for all items dropped from
the factors in previous analyses. We looked for
items that had correlations > .40 with at least one
factor and a difference > .10 between that correla-
tion and correlations with the remaining four fac-
tors. Seven items met these criteria. To obtain a
final 5-factor solution, we then tested models with
and without these seven items for the 3,533 DOFs
for the total sample of 1,261 children.
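The screening rule described above can be illustrated with a brief sketch. It is not the authors' software: the function name and the example correlations are hypothetical, and the sketch assumes that the thresholds apply to the correlations as reported rather than to their absolute values.

```python
def candidate_factor(correlations, min_r=0.40, min_gap=0.10):
    """Return the index of the factor an item could be added to, or None.

    correlations: the item's correlations with the five latent factors.
    The item qualifies if its highest correlation exceeds min_r and exceeds
    each of the other four correlations by more than min_gap.
    """
    best = max(range(len(correlations)), key=lambda i: correlations[i])
    top = correlations[best]
    if top <= min_r:
        return None
    for i, r in enumerate(correlations):
        if i != best and top - r <= min_gap:
            return None
    return best


# Hypothetical correlations, not values from the DOF samples.
print(candidate_factor([0.46, 0.21, 0.30, 0.12, 0.18]))  # qualifies for factor index 0
print(candidate_factor([0.46, 0.41, 0.30, 0.12, 0.18]))  # too close to a second factor
```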
To obtain a final 5-factor solution, we used CFA
methodology in an exploratory manner, rather than
seeking confirmation of factor models. We ex-
amined solutions for the following characteristics:
(a) proper convergence; (b) no out-of-range param-
eter estimates; (c) reasonable model fit; and (d) re-
tention of items with factor loadings > .20 and sig-
nificant at p <.01. For the final 5-factor solution,
52 items met criteria (a) through (d): 12 for Factor
1; 10 for Factor 2; 12 for Factor 3; 8 for Factor 4;
and 10 for Factor 5. We evaluated goodness-of-fit
between the data and the models with the Root
Mean Square Error of Approximation (RMSEA;
Browne & Cudeck, 1993), which has been recommended
as the best measure of fit (Loehlin, 1998).
correlations among the retained DOF high fre-
quency items. The initial EFAs that yielded 3 to 10
factors were subjected to Varimax (orthogonal)
rotations to produce uncorrelated factors and
Oblimin (oblique) rotations to allow correlations
among factors. Using these general strategies, we
performed six separate EFAs (3 methods x 2 rota-
tions) on the 41 high frequency items included on
the 1986 DOF and 2003 DOF, using the sample of
children rated on either of the two forms (N =
955). We then performed an additional six EFAs
on the 57 high frequency items included on the
2003 DOF, using the sample of children rated only
on the 2003 DOF (N = 613). We found similar fac-
tor structures from the 1986/2003 DOF 41-item
set and the analyses of the 2003 DOF 57-item set.
We therefore used solutions from the EFAs of the
2003 DOF 57-item set for our next analyses.
We identified five factors that were similar in
the six EFAs of the 2003 DOF 57-item set. We
retained DOF items that had loadings > .20 and p
<.01 on at least one of the five factors. The differ-
ent factor extraction and rotation methods thus
collectively contributed results that were subse-
quently tested via CFA methods, as described in
the next two sections.
CFAs for Evaluating the Unidimensionality of
Factors. To test the unidimensionality of factors
derived in the foregoing analyses, we applied single-factor
Weighted Least Squares (WLS) analyses to
candidate items comprising each of the five candidate
factors. Items that loaded on multiple versions
of a particular factor were included if they met the
following criteria: (a) the item's factor loading had
to be significant at p < .01, i.e., the estimated factor
loading had to be at least 2.57 times its standard
error, and (b) the loading had to be > .20. To avoid
statistical risks associated with low frequency cells,
we applied the WLS analyses to tetrachoric corre-
lations between item scores dichotomized 0 vs. 1,
2, and 3. Because Mplus can take account of de-
pendency in a data set (i.e., more than one DOF
per subject), we were able to analyze ratings from
each DOF separately, rather than entering the mean
of the ratings on each item for all DOFs completed
Table 6-2
Factor Loadings of Items on the DOF Syndrome Scales
for Classroom Observations
DOF Syndrome Scale Factor Loading
I. Sluggish Cognitive Tempo
11. Confused or seems to be in a fog .71
15. Daydreams or gets lost in thoughts .53
27. Forgetful in activities or tasks .47
44. Apathetic, unmotivated, or won't try .78
51. Slow to respond verbally .55
53. Shy or timid .32
57. Stares blankly .58
60. Yawns .22
70. Underactive, slow moving, tired, or lacks energy .50
71. Unhappy, sad or depressed .61
II. Immature/Withdrawn
1. Acts too young for age .87
25. Difficulty organizing activities or tasks .49
26. Fails to give close attention to details .57
34. Physically isolates self from others .43
39. Loses things .58
49. Avoids or is reluctant to do tasks that require sustained mental effort .62
59. Wants to quit or does quit tasks .66
61. Strange behavior .27
75. Withdrawn, doesn't get involved with others .31
77. Fails to express self clearly .36
III. Attention Problems
7. Doesn't concentrate or doesn't pay attention for long .74
9. Doesn't sit still, restless, or hyperactive .61
13. Fidgets, including with objects .55
24. Eats, drinks, chews, or mouths things that are not food, excluding junk foods .31
42. Picks or scratches nose, skin, or other parts of body .23
56. Easily distracted by external stimuli .62
76. Sucks thumb, fingers, hand, or arm .28
82. Clumsy, poor motor control .52
IV. Intrusive
8. Difficulty waiting turn in activities or tasks .73
17. Tries to get attention of staff .45
21. Disturbs other children .63
32. Interrupts .68
33. Impulsive or acts without thinking, including calling out in class .66
The RMSEA for the final 5-factor solution was
.024, which was well within the range <.07 gener-
ally considered to indicate good fit.
Results of Factor Analyses
Table 6-2 shows the five DOF syndrome scales
with the factor loadings for each item derived from
the final 5-factor solution. The names of the syn-
drome scales reflect the content of the items com-
prising each factor. We chose names that were con-
sistent with current literature and with the names
of scales derived from similar factor analyses of
the 1986 DOF and other ASEBA forms. The syn-
dromes are numbered in Table 6-2 according to
the order in which they appear on the DOF Pro-
file. The order of the syndrome scales was deter-
mined by subsequent second-order factor analyses
described in a later section.
The DOF Sluggish Cognitive Tempo syndrome
includes 10 items describing confusion, lack of mo-
tivation, and underactivity. Items with the highest
factor loadings were: 11. Confused or seems to be
in a fog; 44. Apathetic, unmotivated, or won't try;
and 71. Unhappy, sad, or depressed. Five items
(11, 15, 44, 57, and 70) were consistent with the
2007 Sluggish Cognitive Tempo scales created for
the CBCL/6-18 and TRF (Achenbach & Rescorla,
2007). Interestingly, symptoms of sluggish cogni-
Table 6-2 (cont.)
DOF Syndrome Scale Factor Loading
45. Responds before instructions are completed .41
46. Disrupts group activities .79
55. Demands must be met immediately, easily frustrated .73
65. Talks too much .50
72. Unusually loud .65
78. Impatient .76
81. Easily led by peers .51
V. Oppositional
2. Makes odd noises .51
3. Argues .68
5. Defiant or talks back to staff .75
16. Difficulty following directions .67
20. Disobedient .76
22. Doesn't seem to feel guilty after misbehaving .87
23. Doesn't seem to listen to what is being said .60
43. Runs about or climbs excessively .50
52. Shows off, clowns, or acts silly .40
74. Whining tone of voice .58
83. Doesn't get along with peers .54
87. Complains .71
Note. N = 1,261 with 3,533 DOFs for the final Weighted Least Squares factor analyses of tetrachoric
correlations of dichotomous item scores; referred children, n = 649; matched controls, n = 612. Values in
bold show the three highest loadings for each factor.
gressive Behavior and SCICA Self-Control Prob-
lems syndromes. Examples are: 3. Argues; 5. De-
fiant or talks back to staff; and 20. Disobedient.
The Oppositional syndrome does not include prob-
lems reflecting physical aggression, such as fight-
ing, which are unlikely to be observed in class-
room settings.
LOW FREQUENCY ITEMS RETAINED
ON THE DOF
To finalize the DOF, we examined the frequency
distributions of averaged item ratings for the
sample of 649 referred children who were rated on
the 1986 DOF or 2003 DOF in their classrooms,
plus 232 of the referred children who were rated
on the 1986 DOF during recess. We retained items
that were scored present (>0.00) for >2% of the
classroom and recess samples. We retained 23
items from classroom observations and an additional
7 items from recess observations. These 30
items, plus 6 additional items that were not on the
five DOF syndromes and open-ended item 89
(Other problems not listed above), were grouped together
as Other Problems, as shown in Table 6-3.
The 37 Other Problems, plus the 52 items on the
DOF syndromes, are included in the final 2009
version of the DOF, which thus has 88 specific
problem items, plus one open-ended item. The 0-
1-2-3 ratings on all 89 items are summed to com-
pute the DOF Total Problems score, as explained
in a later section.
AGGRESSIVE BEHAVIOR SYNDROME
FOR RECESS OBSERVATIONS
As explained in previous sections, five DOF
syndrome scales were derived from observations
of children in classroom settings. Because activi-
ties in classrooms are often teacher-directed and
structured around a curriculum, children may be
less likely to exhibit certain types of problem be-
haviors in the classroom than in less structured
settings, such as recess. Examples are getting into
fights, teasing, and being teased. To determine
whether there were any syndromes for recess ob-
tive tempo were tested for possible inclusion in
the DSM-IV criteria for ADHD, but were not in-
cluded in the final criteria (Frick et al., 1994). As
described in an earlier section, a prior research
study using an expanded version of the 1986 DOF
showed a significant association between a DOF
scale labeled Slow Cognitive Tempo and the In-
attentive type of ADHD (Skansgaard & Burns,
1998).
The DOF Immature/Withdrawn syndrome in-
cludes 10 items describing problems of immatu-
rity and disorganization, along with withdrawn be-
havior. Items with the highest factor loadings were:
1. Acts too young for age; 49. Avoids or is reluctant
to do tasks that require sustained mental effort;
and 59. Wants to quit or does quit tasks. Five
items (25, 26, 39, 49, and 59) were consistent with
items comprising a DOF DSM-oriented Inattention
scale described in a later section. Item 75. Withdrawn,
doesn't get involved with others was consistent
with similar items on the CBCL/6-18, TRF,
and TOF Withdrawn/Depressed syndrome, as well
as the 1986 DOF Withdrawn-Inattentive syndrome.
The DOF Attention Problems syndrome in-
cludes eight items describing difficulty with atten-
tion and restlessness. Items with the highest factor
loadings were: 7. Doesn't concentrate or doesn't
pay attention for long; 9. Doesn't sit still, restless,
or hyperactive; and 13. Fidgets, including with ob-
jects. Four items (7, 9, 13, and 56) were consistent
with items comprising the 1986 DOF Hyperactive
scale and the TOF and TRF Attention Problems
scales.
The DOF Oppositional syndrome was the most
robust factor to emerge from our analyses. It includes
12 items that reflect oppositional or uncooperative
behavior. The highest loading items were:
5. Defiant or talks back to staff; 20. Disobedient;
and 22. Doesn't seem to feel guilty after misbehaving.
Five items (3, 5, 20, 52, and 105) were consistent
with items on the TOF Oppositional syndrome.
The Oppositional syndrome also contains items
with counterparts on the CBCL/6-18 and TRF Ag-
Table 6-3
DOF Other Problems Item Set
DOF Items
4. Cheats
6. Brags, boasts
10. Clings to adults or too dependent
12. Cries
14. Cruel, bullies, or mean to others
18. Destroys own things
19. Destroys property belonging to others
28. Out of seat
29. Gets hurt, accident prone
30. Gets in physical fights
31. Gets teased
35. Lies
36. Bites fingernails
37. Nervous, highstrung, or tense
38. Nervous movements, twitching, tics, or other unusual movements (describe):
40. Too fearful or anxious
41. Physically attacks people
47. Screams
48. Secretive, keeps things to self, including refusal to show things to teacher
50. Self-conscious or easily embarrassed
54. Explosive or unpredictable behavior
58. Speech problem (describe):
62. Stubborn, sullen, or irritable
63. Sulks
64. Swears or uses obscene language
66. Teases
67. Temper tantrums, hot temper, or seems angry
68. Threatens people
69. Too concerned with neatness or cleanliness
73. Overly anxious to please
79. Tattles
80. Repeats behavior over & over; compulsions (describe):
84. Runs out of class (or similar setting)
85. Behaves irresponsibly (describe):
86. Bossy
88. Afraid to make mistakes
89. Other problems not listed above
servations that were not identified in classroom
observations, we performed additional factor analy-
ses of 35 items from the Other Problems shown
in Table 6-3. We excluded item 28. Out of seat,
which was not on the 1986 DOF, and would not
have been relevant for recess observations. We also
excluded open-ended item 89. The factor analyses
were performed on the sample of 232 clinically re-
ferred children ages 6-11 whose recess observa-
tions were rated on the 1986 DOF, plus 248 matched
control children. (Of the 232 referred children, 124
had two matched control children in the same re-
cess setting.) Each child was rated on the DOF for
two 10-minute observations during recess, alter-
nating between control and referred children.
From the total sample of 480 children observed
at recess, we obtained frequency distributions of
averaged 0, 1, 2, 3 item ratings for each of the 35
items. From the frequency distributions of averaged
item ratings, we identified 12 DOF items that were
scored present (>0.00) for >5% of referred chil-
dren and >3% of control children. We then applied
single-factor ULS analyses to the 12 candidate
items to test the unidimensionality of a single-fac-
tor solution. Consistent with criteria for the five
syndromes for classroom observations, we retained
items with (a) factor loadings significant at p < .01,
i.e., the estimated factor loading had to be at least
2.57 times its standard error, and (b) loadings
> .20. Table 6-4 shows the factor loadings for the
nine items that met these criteria for a recess ob-
servation scale, which we labeled Aggressive Be-
havior. As expected, the Aggressive Behavior syn-
drome included problems with physical aggression
as well as other social problems, such as teasing
and being teased. The three highest loading items
were: 30. Gets in physical fights; 41. Physically
attacks people; and 14. Cruel, bullies, or mean to
others. Most of the items comprising the DOF
Aggressive Behavior syndrome for recess obser-
vations have counterparts on the CBCL/6-18 and
TRF Aggressive Behavior syndromes (Achenbach
& Rescorla, 2001).
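The two-stage selection of recess items described above (a frequency screen followed by loading criteria) can be sketched as follows. This is an illustration only: the function names and the standard errors in the example are hypothetical, and the significance test is assumed to be the ratio of the loading to its standard error.

```python
def passes_frequency_screen(pct_referred, pct_control,
                            min_referred=5.0, min_control=3.0):
    """Item scored present (>0.00) for >5% of referred and >3% of control children."""
    return pct_referred > min_referred and pct_control > min_control


def passes_loading_criteria(loading, std_error, min_ratio=2.57, min_loading=0.20):
    """Criterion (a): loading / SE of at least 2.57 (p < .01); criterion (b): loading > .20."""
    return loading / std_error >= min_ratio and loading > min_loading


# Hypothetical percentages and standard errors; only the thresholds come from the text.
print(passes_frequency_screen(8.2, 4.1))    # True
print(passes_frequency_screen(8.2, 2.0))    # False
print(passes_loading_criteria(0.21, 0.05))  # True
print(passes_loading_criteria(0.25, 0.12))  # False: ratio is below 2.57
```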
DSM-ORIENTED ATTENTION DEFICIT/
HYPERACTIVITY PROBLEMS AND
INATTENTION AND HYPERACTIVITY-
IMPULSIVITY SUBSCALES
To aid practitioners and researchers in diagnos-
tic assessments, Achenbach and Rescorla (2001)
Table 6-4
Factor Loadings of Items on the DOF Aggressive Behavior
Syndrome Scale for Recess Observations
DOF Items Factor Loading
14. Cruel, bullies, or mean to others .40
30. Gets in physical fights .52
31. Gets teased .21
41. Physically attacks people .45
47. Screams .23
63. Sulks .34
66. Teases .24
79. Tattles .33
86. Bossy .23
Note. N = 480 for Unweighted Least Squares single-factor analyses of averaged item scores; referred
children, n = 232; matched controls, n = 248. Values in bold show the three highest factor loadings.
tained mental effort. Three other DOF items were
also consistent with DSM-IV ADHD symptoms:
28. Out of seat; 45. Responds before instructions
are completed; and 55. Demands must be met im-
mediately, easily frustrated. To cover all possible
DSM-IV ADHD symptoms, we added the above
11 items to the 12 items that the experts judged to
be very consistent with DSM-IV ADHD symptoms.
We then assigned the 23 items to Inattention
and Hyperactivity-Impulsivity subscales, as shown
in Table 6-5. Of these 23 items, 21 were similar to
items on the TOF Attention Deficit/Hyperactivity
Problems scale and its Inattention and Hyperac-
tivity-Impulsivity subscales (McConaughy &
Achenbach, 2004). Items in italic are similar to
items identified by experts for the CBCL/6-18 and
TRF Attention Deficit/Hyperactivity Problems
scales, while non-italicized items are the additional
DOF items consistent with DSM-IV symptoms.
The Attention Deficit/Hyperactivity Problems To-
tal score is the sum of the 0, 1, 2, and 3 ratings for
all 23 items.
NORMATIVE SAMPLE
For classroom observations, the DOF norma-
tive sample included 661 children ages 6-11, as
shown in Table 6-6. These were randomly selected
children in general education classrooms in four
states: Arizona (n = 65), New York (n = 146), Penn-
sylvania (n = 172), and Vermont (n = 278). The
DOF normative sample for recess observations
included 244 Vermont children ages 6-11, who
were a subsample of the normative sample for
classroom observations. Each child in the norma-
tive samples was observed and rated on the DOF
for two to four 10-minute periods for classroom
observations and two 10-minute observations for
recess observations. The 0-1-2-3 ratings on each
of the DOF items were averaged across the 2 to 4
DOFs for each child. The averaged item scores
were then summed to obtain total raw scores for
each of the relevant DOF scales for classroom ob-
servations and recess observations.
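The averaging and summing steps described above can be sketched in a few lines. The function names and the ratings are hypothetical; the item numbers are those listed for the Attention Problems syndrome in Table 6-2.

```python
from statistics import mean


def averaged_item_scores(dofs):
    """Average each item's 0-1-2-3 rating across a child's DOFs.

    dofs: list of dicts, one per DOF, mapping item number -> rating.
    """
    return {item: mean(d[item] for d in dofs) for item in dofs[0]}


def scale_raw_score(averaged, scale_items):
    """Sum the averaged item scores for the items on one scale."""
    return sum(averaged[item] for item in scale_items)


ATTENTION_PROBLEMS = [7, 9, 13, 24, 42, 56, 76, 82]

# Two hypothetical 10-minute observations of one child.
dof_1 = {7: 2, 9: 1, 13: 0, 24: 0, 42: 0, 56: 1, 76: 0, 82: 0}
dof_2 = {7: 1, 9: 1, 13: 1, 24: 0, 42: 0, 56: 0, 76: 0, 82: 0}

averaged = averaged_item_scores([dof_1, dof_2])
print(scale_raw_score(averaged, ATTENTION_PROBLEMS))  # 3.5
```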
To test age and gender differences in the nor-
constructed DSM-oriented scales comprising
CBCL/6-18, TRF, and YSR items that mental
health experts judged to be very consistent with
DSM-IV (American Psychiatric Association, 1994)
diagnostic categories. To do this, they asked the
experts to rate items from all three ASEBA forms
as very consistent, somewhat consistent, or not
consistent with descriptive criteria for several
DSM-IV diagnostic categories. The raters were 22
highly experienced child psychiatrists and psy-
chologists from 16 cultures. All the raters had published
research on children's behavioral and emotional
problems. Raters were given the DSM-IV
criteria for guidance, but one-to-one matching of
DSM-IV criteria to ASEBA items was not neces-
sary to justify ratings of very consistent. Some
ASEBA items could thus be judged as very consistent
with the experts' concepts of particular
DSM-IV categories, even if the DSM-IV criteria
did not include precise counterparts of the ASEBA
items. ASEBA items that were rated as very con-
sistent with the DSM-IV categories by at least 14
of the 22 raters were grouped into six DSM-ori-
ented scales: Affective Problems, Anxiety Prob-
lems, Somatic Problems, Attention Deficit/Hyper-
activity Problems, Oppositional Defiant Problems,
and Conduct Problems (for details, see Achenbach
& Rescorla, 2001).
To create the DOF DSM-oriented Attention
Deficit/Hyperactivity Problems scale, we selected
DOF items that were comparable to the CBCL/6-
18 and TRF items that the experts rated as very
consistent with the DSM-IV diagnosis of ADHD.
We identified 12 DOF items that were similar to
CBCL/6-18 and TRF items.
As indicated earlier, to develop the 2003 DOF,
we also wrote new items to tap DSM-IV symp-
toms of ADHD that were not already covered by
other items: 8. Difficulty waiting turn in activities
or tasks; 23. Doesn't seem to listen to what is being
said; 25. Difficulty organizing activities or
tasks; 26. Fails to give close attention to details;
27. Forgetful in activities or tasks; 39. Loses things;
43. Runs about or climbs excessively; and 49.
Avoids or is reluctant to do tasks that require sus-
mative sample, we performed a 2 (ages 6-8 vs. ages
9-11) x 2 (boys vs. girls) MANOVA on raw scale
scores for the five DOF syndromes, followed by
univariate 2 x 2 ANOVAs on scores for each syn-
drome scale. We performed a similar 2 x 2
MANOVA, followed by univariate ANOVAs, on
the DOF Inattention and Hyperactivity-Impulsiv-
ity scales, and 2 x 2 univariate ANOVAs on the
Attention Deficit/Hyperactivity Problems scale,
Total Problems-Classroom, Aggressive Behavior,
Total Problems-Recess, and On-task scores. As
shown in Table 6-7, boys scored significantly
higher than girls on 6 of 10 DOF scales for class-
room observations. There were no significant gen-
der differences for recess observations. Significant
age effects were found only on the Immature/With-
drawn syndrome, on which children ages 6-8
scored significantly higher (Mean = .23, SD = .58)
Table 6-5
Items Comprising the DOF DSM-Oriented Attention Deficit/Hyperactivity Problems Scale
and Inattention and Hyperactivity-Impulsivity Subscales
Inattention Subscale
7. Doesn't concentrate or pay attention for long
16. Difficulty following directions
23. Doesn't seem to listen to what is being said
25. Difficulty organizing activities or tasks
26. Fails to give close attention to details
27. Forgetful in activities or tasks
39. Loses things
49. Avoids or is reluctant to do tasks that require sustained mental effort
56. Easily distracted by external stimuli
59. Wants to quit or does quit tasks
Hyperactivity-Impulsivity Subscale
8. Difficulty waiting turn in activities or tasks
9. Doesn't sit still, restless, or hyperactive
13. Fidgets, including with objects
21. Disturbs others
28. Out of seat
32. Interrupts
33. Impulsive or acts without thinking, including calling out in class
43. Runs about or climbs excessively
45. Responds before instructions are completed
46. Disrupts group activities
55. Demands must be met immediately, easily frustrated
65. Talks too much
72. Unusually loud
Note. Items in italics have counterparts on the CBCL/6-18 and TRF Attention Deficit/Hyperactivity
Problems scales. All but two DOF items (21 and 46) have counterparts on the TOF Attention Deficit/
Hyperactivity Problems scale. The Attention Deficit/Hyperactivity Problems scale score is the sum of
0-1-2-3 ratings on the Inattention and Hyperactivity-Impulsivity subscales.
than children ages 9-11 (Mean = .12, SD = .40), p
= .005, Eta² = .012. We constructed norms separately
for boys and girls in each setting, as described
in the next section.
ASSIGNING NORMALIZED T SCORES
TO RAW SCORES
The sums of the averaged 1, 2, and 3 ratings on
the items of the DOF problem scales provide con-
tinuous distributions of scores that indicate the
degree to which problems are reported for each
child on each scale. The DOF On-task score also
provides continuous raw scores ranging from 0 to
10 in 0.5 increments. These raw scale scores are
especially useful for statistical analyses, because
they reflect all the variation that is possible on
each scale. To help users see how an individual
Table 6-6
Characteristics of Normative Samples for the DOF
                            Boys    Girls   Total
Classroom Observations
  Age 6                     79      45      124
  Age 7                     92      44      136
  Age 8                     68      59      127
  Age 9                     73      43      116
  Age 10                    56      26      82
  Age 11                    35      41      76
  Total                     403     258     661
Recess Observations
  Age 6                     32      6       38
  Age 7                     26      18      44
  Age 8                     34      18      52
  Age 9                     32      18      50
  Age 10                    28      8       36
  Age 11                    18      6       24
  Total                     170     74      244
Ethnicityᵃ
  Native American           8.9%
  Asian                     2.1%
  Mixed or Other            0.3%
  Unknown                   1.2%
ᵃ Percentages of total N = 661 for classroom observations. (Recess observations were obtained on a
subsample of children used for classroom observations.)
child's scores on each scale compare with scores
from the normative sample, we assigned normalized
T scores to the total raw scores for each DOF
scale. The T scores are standard scores that compare
the child's standing on a scale with the distribution
of scores obtained by the normative
sample of children of the same gender for the same
setting (classroom or recess). This enables users
to see whether a child's scale scores are high or
low compared to normal peers. The T scores also
enable users to compare a child's standing on each
scale with the child's standing on the other scales.
Assigning T scores to the DOF Problem Scales
We assigned normalized T scores to the total
raw scores of each DOF problem scale according
to the percentiles found for the raw score distributions
in each normative sample. For each DOF
scale, we computed midpoint percentiles for each
total raw score according to procedures specified
by Crocker and Algina (1986, p. 439). According
to this procedure, a raw score that occupies a particular
percentile of the cumulative frequency distribution
is assumed to also occupy all the next
lower percentiles down to the percentile occupied
by the next lower raw score. To represent the range
Table 6-7
Means and Standard Deviations of DOF Raw Scale Scores for the Normative Samples
                                                Boys            Girls
DOF Scales                                  Mean    SD      Mean    SD      Eta²
Empirically Based Scales
  Sluggish Cognitive Tempo                  .91ᵃ    1.27    .68     .91     .010
  Immature/Withdrawn                        .21     .56     .13     .44     ns
  Attention Problems                        4.45ᵃ   3.10    3.90    2.66    .007
  Intrusive                                 1.07    1.54    .99     1.43    ns
  Oppositional                              .97ᵃ    1.57    .67     1.31    .009
  Total Problems-Classroom                  8.79ᵃ   5.95    7.36    5.09    .014
DSM-Oriented Scales
  Attention Deficit/Hyperactivity Problems  5.41ᵃ   3.95    4.70    3.52ᵇ   .007
  Inattention subscale                      1.76    1.91    1.53    1.70    ns
  Hyperactivity-Impulsivity subscale        3.66ᵃ   2.63    3.17    3.39    .008
On-task                                     8.64    1.57    8.86    1.57    ns
Recess Observations
  Aggressive Behavior                       .48     .75     .49     .75     ns
  Total Problems-Recess                     1.56    2.07    1.56    2.15    ns
Note. N = 661 for classroom observations; N = 244 for recess observations.
ᵃ Boys > girls, p < .05.
ᵇ Not significant when corrected for the number of comparisons (Sakoda, Cohen & Beall, 1954).
of percentiles occupied by a raw score, the raw
score is assigned to the midpoint of the percentiles
that it occupies.
For example, on the DOF Attention Problems
syndrome for classroom observations, we found
that a raw score of 6 spanned the 71st through 75th
percentiles for the normative sample of 6-11-year-old
boys. We therefore assigned the score of 6 to
the 73rd percentile, which is midway between the
71st and 75th percentiles. According to the procedure
for assigning normalized T scores to raw
scores (Abramowitz & Stegun, 1968), the 73rd percentile
score should get a T score of 56. To provide
a common metric for the five DOF syndrome
scales for classroom observations, we assigned normalized
T scores from 50 through 70 according to
the midpoint percentile procedures.
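The worked example above can be reproduced with a short sketch. It assumes that a normalized T score is 50 plus 10 times the standard normal deviate for the midpoint percentile, computed here with Python's inverse normal distribution function in place of the tabled values cited in the text. The function names are illustrative.

```python
from statistics import NormalDist


def midpoint_percentile(lowest_pct, highest_pct):
    """Midpoint of the range of percentiles occupied by one raw score."""
    return (lowest_pct + highest_pct) / 2


def normalized_t(percentile):
    """Normalized T score (mean 50, SD 10) for a percentile between 0 and 100."""
    z = NormalDist().inv_cdf(percentile / 100)
    return round(50 + 10 * z)


# Raw score of 6 on Attention Problems, boys 6-11: 71st through 75th percentiles.
mid = midpoint_percentile(71, 75)
print(mid, normalized_t(mid))  # 73.0 56
```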
We followed the same midpoint percentile pro-
cedure for assigning normalized T scores from 50
to 70 to all the DOF problem scales: the five syn-
dromes, DSM-oriented Attention Deficit/Hyperac-
tivity Problems scale, Inattention and Hyperactiv-
ity-Impulsivity subscales, and Total Problems for
classroom observations, as well as the Aggressive
Behavior syndrome and Total Problems for recess
observations. Procedures for truncating lower T
scores at 50 and for assigning T scores >70 are
described below. Procedures for assigning lower
and higher T scores to DOF Total Problems scores
for classroom observations are described in a sepa-
rate section.
Truncation of Lower T Scores at 50. The raw
scores of the DOF problem scales were all posi-
tively skewed in the normative sample, with large
proportions of children having scale scores of 0.
That is, more children in the normative sample re-
ceived very low than very high DOF problem
scores. Furthermore, because high scores are clini-
cally significant on problem scales, it is more im-
portant for the scales to make finer discriminations
among high scores than among low scores that are
at the bottom of the normal range.
If we based T scores directly on midpoint percentiles,
the lowest T score for the Attention Problems
syndrome for boys 6-11 would be 32, reflecting
the 4th midpoint percentile for boys who obtained
a score of 0. By contrast, the lowest T score
for the Oppositional syndrome would be 43, reflecting
the 22nd midpoint percentile for boys 6-11
who obtained a score of 0 on this syndrome scale.
If these T scores were displayed on a profile for a
boy whose score was 0 on both syndrome scales,
the T score of 43 might suggest that the boy had
more problems on the Oppositional syndrome
scale than on the Attention Problems syndrome
scale, where the boy's T score would be 32. This
difference in T scores would mask the fact that the
boy really had no problems on either syndrome
scale.
To avoid misleading impressions like those de-
scribed above, we truncated the assignment of T
scores, as recommended by Petersen, Kolen, and
Hoover (1993), and as done for other ASEBA forms
(Achenbach & Rescorla, 2000, 2001). To equalize
the starting points for the five syndrome scales for
classroom observations and the DSM-oriented At-
tention Deficit/Hyperactivity Problems scale and
subscales, we assigned a T score of 50 to raw scores
that fell at approximately the 50th percentile and
lower.
We also truncated T scores at 50 for the Ag-
gressive Behavior syndrome scale and Total Prob-
lems for recess observations. That is, we assigned
a T score of 50 to raw scores of 0 and then based
normalized T scores on midpoint percentiles for
Aggressive Behavior and Total Problems-Recess
up to the 98th percentile (T = 70).
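A minimal sketch of the truncation rule, using the two untruncated T scores given in the example above for a raw score of 0 (32 for Attention Problems and 43 for Oppositional). The function name is illustrative.

```python
def truncate_lower_t(t_score, floor=50):
    """Assign the floor value to any T score that would fall below it."""
    return max(t_score, floor)


# Untruncated T scores for a raw score of 0, as given in the text.
print(truncate_lower_t(32))  # 50
print(truncate_lower_t(43))  # 50
# Scores above the floor are unchanged.
print(truncate_lower_t(56))  # 56
```

After truncation, a boy with raw scores of 0 on both syndromes receives the same T score on both, which removes the misleading 32 versus 43 difference.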
Assignment of a T score of 50 to several raw
scale scores prevents users from overinterpreting
small differences among scores that are well within
the normal range. It also reduces differentiation
among low scores. However, loss of such differ-
entiation is of little practical importance, because
it involves differences that are all at the low end of
the normal range. If users nevertheless wish to pre-
serve differences at the low end of the normal
range, they can focus on the total raw scale scores.
For statistical analyses that do not involve com-
bining data across genders, raw scale scores are
usually preferable, because they directly reflect all
differences among scores without the effects of
truncation or other transformations.
Assigning T Scores Above 70 (>98th Percentile).
Most children in the normative samples obtained
scores that were well below the maximum
possible. It was therefore impossible to base the
highest T scores on percentiles, because the highest
possible scores were spread over a tiny percentage
of children in the normative sample. Because
there were hardly any children in the normative
samples on whom to base T scores above the
98th percentile (T > 70), we assigned T scores from
71 to 100 in as many increments as there were remaining
raw scores on each scale.
As an example, on the DOF Attention Problems
syndrome scale, the raw score of 11 (occupying
the 98th percentile) was assigned a T score of 70
for boys 6-11. Because there are eight items on the
scale, the maximum possible score is 24 (i.e., if a
boy received a rating of 3 on all eight items, the
boy's raw scale score would be 24). There are 30
intervals from 71 to 100, but 26 possible raw scores
from 11.5 through 24. (Because of averaging, DOF
raw scores include scores rounded to .5.) To assign
T scores to the 26 possible raw scores, we
divided 30 by 26. Because 30/26 = 1.15, we assigned
T scores to raw scores in intervals of 1.15.
Thus, a raw score of 11.5 was assigned a T score
of 70 + 1.15 = 71.15, rounded off to 71. A raw
score of 12 was assigned a T score of 71.15 + 1.15
= 72.30, rounded off to 72, and so on. The highest
possible raw score of 24 on Attention Problems
was assigned a T score of 100. By comparison, on
the Oppositional syndrome, a raw score of 5.5 (occupying
the 98th percentile) was assigned a T score
of 70 for boys 6-11. The number of items on the
Oppositional syndrome is 12. Therefore, the high-
est possible score on the Oppositional syndrome
is 36, which was assigned a T score of 100.
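The interval arithmetic in this example can be sketched as follows. The function name and parameters are illustrative, and the sketch uses the unrounded interval (30/26 rather than 1.15), which yields the same rounded T scores for the values quoted above.

```python
def upper_t_score(raw, raw_at_70, max_raw, step=0.5):
    """T score above 70 for a raw score above the one assigned T = 70.

    The 30 T-score points from 71 to 100 are divided evenly among the
    remaining raw scores, which occur in increments of `step`.
    """
    n_remaining = round((max_raw - raw_at_70) / step)  # 26 in the Attention Problems example
    k = round((raw - raw_at_70) / step)                # steps above the T = 70 raw score
    return round(70 + 30 * k / n_remaining)


# Attention Problems, boys 6-11: raw 11 is T = 70, maximum raw score is 24.
print(upper_t_score(11.5, 11, 24))  # 71
print(upper_t_score(12, 11, 24))    # 72
print(upper_t_score(24, 11, 24))    # 100
# Oppositional, boys 6-11: raw 5.5 is T = 70, maximum raw score is 36.
print(upper_t_score(36, 5.5, 36))   # 100
```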
We followed the same procedure for assigning
T scores above 70 to the DSM-oriented Attention
Deficit/Hyperactivity Problems scale, the Inatten-
tion and Hyperactivity-Impulsivity subscales, and
the Aggressive Behavior syndrome for recess ob-
servations. Our procedures for assigning T scores
to Total Problems are described below.
Assigning T scores to Total Problems
The DOF Total Problems score consists of the
sum of the 1, 2, and 3 ratings on all the specific
problem items of the DOF, plus the highest rating
(1, 2, or 3) for any problems written by the ob-
server in the spaces for the open-ended item 89.
Item 89 provides two spaces for adding problems
that are not listed elsewhere. However, only the
highest rating for added items is included in order
to limit the effects of idiosyncratic problems on
the Total Problems score. Separate Total Problems
scores are computed for classroom observations
and recess observations. There are gender-specific
norms for classroom and recess. To provide norm-
referenced scores for Total Problems, we computed
the scores obtained by each gender within each
setting. We then computed midpoint percentiles ac-
cording to the procedure described earlier for the
other DOF problem scales. We assigned T scores
to midpoint percentiles for Total Problems raw
scores, as described below.
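As a minimal sketch of the scoring rule just described, the following Python function computes a Total Problems raw score for a single DOF. The function name and argument names are hypothetical, and the sketch does not cover the averaging of item ratings across multiple DOFs that the scoring program performs.

```python
def total_problems(item_ratings, open_ended_ratings=()):
    """Return the Total Problems raw score for one DOF.

    item_ratings:       0-1-2-3 ratings for the specific problem items.
    open_ended_ratings: 0-1-2-3 ratings for problems written in the spaces
                        for item 89; only the highest of these is counted.
    """
    items = list(item_ratings)
    added = list(open_ended_ratings)
    for rating in items + added:
        if rating not in (0, 1, 2, 3):
            raise ValueError(f"rating out of range: {rating}")
    # Only the highest added rating counts, which limits the effect of
    # idiosyncratic problems on the total
    return sum(items) + max(added, default=0)


print(total_problems([0, 1, 2, 3, 0], open_ended_ratings=[1, 3]))
```

Counting only the maximum of the added ratings means that writing in a second problem under item 89 can never raise the score by more than the higher of the two ratings.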
No Truncation of Lower T Scores for Total
Problems-Classroom. There are more items on the
Total Problems scale than on any other scale, and
at least some of the items are endorsed for most
children. For classroom observations, relatively
few children in the normative samples obtained ex-
tremely low scores for Total Problems. It was there-
fore unnecessary to truncate Total Problems T
scores at 50 for classroom observations as we did
for other DOF problem scales. For Total Problems-
Classroom, the lowest raw score of 0 for boys and
for girls was assigned a T score of 33 (2nd percentile).
We then based normalized T scores directly
on midpoint percentiles for scores obtained by the
normative samples, up to the 98th percentile (T = 70).
For consistency in displaying scores on the DOF
Profile, the DOF computer-scoring program does
not print Total Problems-Classroom T scores be-
low 50. However, users can obtain these lower T
scores from the computer-scored data sets.
Truncation of Lower T Scores for Total Prob-
lems-Recess. For recess observations, 32% of boys
and 36% of girls in the normative samples obtained
raw scores of 0 for Total Problems. To take this
positive skew into account, we truncated T scores
at 50 for Total Problems-Recess, as done for other
DOF problem scales, as explained earlier.
Assigning T Scores Above 70 (>98th Percentile)
for Total Problems. No children in the normative
or referred samples obtained DOF Total
Problems scores close to the maximum scores pos-
sible. If we followed the same procedure as for the
other problem scales, we would have compressed
the Total Problems scores actually obtained into a
narrow range of T scores. We would also have as-
signed a relatively broad range of T scores to raw
scores obtained by few or no children. To enable
the upper Total Problems-Classroom and Total
Problems-Recess T scores to reflect differences
among the raw scores that are most likely to occur,
we did the following: (a) we identified the five
highest scores obtained by boys and girls in the
normative and referred samples combined, sepa-
rately for classroom and recess; (b) we computed
the mean of the five highest scores for each gender
in each setting; (c) we assigned a T score of 89 to
the mean of the five highest raw scores for each
gender in each setting; (d) we then assigned T
scores 90 through 100 in equal intervals to the raw
scores that were above those that had been assigned
T = 89. We followed these procedures for Total
Problems-Classroom T scores >70 and Total Prob-
lems-Recess T scores >70.
Assigning T Scores to DOF On-Task
DOF On-task is only scored for classroom ob-
servations. To score On-task, an observer records
whether the child is on-task or off-task in the
last 5 seconds of each 1-minute interval for each
10-minute observation period. On-task is deter-
mined by the predominant activity sampling
method (i.e., the child must be doing what is expected
for more than one half of the 5-second interval).
The on-task intervals are then
counted for each 10-minute observation period, and
the counts are averaged by the DOF computer-scoring program
across multiple observations. The averaged
On-task raw score can thus range from 0 to 10, in
increments of 0.5.
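The On-task scoring steps can be sketched as follows. The function names are hypothetical, each observation is assumed to be a list of ten True/False judgments (one per 1-minute interval), and rounding halves upward to the nearest 0.5 is an assumption, since the text states only that averaged scores fall in increments of 0.5.

```python
import math


def on_task_count(intervals):
    """Count the on-task intervals in one 10-minute observation."""
    if len(intervals) != 10:
        raise ValueError("expected ten 1-minute intervals")
    return sum(1 for on_task in intervals if on_task)


def averaged_on_task(observations):
    """Average the per-observation counts and round to the nearest 0.5."""
    counts = [on_task_count(obs) for obs in observations]
    mean = sum(counts) / len(counts)
    # Assumption: ties are rounded up to the next 0.5
    return math.floor(mean * 2 + 0.5) / 2


obs_a = [True] * 8 + [False] * 2   # on-task in 8 of 10 intervals
obs_b = [True] * 9 + [False] * 1   # on-task in 9 of 10 intervals
score = averaged_on_task([obs_a, obs_b])
print(score, f"{score * 10:.0f}%")
```

Multiplying the 0 to 10 score by 10 gives the percentage of sampled intervals in which the child was on-task.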
To provide norm-referenced scores for DOF On-
task, we obtained averaged raw scores for boys and
girls in the normative samples for classroom ob-
servations. We then computed midpoint percentiles
according to the procedures described earlier for
the DOF problem scales. The raw scores for DOF
On-task were all negatively skewed in the norma-
tive samples. That is, fewer children in the norma-
tive sample received very low than received very
high On-task scores. Furthermore, because low
scores are clinically significant for On-task, it is
more important to make finer discriminations
among low scores than among high scores. To take
account of the negatively skewed On-task scores
and the need for finer discrimination among low
than high scores, we assigned T scores to raw scores
in the following ways:
1. At the low end of the On-task scale, we
assigned a T score of 20 to On-task scores
of 0 for both boys and girls. We then
assigned T scores to raw scores of 0.5 to
9.5 based on the midpoint percentiles. The
T scores ranged from 21 to 51 (53rd
percentile) for girls and 21 to 53 (62nd
percentile) for boys.
2. We assigned a T score of 60 to the
highest possible On-task raw score of
10 for both boys and girls, which was
above the 80th percentile for both
genders.
MEAN T SCORES
Appendix A shows the mean DOF T scores and
raw scores for the normative samples of boys and
girls for classroom observations and recess obser-
vations. For all DOF problem scales, except Total
Problems-Classroom, raw scale score distributions
are positively skewed and low scores are truncated
at T = 50. Consequently, the mean T scores are
above 50 and their standard deviations are below
10 in the normative samples. Raw scores are less
skewed for DOF Total Problems-Classroom. Thus,
the mean T scores for DOF Total Problems-Class-
room are closer to 50, and their standard devia-
tions are closer to 10 in the normative samples.
In contrast to the DOF problem scales, On-task
scores are negatively skewed and high scores are
truncated at T = 60. Thus, the mean T scores for
on-task are below 50 and their standard deviations
are below 10 in the normative samples.
Users should thus keep in mind that the T scores
for most DOF problem scales and T scores for On-
task deviate from the mean of 50 and standard de-
viation of 10 expected when normal bell-shaped
distributions are transformed directly into T scores.
Users should also keep in mind that the means and
standard deviations of the DOF scales may vary
from one sample of children to another. In particu-
lar, the means and standard deviations for prob-
lem scale scores obtained by samples of children
referred for mental health services are typically
higher than for nonreferred children. Examples of
this can be seen in Appendix B, which displays
means and standard deviations for scale scores ob-
tained by matched samples of referred children and
nonreferred control children observed in the same
settings. Scores for referred children are often less
skewed than for nonreferred children, because
fewer referred children obtain very low scores.
NORMAL, BORDERLINE, AND
CLINICAL RANGES
On the computer-scored DOF Profile shown in
Chapter 3, broken lines are printed across the
graphic displays to demarcate borderline and clini-
cal ranges for DOF scale scores. T scores from 65
to 69 (93rd through 97th percentiles) are considered
to be in the borderline clinical range for the DOF
syndrome scales, DSM-oriented Attention Deficit/
Hyperactivity Problems scale and Inattention and
Hyperactivity-Impulsivity subscales for classroom
observations, and the Aggressive Behavior scale
for recess observations. The borderline range indi-
cates scores that are high enough to be of concern,
but not so high as to be clearly deviant. T scores
>69 (>97th percentile) are considered to be in the
clinical range. T scores below 65 (<93rd percentile)
are considered to be in the normal range.
T scores from 60 to 63 (84th through 90th percentiles)
are considered to be in the borderline clinical
range for DOF Total Problems-Classroom and
Total Problems-Recess. T scores >63 (>90th percentile)
are considered to be in the clinical range.
T scores below 60 (<84th percentile) are considered
to be in the normal range.
For DOF On-task, T scores from 31 to 35 (3rd to
7th percentiles) are considered to be in the borderline
clinical range. T scores <31 (<3rd percentile)
are considered to be in the clinical range. T scores
above 35 (>7th percentile) are considered to be in
the normal range. DOF On-task is scored only for
classroom observations.
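The cutpoints in this section can be summarized in a small lookup function. This is a sketch only: the function name `dof_range` and the three `scale_type` labels are hypothetical, and T scores are assumed to be whole numbers.

```python
def dof_range(scale_type, t_score):
    """Classify a T score as 'normal', 'borderline', or 'clinical'.

    scale_type: 'syndrome' for the syndrome scales, the DSM-oriented scale
                and its subscales, and Aggressive Behavior;
                'total' for Total Problems (classroom or recess);
                'on_task' for On-task.
    """
    if scale_type == "on_task":
        # On-task runs the other way: low scores are the clinical ones
        if t_score < 31:
            return "clinical"
        return "borderline" if t_score <= 35 else "normal"
    if scale_type == "syndrome":
        borderline_low, borderline_high = 65, 69
    elif scale_type == "total":
        borderline_low, borderline_high = 60, 63
    else:
        raise ValueError(f"unknown scale type: {scale_type}")
    if t_score > borderline_high:
        return "clinical"
    return "borderline" if t_score >= borderline_low else "normal"


print(dof_range("syndrome", 67), dof_range("total", 64), dof_range("on_task", 30))
```

Keeping On-task in its own branch makes the reversed direction of its cutpoints explicit rather than hiding it in a shared comparison.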
As reported in Chapter 8, children who were
referred for mental health or special education services
scored significantly higher on the DOF problem
scales and significantly lower on On-task than matched samples of
nonreferred control children in the same settings.
Because scores on the DOF problem scales are
quantitative measures of the number and degree
of problems observed for a child, the scores are
not intended to mark categorical differences be-
tween children who are sick vs. well. Instead,
the borderline and clinical ranges help users iden-
tify scores that are of enough concern to warrant
consideration for professional help. Users may
choose to apply higher or lower cutpoints for their
own clinical or research purposes. For example,
for some cases, or for certain clinical or research
purposes, scores at the high end of the normal range
(e.g., >90th percentile) on the syndrome scales or
Attention Deficit/Hyperactivity Problems scale
may also warrant concern. If you wish to classify
children's scores dichotomously as clearly in the
normal vs. clinical range on the DOF syndrome
scales, Attention Deficit/Hyperactivity Problems
scale, and Aggressive Behavior, we suggest using
T scores below 65 to designate the normal range
vs. T scores ≥65 to designate the clinical range.
For DOF Total Problems, we suggest using T scores
below 60 to designate the normal range vs. T scores
≥60 to designate the clinical range. For DOF On-task,
we suggest using T scores above 35 to designate
the normal range vs. T scores ≤35 to designate
the clinical range.
SUMMARY
We developed the DOF from observations of
children's behavior in classroom and recess settings.
The DOF covers a 10-minute observation
window. We recommend obtaining 3 to 6 DOFs
for each identified child, plus additional DOFs for
control children in the same setting. The 2009 ver-
sion of the DOF contains 88 specific problem
items, plus one open-ended item for other prob-
lems. Each problem is rated on a 0-1-2-3 scale
ranging from 0 = no occurrence to 3 = definite oc-
currence with severe intensity, high frequency, or
3 or more minutes total duration. The DOF also
includes an On-task score ranging from 0 to 10,
which can easily be converted to a percentage.
We constructed the DOF syndromes by apply-
ing exploratory and confirmatory factor-analytic
methodology similar to procedures used for other
ASEBA forms, including the CBCL/6-18, TRF,
YSR, SCICA, and TOF. The factor analyses yielded
five syndromes: Sluggish Cognitive Tempo, Im-
mature/Withdrawn, Attention Problems, Intrusive,
and Oppositional. The Attention Problems syn-
drome was similar to syndromes derived from the
CBCL/6-18, TRF, YSR, SCICA, and TOF. The
Oppositional syndrome was similar to the Oppo-
sitional syndrome on the TOF and the Self-Con-
trol Problems syndrome of the SCICA. The Slug-
gish Cognitive Tempo syndrome was similar to the
2007 Sluggish Cognitive Tempo scales scored from
the CBCL/6-18 and TRF.
In addition to the syndrome scales for classroom
observations, we constructed an Attention Deficit/
Hyperactivity Problems scale comprised of items
consistent with DSM-IV symptoms of ADHD. The
Attention Deficit/Hyperactivity Problems scale has
subscales for Inattention and Hyperactivity-Impul-
sivity. We also constructed an Aggressive Behav-
ior syndrome scale that can be scored from recess
observations. DOF Total Problems can be scored
separately for classroom and recess observations
by summing the 0-1-2-3 ratings for the 89 prob-
lem items.
The DOF scales are normed separately for boys
and girls ages 6-11. We assigned normalized T
scores to raw scores on each scale. The T scores
enable users to compare children with peers across
all scales and to compare a child's standing on each
syndrome with the same child's standing on each
of the other syndromes. To prevent over-interpre-
Reliability refers to agreement between repeated
assessments when the phenomena being assessed
are expected to remain constant. The DOF is designed
to obtain observers' ratings of children's
functioning in group settings. To assess the reli-
ability of such observations, it is important to know
the degree to which two observers obtain similar
results for the same child in the same observation
period, i.e., the degree of inter-rater reliability. We
present inter-rater reliability between pairs of ob-
servers for classroom observations of 212 children
and for recess observations of 17 children.
It is also important to know the degree to which
observers obtain similar results over periods when
children's behavior is not expected to change much,
i.e., test-retest reliability. In this chapter, we present
test-retest reliability for DOFs completed for two
separate sets of observations of 27 children over
intervals averaging 12 days.
Some users may be interested in the internal
consistency of the DOF scales. This refers to the
correlation between half of a scale's items and the
other half of the items. We report Cronbach's
(1951) alpha as a measure of internal consistency
for each DOF scale for separate samples of referred
children and control children in the same settings.
For direct observations of behavior, reliability
coefficients >.70 are generally considered good for
low-stakes screening or program evaluation, while
coefficients closer to .90 are desirable for high-
stakes eligibility or diagnostic decisions
(Chafouleas, Christ, Riley-Tillman, Briesch, &
Chanese, 2007; Hintze & Matthews, 2004). In
terms of effect sizes, Cohen (1988) considers
Pearson rs of .10 to .29 small, .30 to .49 medium,
and ≥.50 large.
INTER-RATER RELIABILITY
To assess inter-rater reliabilities for classroom
observations, pairs of trained observers used the
DOF to rate one to four 10-minute observations of
212 randomly selected children in elementary
school classrooms in Pennsylvania, New York, and
Vermont. The sample of 212 children included 112
boys and 100 girls, ages 6-11. Of these, 58 chil-
dren were rated by five pairs of observers in greater
Philadelphia, Pennsylvania; 91 children were rated
by four pairs of observers in greater Syracuse, New
York; and 63 children were rated by three pairs of
observers in greater Burlington, Vermont, for a to-
tal of 12 observer pairs. For training, each pair of
observers simultaneously rated five practice cases
to learn the DOF procedures, as described in Chap-
ter 4. Following training, the observer pairs inde-
pendently used the DOF to simultaneously rate 14
to 24 anonymously selected children. Observers
were instructed not to discuss their ratings with
each other until after all reliability data were col-
lected. The number of observation periods per child
varied across observer pairs. Nine observer pairs
completed one DOF per child per observer, while
three observer pairs completed 2 to 4 DOFs per
child per observer.
To assess inter-rater reliabilities for recess ob-
servations, one pair of trained observers used the
DOF to rate two 10-minute observations during
recess (and lunch) for 17 anonymously selected
children (14 boys and 3 girls) in a Vermont school
for children with behavioral/emotional disorders.
When multiple observations were obtained per
Chapter 7
Reliability of the DOF
child, we averaged the 0-1-2-3 ratings across DOFs
to obtain an average rating for each of the 88 items
for each observer. We then summed the average
ratings for relevant items to obtain raw scores for
each DOF problem scale. We also averaged On-
task scores across multiple DOFs per child per
observer. When only one DOF was obtained per
child per observer, we summed the 0-1-2-3 ratings
for relevant items to obtain raw scores for each
DOF problem scale and computed the On-task
score per child per observer.
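The averaging and summing steps above can be sketched as follows. The function name is hypothetical, each DOF is assumed to be a dictionary mapping item numbers to 0-1-2-3 ratings, and the item numbers in the example are placeholders rather than the actual composition of any DOF scale.

```python
def scale_raw_score(dofs, scale_items):
    """Average each item's rating across DOFs, then sum over the scale's items.

    dofs:        list of dicts (item number -> 0-3 rating), one per DOF.
    scale_items: item numbers belonging to the scale (hypothetical here).
    """
    if not dofs:
        raise ValueError("at least one DOF is required")
    total = 0.0
    for item in scale_items:
        ratings = [dof.get(item, 0) for dof in dofs]  # unrated items count as 0
        total += sum(ratings) / len(ratings)
    return total


# Two DOFs completed by one observer for one child
dof_1 = {1: 2, 2: 0, 3: 1}
dof_2 = {1: 3, 2: 1, 3: 1}
print(scale_raw_score([dof_1, dof_2], scale_items=[1, 2]))
```

With a single DOF the average of each item is just its rating, so the same function reduces to the simple sum described for observers who completed one DOF per child.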
To obtain reliabilities for classroom observa-
tions, we computed Pearson rs between raw scale
scores separately for 10 DOF scales for each of
the 12 observer pairs. Of the 120 Pearson rs for
classroom observations, 106 were significant at p
<.05. We converted each r to Fisher's z and then
obtained a mean z for each DOF scale across the
12 observer pairs. We also averaged Fisher's z
scores across the six DOF empirically based prob-
lem scales, the three DSM-oriented scales, and all
nine problem scales. We converted the mean z
scores back to r for each DOF scale. We also con-
verted mean z scores back to r to obtain the mean
r of the six empirically based scales, mean r of the
three DSM-oriented scales, and mean r of all nine
problem scales. Inter-rater reliabilities for the Ag-
gressive Behavior syndrome and Total Problems-
Recess were obtained directly for one pair of ob-
servers. Both rs were significant at p <.001.
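Averaging correlations through Fisher's z can be sketched with the standard library, since the transformation is the inverse hyperbolic tangent. The function name `mean_r` is hypothetical. The example inputs are the rounded first-column values for the six empirically based scales in Table 7-1, so the result only approximates the mean r reported there.

```python
import math


def mean_r(correlations):
    """Average Pearson rs by converting to Fisher's z and back.

    Each r must lie strictly between -1 and 1, because atanh is
    undefined at the endpoints.
    """
    z_scores = [math.atanh(r) for r in correlations]  # r -> z
    mean_z = sum(z_scores) / len(z_scores)
    return math.tanh(mean_z)                          # z -> r


# Rounded inter-rater rs for the six empirically based scales (Table 7-1)
empirically_based = [.87, .79, .72, .78, .71, .88]
print(round(mean_r(empirically_based), 2))
```

Averaging in the z metric gives somewhat more weight to high correlations than a plain arithmetic mean of the rs would.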
As can be seen in the first column of Table 7-1,
inter-rater reliabilities for the empirically based
scales ranged from .71 for the Oppositional syn-
drome to .87 for Sluggish Cognitive Tempo and
.88 for Total Problems-Classroom, with a mean r
of .80. For the DSM-oriented scales, inter-rater
reliabilities were .80 for the Attention Deficit/Hy-
peractivity Problems scale, .70 for the Inattention
subscale, and .81 for the Hyperactivity-Impulsiv-
ity subscale, with a mean r of .77. The mean inter-
rater r was .79 across all nine problem scales and
.97 for On-task. For classroom observations, the
inter-rater reliabilities for DOF Total Problems and
On-task scores were consistent with previous find-
ings on earlier versions of the DOF, as discussed
in Chapter 6. For recess observations, inter-rater
reliabilities were .73 for the Aggressive Behavior
syndrome and .97 for Total Problems.
The second column in Table 7-1 shows inter-
rater reliabilities derived from raw scale scores
obtained only on the first 10-minute observation
with the DOF. The correlations were generally simi-
lar to those shown in the first column for scores
averaged across 1 to 4 DOFs: mean r = .78 versus
.79 for all nine problem scales for classroom ob-
servations and mean r = .94 versus .97 for On-task.
For the problem scales, seven rs for scores averaged
across 1 to 4 DOFs (column 1) were within
.02 of the r values for scores obtained from 1 DOF (column 2).
To further examine inter-rater reliability
for one versus multiple observations per child, we
computed average rs by Fisher's z transformation
for the nine observer pairs who obtained only one
DOF per child versus the three observer pairs who
obtained 2 to 4 DOFs per child. For observer pairs
with only one DOF per child, the mean r was .82
across the nine problem scales for classroom ob-
servations and .94 for On-task. For observer pairs
who obtained 2 to 4 DOFs per child, the mean r
was .75 across the nine problem scales and .99 for
On-task.
The above findings indicate that inter-rater re-
liability was generally similar for observer pairs
obtaining only one 10-minute observation per child
versus multiple 10-minute observations per child.
The small differences between rs are useful to con-
sider for training purposes, since obtaining mul-
tiple DOFs per child is more time and labor inten-
sive than obtaining only one DOF per child. As
can be seen in Table 7-1, for most scales, good
inter-rater reliability can be obtained with only one
10-minute observation per child. Chapter 4 dis-
cusses procedures for training DOF observers.
As described in Chapter 6, revisions of the DOF
entailed adding and testing new items as well as
writing rules for rating various items. We analyzed
findings from various revisions to identify any significant
effects on inter-rater reliability.
Across the 12 observer pairs, we found similar mean
inter-rater reliabilities for DOF Total Problems
scores computed from the 88 items retained on
the 2009 DOF versus Total Problems scores com-
puted from the 96 items of the 1986 DOF (mean r
= .86 versus .83, respectively). We found similar
mean inter-rater reliabilities for the 43 retained
items that had scoring rules versus 45 retained
items without scoring rules (mean r = .81 versus
.76, respectively).
Table 7-1
Inter-Rater Reliabilities for DOF Scales

                                              Inter-rater r:        Inter-rater r:
                                              scores averaged       scores for first
                                              across 1 to 4 DOFs    DOF per child
DOF Scale                                     per child
Sluggish Cognitive Tempo                      .87                   .86
Immature/Withdrawn                            .79                   .73
Attention Problems                            .72                   .74
Intrusive                                     .78                   .72
Oppositional                                  .71                   .71
Total Problems-Classroom                      .88                   .86
Mean r for empirically based scales           .80                   .78
DSM-Oriented Scales
Attention Deficit/Hyperactivity Problems      .80                   .80
Inattention subscale                          .70                   .72
Hyperactivity-Impulsivity subscale            .81                   .80
Mean r for DSM-oriented scales                .77                   .78
Mean r for all problem scales                 .79                   .78
On-task                                       .97                   .94
Recess Observations
Aggressive Behavior                           .73                   .83
Total Problems-Recess                         .97                   .98

Note. N = 212 for classroom observations; N = 17 for recess observations. For classroom observations,
inter-rater rs were obtained for each of 12 pairs of observers. Mean rs were then computed for each scale
by Fisher's z transformation. For recess observations, inter-rater rs were obtained from one pair of observers.
Mean rs across sets of scales for classroom and recess observations were computed by Fisher's z
transformation.

TEST-RETEST RELIABILITY
To assess test-retest reliability, we computed
Pearson rs for DOFs completed for 27 children,
who were rated by the same observer over inter-
vals of 7 to 22 days (average interval = 12.4 days).
The test-retest sample included 19 boys and 8 girls
attending Vermont schools. Ages ranged from 6 to
12 years, with a mean age of 8.4 (S.D. = 1.9). (Only
one child was age 12.) The observer obtained four
10-minute classroom observations for each child
over two days at Time 1 and four 10-minute class-
room observations over two days at Time 2. The
0-1-2-3 item ratings were averaged across the four
Time 1 observation sessions and across the four
Time 2 observation sessions. The averaged item
ratings were then summed to obtain raw scores for
the five DOF syndrome scales, the DSM-oriented
Attention Deficit/Hyperactivity Problems scale, In-
attention and Hyperactivity-Impulsivity subscales,
and Total Problems-Classroom. Averaged raw
scores were also obtained for On-task.
We computed rs between raw scores obtained
for Time 1 versus Time 2 for each DOF problem
scale, plus On-task. Correlations were significant
at p <.05 for 8 of 10 scales. As Table 7-2 shows,
the significant test-retest rs for the empirically
based syndromes ranged from .48 for the Imma-
ture/Withdrawn syndrome to .73 for the Intrusive
syndrome. The test-retest r for Total Problems-
Classroom was .72. Test-retest rs were .76 for the
Attention Deficit/Hyperactivity Problems scale, .43 for the Inattention subscale,
and .77 for the Hyperactivity-Impulsivity subscale.
The mean test-retest rs were .53 across the empiri-
cally based scales, .73 across the DSM-oriented
scales, and .58 across all problem scales. The test-
retest r was .42 for On-task.
Test-retest reliabilities were moderate (.72 to
.77) for the Intrusive syndrome, Total Problems-
Classroom, the DSM-oriented Attention Deficit/
Hyperactivity Problems scale, and the Hyperactiv-
ity-Impulsivity subscale. Test-retest reliability was
low for the Sluggish Cognitive Tempo, Attention
Problems, and Oppositional syndromes, and the In-
attention subscale, suggesting that the problems
comprising these scales are more variable than the
problems comprising the other scales. Test-retest
reliability was also low for On-task scores. The
lower test-retest reliabilities versus higher inter-
rater reliabilities may also be due to the composi-
tion of our samples. The test-retest sample was
comprised only of clinically referred children, some
of whom were in treatment, in contrast to anony-
mously selected control children for the inter-rater
reliabilities.
Pearson r reflects similarities between the rank
orders of scores obtained at Time 1 and Time 2. It
is high when individuals' scores retain
approximately the same rank from Time 1 to Time
2. Because it is not affected by the absolute mag-
nitude of scores, r can be high even if all the Time
1 scores differ in magnitude from the Time 2 scores.
To test differences in mean scores relative to their
variance, we performed dependent t tests of differ-
ences between Time 1 versus Time 2 scores for
each of the 10 DOF scales. As shown in Table 7-2,
Time 1 scores differed significantly (p <.05) from
Time 2 scores only for the Immature/Withdrawn
syndrome, which could be a chance effect (Sakoda,
et al., 1954).
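The distinction between r and the dependent t test can be illustrated with a short sketch. The data below are invented for illustration and are not DOF scores; the function names are hypothetical. Every Time 2 score is about 2 points higher than its Time 1 counterpart, so the ordering of children is preserved while the mean shifts.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


def dependent_t(x, y):
    """t statistic for paired scores: mean difference over its standard error."""
    diffs = [b - a for a, b in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Hypothetical scores for five children (illustration only)
time_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
time_2 = [3.0, 4.5, 5.0, 6.5, 7.0]
r = pearson_r(time_1, time_2)
t = dependent_t(time_1, time_2)
print(round(r, 2), round(t, 1))
```

Here r is close to 1 even though no child has the same score on both occasions, while the large t value flags the shift in means. This is why the two statistics are reported together.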
INTERNAL CONSISTENCY
To assess internal consistency of the DOF
scales, we computed Cronbach's alpha (1951) for
each DOF scale. Alpha represents the mean of the
correlations between all possible sets of half the
items comprising a scale. The magnitude of alpha
tends to be directly related to the length of the scale,
because half the items of a short scale provide a
less stable measure than half the items of a long
scale.
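For readers who want to compute alpha on their own data, a minimal sketch follows. It uses the usual variance form of Cronbach's alpha, k/(k - 1) x (1 - sum of item variances / variance of total scores), which is a standard formula and not necessarily the exact routine used for this Manual. The function name and the example ratings are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha from per-item score lists.

    item_scores: list of lists, one inner list per item, each holding
                 that item's score for every child in the same order.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    totals = [sum(scores) for scores in zip(*item_scores)]  # scale score per child
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical ratings for four children on two items (illustration only)
items = [[0, 1, 2, 3],
         [0, 2, 1, 3]]
print(round(cronbach_alpha(items), 2))
```

The function is undefined when every child has the same total score, because the variance of the totals is then zero.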
Although internal consistency is sometimes re-
ferred to as split-half reliability, it is not reli-
ability in the sense of measuring how well a scale
will produce the same results over different occa-
sions when the target phenomena are expected to
remain constant. Furthermore, some scales with
relatively low internal consistency may be more
valid than other scales with very high internal con-
sistency. As an example, if a scale consists of 20
versions of the same item, it should produce very
high internal consistency, because respondents
should give similar answers to the 20 versions of
the item. However, such a scale would usually be
less valid than a scale that uses 20 different items
to assess the same phenomenon. Because each of
the 20 different items is likely to tap different as-
pects of the target phenomenon and to be subject
to different errors of measurement, the 20 differ-
ent items are likely to provide better measurement
despite lower internal consistency than a scale that
uses 20 versions of a single item.
As detailed in Chapter 6, the DOF syndrome
scales were derived from factor analyses of the
correlations among items. The composition of the
syndrome scales is therefore based on internal con-
sistencies among certain subsets of items. Mea-
sures of internal consistency of the syndrome scales
are thus somewhat redundant with the inter-item
correlations on which the scales were based. By
contrast, the DOF DSM-oriented scales were developed
a priori, based on experts' ratings of how
consistent items are with a DSM-IV diagnosis of
ADHD (for details, see Chapter 6). Consequently,
internal consistencies for the DSM-oriented scales
are not redundant with inter-item correlations from
factor analyses.

Table 7-2
Test-Retest Reliabilities, Means, and Standard Deviations for DOF Scales

                                              Test-        Time 1 DOF         Time 2 DOF
DOF Scale                                     Retest r     Mean       SD      Mean      SD
Sluggish Cognitive Tempo                      (.25)         1.09      .84      1.08     .86
Immature/Withdrawn                             .48           .69[b,c] .67       .42     .52
Attention Problems                            (.35)         8.30     1.68      8.00    1.66
Intrusive                                      .73          2.02     1.56      2.37    1.35
Oppositional                                   .49          2.23     1.48      2.37    1.72
Total Problems-Classroom                       .72         16.58     4.76     16.34    4.70
Mean r for empirically based scales [a]        .53
DSM-Oriented Scales
Attention Deficit/Hyperactivity Problems       .76         10.63     3.00     10.13    2.69
Inattention subscale                           .43          3.77     1.19      3.42    1.25
Hyperactivity-Impulsivity subscale             .77          6.86     2.18      6.72    1.78
Mean r for DSM-oriented scales [a]             .73
Mean r for all problem scales [a]              .58
On-task                                        .42[c]       8.53     1.12      8.65     .91

Note. N = 27. All test-retest observations were done in classrooms. Mean test-retest interval = 12.4 days.
All significant Pearson rs were p <.05. Values in parentheses were not significant.
[a] Mean r was computed by Fisher's z transformation.
[b] Time 1 DOF > Time 2 DOF, p <.05.
[c] Not significant when corrected for the number of comparisons (Sakoda, et al., 1954).

As shown in Table 7-3, we computed alphas,
derived separately from classroom observations of
332 children and from recess observations of 248
children. The samples included equal numbers of
referred children and randomly selected control
children of the same gender in the same settings.
The classroom sample included 224 boys and 108
girls ages 6-11 (mean age = 8.3, SD = 1.7). The
recess sample included 174 boys and 74 girls ages
6-12 (mean age = 8.4, SD = 1.6; only two referred
children were age 12). For classroom observations,
alphas ranged from .49 to .80 for the five empiri-
cally based syndromes, .68 to .81 for the three
DSM-oriented scales, and .87 for Total Problems.
For recess observations, alphas were .56 for Ag-
gressive Behavior and .70 for Total Problems.
SUMMARY
For classroom observations, the mean inter-rater
r was .80 across the five empirically based syn-
dromes and Total Problems and .77 across the three
DSM-oriented scales, with an overall mean r of
.79 across all DOF problem scales. The r of .97 for
the DOF On-task score and .88 for Total Problems
showed high inter-rater reliability, consistent with
prior research. For recess observations, the inter-
rater r was .73 for Aggressive Behavior and .97
for Total Problems.
To assess test-retest reliability, a trained observer
completed four DOFs at Time 1 and four DOFs at
Time 2 for 27 children observed in classrooms at
Table 7-3
Cronbach's Alpha Coefficients (Internal Consistency) for DOF Scales

DOF Scale                                     Alpha
Sluggish Cognitive Tempo                      .49
Immature/Withdrawn                            .76
Attention Problems                            .67
Intrusive                                     .80
Oppositional                                  .69
Total Problems-Classroom                      .87
Mean alpha for empirically based scales [a]   .73
DSM-Oriented Scales
Attention Deficit/Hyperactivity Problems      .81
Inattention subscale                          .68
Hyperactivity-Impulsivity subscale            .72
Mean alpha for DSM-oriented scales [a]        .74
Recess Observations
Aggressive Behavior                           .56
Total Problems-Recess                         .70

Note. Cronbach's alpha was computed for matched samples of referred children and control children in
the same settings. For classroom observations, N = 332; for recess observations, N = 248.
[a] Mean alpha computed by Fisher's z transformation.
Validity refers to the accuracy with which in-
struments assess what they are supposed to assess.
The DOF is designed to measure independent observations
of children's behavioral, emotional, and
social problems in group settings. Data obtained
from the DOF are intended to mesh with data from
other sources, particularly ASEBA instruments for
obtaining parent reports (CBCL/6-18), teacher re-
ports (TRF), self-reports (YSR), clinical interviews
(SCICA), and test session observations (TOF). Like
other ASEBA instruments for assessing behavioral
and emotional problems, the validity of the DOF
must be evaluated in relation to a variety of crite-
ria, none of which is definitive by itself. In this
chapter, we present evidence for the content valid-
ity and criterion-related validity of the DOF.
CONTENT VALIDITY OF DOF ITEMS
The most basic kind of validity is content validity,
which is the degree to which an instrument's
content includes what the instrument is intended
to measure. The DOF items were modeled on
CBCL/6-18 and TRF items that are appropriate for
direct observations. The TRF includes 97 items
paralleling those of the CBCL/6-18, plus additional
items appropriate to school settings. Nearly all the
CBCL/6-18 and TRF items discriminated signifi-
cantly (p <.01) between referred and nonreferred
children (Achenbach & Rescorla, 2001).
Beginning in the 1960s, ASEBA problem items
were developed and refined on the basis of research
and practical experience (Achenbach, 1966;
Achenbach & Lewis, 1978). The procedures for
selecting ASEBA problem items included exami-
nation of child/adolescent psychiatric case histo-
ries, extensive literature searches, and consultation
with mental health professionals and special edu-
cators. Pilot editions were tested at multiple sites
and revised on the basis of feedback from parents,
paraprofessionals, and clinicians. Details of the
rationale and procedures for selecting ASEBA
items have been presented in previous manuals for
the instruments (Achenbach, 1991a, b, c;
Achenbach & Edelbrock, 1983, 1986, 1987).
As explained in Chapter 6, the original 1981
and 1986 DOF had 96 problem items, plus an open-
ended item for describing and rating problems not
specified on the DOF. To assemble the original
DOF item sets, Achenbach selected items from
early versions of the CBCL and TRF describing
problems that might be directly observed in group
settings. Whenever possible, he retained the origi-
nal wording of the CBCL and TRF items, but re-
worded some items slightly to make them more
appropriate for direct observations. For the 2003
research edition of the DOF, we retained 95 of the
1986 DOF items and added 19 new items to corre-
spond to TOF items and to tap DSM-IV and DSM-
IV-TR symptoms for ADHD and other problems
that were not covered by the original DOF items.
Through analyses described in Chapter 6, we
reduced the item set for the 2009 version of the
DOF to 88 specific items, plus one open-ended item
for other problems. Of the 88 specific items on the
DOF, the following numbers have counterpart items on
other ASEBA forms: 51 on the CBCL/6-18, 63 on
the TRF, 49 on the YSR, 69 on the TOF, 60 on the
SCICA-Observation Form, and 35 on the SCICA
Self-Report Form. The content validity of the DOF
items is thus strongly supported by nearly four de-
cades of research, consultation, feedback, and re-
finement of comparable ASEBA items. In addition,
63% of the DOF items significantly discriminated
between clinically referred and control children,
Chapter 8
Validity of the DOF
as described in the next section.
CRITERION-RELATED VALIDITY
Criterion-related validity refers to the degree
of association between a particular measure, such
as a scale scored from the DOF, and an external
criterion for characteristics that the scale is intended
to measure. One of the main reasons for deriving
syndrome scales from ASEBA forms was the lack
of an empirically based taxonomy of child psycho-
pathology (Achenbach & McConaughy, 1997;
Achenbach & Rescorla, 2001). The ASEBA syn-
drome scales provide empirically based groupings
of items that describe children's behavioral and
emotional problems, as reported by key informants.
In a similar fashion, the DOF items describe be-
havioral and emotional problems that can be ob-
served in group settings, such as classrooms and
school playgrounds. The DOF syndromes were
derived to provide empirically based scales for
scoring groups of these problem items that tend to
co-occur. The DOF DSM-oriented Attention Defi-
cit/Hyperactivity Problems scale was developed for
scoring problems consistent with the DSM-IV
and DSM-IV-TR diagnosis of ADHD.
An important way to test criterion-related va-
lidity of the DOF items and scales is to measure
their ability to discriminate between children who,
independently of their DOF scores, have been
judged to be at risk for emotional or behavioral
problems and have been referred for mental health
or special education evaluations and/or services,
and children who have not been referred.
We recognize that clinical referral is not an infal-
lible criterion of need for help. Some children in
our referred samples may not have needed profes-
sional help, while others in our control sample
may have needed help. However, actual referral
status is as ecologically valid as any other practi-
cal alternative for testing criterion-related valid-
ity.
Matched Referred and Control Samples
To test the criterion-related validity of DOF
items and scale scores, we used matched samples
of clinically referred children and randomly se-
lected control children in the same settings. From
the samples shown earlier in Table 6-1, we selected
6- to 12-year-olds who had been referred for evalu-
ation of behavioral and emotional problems and/
or learning difficulties and had participated in our
research studies using the DOF (for brevity, we call
this group "referred"). For each referred child, in-
dependent observers selected one or two control
children of the same gender and in the same class-
room as the referred child. For classroom observa-
tions, the matched samples included 166 referred
children ages 6-11 and 263 control children. (For
classroom observations, 97 referred children had
two matched controls). For recess observations, the
matched samples included 124 referred children
ages 6-12 and 248 control children. (For recess ob-
servations, all referred children had two matched
controls; only two referred children were age 12.)
Referred children with full scale IQ scores <75
were excluded from both samples.
Table 8-1 shows the characteristics of the
matched samples of boys and girls for classroom
observations (N = 430) and recess observations (N
= 372). Ethnicity of the sample for classroom ob-
servations was 85.2% non-Latino White and 14.8%
other ethnicities. Ethnicity of the sample for re-
cess observations was 100% non-Latino White.
MANCOVA of DOF Item Ratings
To test associations of referral status and de-
mographic variables with DOF item ratings for
classroom observations, we used a multivariate
analysis of covariance (MANCOVA) to analyze
DOF item ratings obtained by the matched samples
shown in Table 8-1. For classroom observations,
the MANCOVA design was 2 (referred vs. con-
trols) x 2 (boys vs. girls), with ethnicity (non-Latino
White vs. Other) as a covariate. For recess obser-
vations, we used a 2 (referred vs. controls) x 2 (boys
vs. girls) MANOVA, with no covariate because
ethnicity for that sample was 100% non-Latino
White. For each of the 88 DOF items, we aver-
aged 0-1-2-3 ratings across 2 to 4 DOFs separately
for each referred child and each control child. To
create equal sample sizes for referred and control
children, we computed the mean of the averaged
item ratings when there were two control children.
For classroom observations, the dependent vari-
ables for the MANCOVA were mean item ratings
for 166 referred children and 166 averaged ratings
for controls. For recess observations, dependent
variables for the MANOVA were mean item rat-
ings for 124 referred children and 124 averaged
ratings for controls. (For recess observations, ob-
servers used the 1986 version of the DOF, from
which 72 items were retained on the 2009 DOF.)
Because we found no significant differences on
DOF Total Problems for younger (ages 6 to 8) ver-
sus older (ages 9 to 12) children, we did not in-
clude age in the MANCOVA or MANOVA designs.
Socioeconomic status (SES) was also not included
as a covariate because SES was not available for
the control sample.
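The averaging steps described above can be sketched in Python as follows. This is only an illustrative sketch, not the scoring code used for these analyses: the function names and the sample ratings are hypothetical, and only three items are shown instead of the 88 specific DOF items.

```python
from statistics import mean

def average_item_ratings(dofs):
    """Average 0-1-2-3 ratings item by item across one child's DOFs."""
    n_items = len(dofs[0])
    if any(len(dof) != n_items for dof in dofs):
        raise ValueError("all DOFs must rate the same items")
    return [mean(dof[i] for dof in dofs) for i in range(n_items)]

def average_controls(control_averages):
    """Mean of the per-child averaged ratings for one or two matched controls."""
    n_items = len(control_averages[0])
    return [mean(child[i] for child in control_averages) for i in range(n_items)]

# Hypothetical ratings on three items (each inner list is one DOF).
referred = average_item_ratings([[2, 0, 1], [1, 0, 3]])
control_a = average_item_ratings([[0, 0, 1], [1, 0, 1]])
control_b = average_item_ratings([[0, 0, 0], [0, 0, 1], [0, 0, 2]])

# One averaged control record per referred child, so both groups have equal N.
controls = average_controls([control_a, control_b])
print(referred, controls)
```

Averaging the two controls into one record is what yields 166 referred and 166 averaged control records for the classroom analyses.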
The overall MANCOVA for classroom obser-
vations showed significant effects of referral sta-
tus, gender, and ethnicity (p < .01), but no signifi-
cant referral status x gender interaction. The over-
all MANOVA for recess observations showed sig-
nificant effects of referral status and gender (p <
.01), but no significant interaction. The first three
columns of Table 8-2 display significant effect sizes
(ES) of referral status, gender, and ethnicity for each
of the 88 specific problem items on the DOF, as
obtained from subsequent ANCOVAs of classroom
observations. The last two columns of Table 8-2
display significant ES of referral status and gender
obtained from subsequent ANOVAs of recess ob-
servations. The ES is represented by the percent of
variance (partial Eta²) uniquely accounted for by
each independent variable that was significant at p
<.05. According to Cohen's (1988) criteria for ES
in ANCOVA/ANOVA, effects accounting for 1-
5.8% of variance are small; effects accounting for
5.9-13.7% of variance are medium; and effects ac-
counting for >13.8% of variance are large. The ES
Table 8-1
Characteristics of Matched Samples of Referred and Control Children
                                   Boys    Girls    Total
Control children                    179       85      264
Total                               291      139      430
Recess Observations
Control children                    174       74      248
Total                               261      111      372
Ethnicity for Classroom Sample(a)
Mixed or Other                     4.0%
(a) Percentages for N = 425 for classroom observations; ethnicity for recess observations was 100% non-Latino White.
Table 8-2
Percent of Variance Accounted for by Significant (p <.05) Effects of Referral Status and
Demographic Variables on DOF Item Scores in ANCOVAs
DOF Item | Classroom Observations: Ref Status(a), Gender(b), Ethnicity(c) | Recess Observations: Ref Status(a), Gender(b)
1. Acts too young for age  5  1 (W, d)  2 (d)
2. Makes odd noises  2  3 (B, d)
3. Argues  1 (d)  2 (O, d)  2 (d)  2 (B, d)
4. Cheats  2 (O)
5. Defiant or talks back to staff  4 (O)
6. Brags, boasts  3
7. Doesn't concentrate or doesn't pay attention for long  6
8. Difficulty waiting turn in activities or tasks  2
9. Doesn't sit still, restless, or hyperactive  4  2 (B, d)  2 (W)
10. Clings to adults or too dependent  4
11. Confused or seems to be in a fog  1
12. Cries  1 (O)
13. Fidgets, including with objects  7  3 (G, d)
14. Cruel, bullies, or mean to others  2 (G, d)  3
15. Daydreams or gets lost in thoughts  1 (d)
16. Difficulty following directions  3  2 (O, d)
17. Tries to get attention of staff  5 (O)  2 (d)
18. Destroys own things
19. Destroys property belonging to others  2 (O, d)
20. Disobedient  3 (O)  3
21. Disturbs other children  5 (O)  2
22. Doesn't seem to feel guilty after misbehaving  3 (O)  3
23. Doesn't seem to listen to what is being said  6  3 (O)
24. Eats, drinks, chews, or mouths things that are not food, excluding junk foods (describe):  1 (d)  2 (G, d)
25. Difficulty organizing activities or tasks  5
26. Fails to give close attention to details  2
27. Forgetful in activities or tasks  2
28. Out of seat  2
29. Gets hurt, accident prone  5  2 (G, d)
30. Gets in physical fights  4
31. Gets teased  3
32. Interrupts  2 (O)
33. Impulsive or acts without thinking, including calling out in class  5
34. Physically isolates self from others  1 (d)  2 (d)  2 (B, d)
35. Lies
36. Bites fingernails
37. Nervous, highstrung, or tense  2
38. Nervous movements, twitching, tics or other unusual movements (describe):  2
39. Loses things
40. Too fearful or anxious  1 (d)
41. Physically attacks people  4  3 (B, d)
42. Picks or scratches nose, skin, or other parts of body (describe):  4 (B, d)  1 (O, d)
43. Runs about or climbs excessively
44. Apathetic, unmotivated, or won't try  2  2 (O, d)
45. Responds before instructions are completed
46. Disrupts group activities  2
47. Screams  4 (G, d)
48. Secretive, keeps things to self, including refusal to show things to teacher
49. Avoids or is reluctant to do tasks that require sustained mental effort  1 (d)
50. Self-conscious or easily embarrassed  2 (G, d)
51. Slow to respond verbally  2
52. Shows off, clowns, or acts silly  2  2 (d)
53. Shy or timid  1 (d)  4 (G, d)
54. Explosive or unpredictable behavior  2 (d)
55. Demands must be met immediately, easily frustrated
56. Easily distracted by external stimuli  5 (O)
57. Stares blankly  3  3 (O)  2
58. Speech problem (describe):
59. Wants to quit or does quit tasks  1
60. Yawns
61. Strange behavior (describe):  2
63. Sulks
65. Talks too much  3
66. Teases  2 (O)
68. Threatens people  2 (O)  2 (d)
70. Underactive, slow moving, tired, or lacks energy  1
71. Unhappy, sad, or depressed  2
72. Unusually loud  4
74. Whining tone of voice  3 (G, d)
75. Withdrawn, doesn't get involved with others  2  8
76. Sucks thumb, fingers, hand, or arm
77. Fails to express self clearly  2
78. Impatient
79. Tattles  3
80. Repeats behavior over & over; compulsions (describe):
81. Easily led by peers  3 (O)
82. Clumsy, poor motor control  3
83. Doesn't get along with peers  3
86. Bossy  2 (G, d)  2 (d)
87. Complains  1 (d)  3 (O)  3  3 (G, d)
88. Afraid to make mistakes  2
Note. For classroom observations: N = 166 referred children ages 6-11 and 264 matched controls in the
same classrooms. Analyses were referral status x gender MANCOVA and ANCOVAs with ethnicity
(Non-Latino White vs. Other) as a covariate. For recess observations: N = 124 referred children ages 6-12
and 248 matched controls in the same setting. Analyses were referral status x gender MANOVA and
ANOVAs with no covariate. The percent of variance uniquely accounted for by each independent variable
is represented by partial Eta². Scores were item raw scores averaged across 2 to 4 DOFs per child.
(a) All significant effects of referral status reflected higher scores for referred than control children.
(b) B = boys scored higher; G = girls scored higher.
(c) W = Non-Latino White scored higher; O = Other scored higher.
(d) Not significant when corrected for number of analyses.
in Table 8-2 are values for partial Eta² rounded to
the nearest whole number. The superscript d in the
table indicates effects that could be regarded as
significant by chance when corrected for the num-
ber of analyses for each independent variable, us-
ing a p <.05 protection level (Sakoda et al., 1954).
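The effect size bands quoted for ANCOVA/ANOVA can be expressed as a small helper. The sketch below is illustrative only: the function names are hypothetical, the formula for partial Eta² is the standard one based on sums of squares rather than anything specific to this Manual, and treating exactly 13.8% as large is an assumption, because the quoted bands leave that boundary open.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Standard partial Eta squared: effect variance over effect plus error variance."""
    return ss_effect / (ss_effect + ss_error)

def ancova_effect_size_label(percent_variance):
    """Label a percent of variance using the bands quoted for ANCOVA/ANOVA."""
    if percent_variance < 1:
        return "below small"  # assumption: the quoted bands start at 1%
    if percent_variance < 5.9:
        return "small"
    if percent_variance < 13.8:
        return "medium"
    return "large"  # assumption: 13.8% itself is counted as large

# Hypothetical sums of squares giving 7% of variance.
percent = 100 * partial_eta_squared(7.0, 93.0)
print(round(percent), ancova_effect_size_label(percent))
```

With whole-number values such as those in Table 8-2, 1 to 5 fall in the small band and 6 to 13 in the medium band.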
Referral Status Effects. For classroom obser-
vations, referred children scored significantly
higher (p <.05) than control children on 38 of the
88 DOF items. Of these, eight effects could have
occurred by chance, which are marked by the su-
perscript d in Table 8-2. Three DOF items showed
medium ES for referral status: 7. Doesn't concentrate
or doesn't pay attention for long (6%); 13. Fidgets
(7%); and 23. Doesn't seem to listen to what is being
said (6%). The remaining 35 significant ES were small,
accounting for 1-5% of variance. For recess observations,
referred children scored significantly higher (p <.05)
than control children on 24 of 67 DOF items. (Five items
were scored 0 for 100% of cases and 16 items had missing
values because they were not included on the 1986 DOF
used for recess observations.) One item, 75. Withdrawn,
doesn't get involved with others, showed a medium ES for
referral status. All other ES for recess observations
were small, accounting for 1-5% of variance. Seventeen
items showed significant ES for recess observations but
not classroom observations. Thus, 55 of 88 (63%) DOF
items showed significant effects of referral status in
classroom observations, recess observations, or both.
Demographic Effects. The demographic variables of gender
and ethnicity showed several small ESs (p <.05) on DOF
item ratings as follows: For classroom observations,
there were eight small ES for gender, accounting for
2-4% of variance, all of which could be due to chance.
Boys were rated higher on three DOF items and girls were
rated higher on five items. There were 22 small ES for
ethnicity, accounting for 1-5% of variance. Of these,
eight could be chance effects. Children with "Other"
ethnicity (which included African American,
Latino/Hispanic, and Mixed or other ethnicity) were rated
higher than non-Latino White children on 20 DOF items,
while non-Latino White children were rated higher on two
items.
For recess observations, there were eight small ES for
gender, accounting for 1-4% of variance, all of which
could be due to chance. Boys were rated higher on three
DOF items and girls were rated higher on five items. (As
indicated earlier, ethnicity was not included in analyses
of recess observations because the entire sample was
non-Latino White.)
Multiple Regression Analyses of DOF Scale Scores
To test associations of referral status and demographic
characteristics with DOF scale scores for classroom
observations, we performed multiple regressions on raw
scores for each scale (the dependent variable) with the
independent variables of referral status, gender, and
ethnicity (non-Latino White versus Other). For multiple
regressions of recess observations, the independent
variables were referral status and gender. To obtain raw
scores for each problem scale, we first averaged item
ratings across 2 to 4 DOFs separately for each referred
child and each control child. We then computed the mean
of the averaged item ratings when there were two control
children, as done for the MANCOVA and MANOVA of item
ratings. Raw scores for the DOF scales were the sums of
the averaged ratings for items comprising each scale.
For classroom observations, the raw score for Total
Problems was the sum of the averaged item ratings for the
88 specific problem items, plus the open-ended item for
additional problems. For recess observations, the raw
score for Total Problems was the sum of the averaged item
ratings for the 72 specific problem items retained from
the 1986 DOF, plus the open-ended item for additional
problems. For On-task, we averaged the 0 to 10 scores
across 2 to 4 DOFs separately for each referred child and
each control child, and then computed the mean of the
averaged On-task scores when there were two control
children.
Table 8-3 displays ESs for associations of referral
status and ethnicity with DOF scale scores. The ES is the
squared standardized regression coefficient, which
reflects the percent of variance in scale scores that was
uniquely accounted for by each independent variable.
According to Cohen's (1988) criteria for ES in multiple
regressions, effects accounting for 2-12% of variance are
small; effects accounting for 13-25% of variance are
medium; and effects accounting for >26% of variance are
large. The superscript c in the table indicates effects
that could be regarded as significant by chance when
corrected for the number of analyses for each independent
variable, using a p <.05 protection level (Sakoda et al., 1954).
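As a rough illustration of how the scale raw scores and the regression effect size bands in this section fit together, consider the Python sketch below. The function names, the item numbers assigned to the example scale, and the averaged ratings are all hypothetical; the actual item composition of each DOF scale is not reproduced here. Treating 26% as large is an assumption that follows the way this chapter labels the 26% effect for Total Problems-Recess.

```python
def scale_raw_score(averaged_ratings, scale_items):
    """Sum the averaged item ratings for the items that make up one scale."""
    return sum(averaged_ratings[item] for item in scale_items)

def regression_effect_size_label(percent_variance):
    """Label a percent of variance using the bands quoted for multiple regressions."""
    if percent_variance < 2:
        return "below small"  # assumption: the quoted bands start at 2%
    if percent_variance < 13:
        return "small"
    if percent_variance < 26:
        return "medium"
    return "large"  # assumption: 26% itself is counted as large

# Hypothetical averaged ratings keyed by item number, and a made-up scale.
averaged = {3: 0.0, 7: 1.5, 13: 2.0, 23: 0.5}
example_scale_items = [7, 13, 23]
print(scale_raw_score(averaged, example_scale_items))

# Percentages of the kind reported in Table 8-3.
for pct in (8, 13, 26):
    print(pct, regression_effect_size_label(pct))
```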
Referral Status Effects. Referral status effects
outweighed demographic effects on all DOF prob-
lem scales. Referred children scored significantly
(p <.05) higher than control children on all DOF
problem scales, accounting for 4 to 26% of vari-
ance. Control children scored significantly (p <.05)
higher than referred children on On-task (8% of
variance). Of the 12 significant ES for referral sta-
tus, two could have occurred by chance. DOF To-
tal Problems-Recess showed a large ES, account-
ing for 26% of variance. DOF Total Problems-Class-
room showed a medium ES, accounting for 13%
of variance. All other ES were small according to
Cohen's (1988) criteria. After the two DOF Total
Problems scales, the next highest ES were for the
Table 8-3
Percent of Variance Accounted for by Significant (p <.05) Effects of Referral Status and
Ethnicity on DOF Scale Scores in Multiple Regressions
DOF Scale                                     Referral Status(a)    Ethnicity(b)
Sluggish Cognitive Tempo                            6 (c)             1 (O, c)
Immature/Withdrawn                                  6
Attention Problems                                  8
Intrusive                                           4 (c)             3 (O)
Oppositional                                        7                 3 (O)
Total Problems-Classroom                           13                 2 (O)
DSM-Oriented Scales
Attention Deficit/Hyperactivity Problems           10                 1 (O, c)
Inattention subscale                                8                 2 (O)
Hyperactivity-Impulsivity subscale                  9
On-task                                             8                 6 (W)
Recess Observations
Aggressive Behavior                                10                 N/A
Total Problems-Recess                              26                 N/A
Note. For classroom observations: N = 166 referred children ages 6-11 and 264 matched controls in the
same classrooms. For recess observations: N = 124 referred children ages 6-12 and 248 matched controls
in the same setting. For classroom observations, analyses were multiple regressions of raw scale scores
on referral status, gender, and ethnicity. For recess observations, analyses were multiple regressions of
raw scale scores on referral status and gender. Percent of variance is represented by the squared standardized
regression weight for each independent variable. There were no significant gender effects on any DOF
scale.
(a) Referred children scored significantly (p <.05) higher than control children on all problem scales; control
children scored significantly (p <.05) higher than referred children on On-task.
(b) O = Other scored higher than non-Latino White; W = non-Latino White scored higher than Other.
(c) Not significant when corrected for the number of analyses.
Mean Scale Scores for Referred and
Control Children
Table 8-4 displays the mean raw scores and stan-
dard deviations obtained by referred and control
children on each DOF scale, derived from
MANCOVA and ANOVA. MANCOVAs were
modeled on the regression analyses, treating refer-
ral status and gender as between subject measures
and ethnicity as a covariate. These included a 2
(referred vs. control) x 2 (boys vs. girls)
MANCOVA on raw scale scores for the five DOF
syndromes, followed by univariate 2 x 2 ANOVAs
on scores for each syndrome scale. We performed
a similar 2 x 2 MANCOVA, followed by univariate
ANOVAs, on the Inattention and Hyperactivity-Im-
pulsivity subscales, and 2 x 2 univariate ANOVAs
on the Attention Deficit/Hyperactivity Problems
scale, Total Problems-Classroom, and On-task
scores. ANOVAs for the Aggressive Behavior syn-
drome and Total Problems-Recess were also mod-
eled on the multiple regressions, treating referral
status and gender as between subject measures with
no covariate. The results mirrored those of the mul-
tiple regressions. Referred children scored signifi-
cantly (p <.05) higher than control children on all
DOF problem scales, while control children scored
significantly (p <.05) higher than referred children
on On-task. There were no significant gender ef-
fects.
Discriminant Analyses of DOF Scale
Scores
We used discriminant analyses to determine
which weighted combinations of DOF scale scores
best differentiated referred from control children
for the matched samples shown in Table 8-1. When
there were two matched control children for a re-
ferred child, we averaged item ratings across the
two controls, as done in previous analyses. For
classroom observations, we performed one dis-
criminant analysis using the five DOF syndromes
as candidate predictors and another discriminant
analysis using the DSM-oriented Inattention and
Hyperactivity-Impulsivity subscales as candidate
predictors. Predictors were entered simultaneously
within each set of discriminant analyses.
Discriminant analyses selectively weight pre-
dictors to maximize their collective associations
with the criterion groups being analyzed. The
weighting process makes use of characteristics of
the sample that may differ from other samples. To
avoid overestimating the accuracy of the classifi-
cation obtained by discriminant analyses, it is nec-
essary to correct for shrinkage in associations that
would occur when discriminant weights derived
in one sample are applied to a new sample. To cor-
rect for shrinkage, we employed a jackknife
(cross-validation) procedure whereby discriminant
functions are computed with a different child's data
excluded (held out) from the sample each time. Each
discriminant function is then cross-validated by
testing the accuracy of its predictions for the child
who was held out when the discriminant function
was computed. Finally, the percentage of correct
predictions is averaged across all the held-out chil-
dren.
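The hold-one-out logic of the jackknife procedure can be sketched in Python as follows. This is a deliberately simplified illustration, not the analysis reported here: it uses a single predictor and a midpoint-of-means cutoff in place of a weighted discriminant function, it assumes the referred group has the higher mean, and the function names and scores are hypothetical.

```python
from statistics import mean

def classify(score, referred_scores, control_scores):
    """Predict 'referred' when a score exceeds the midpoint of the two group means."""
    cutoff = (mean(referred_scores) + mean(control_scores)) / 2
    return "referred" if score > cutoff else "control"

def jackknife_rates(referred_scores, control_scores):
    """Hold out each child in turn, refit the cutoff, and test the held-out child."""
    hits_referred = 0
    for i, score in enumerate(referred_scores):
        rest = referred_scores[:i] + referred_scores[i + 1:]
        hits_referred += classify(score, rest, control_scores) == "referred"
    hits_control = 0
    for i, score in enumerate(control_scores):
        rest = control_scores[:i] + control_scores[i + 1:]
        hits_control += classify(score, referred_scores, rest) == "control"
    n_total = len(referred_scores) + len(control_scores)
    sensitivity = hits_referred / len(referred_scores)
    specificity = hits_control / len(control_scores)
    overall = (hits_referred + hits_control) / n_total
    return sensitivity, specificity, overall

# Hypothetical Total Problems raw scores for five referred and five control children.
referred_scores = [14.0, 9.0, 20.0, 6.0, 12.0]
control_scores = [8.0, 5.0, 10.0, 3.0, 7.0]
print(jackknife_rates(referred_scores, control_scores))
```

Because each child is classified by a cutoff computed without that child's own score, the resulting percentages are less optimistic than rates obtained by fitting and testing on the same sample.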
In addition to discriminant analyses of sets of
DOF scales, we obtained cross-validated percent-
ages of cases correctly classified by the DOF Total
Problems-Classroom, Attention Deficit/Hyperac-
tivity Problems scale, Aggressive Behavior, and To-
tal Problems-Recess as single predictors.
For each set of predictors, Table 8-5 shows the
cross-validated percentages of children correctly
classified as referred (sensitivity) versus controls
(specificity). The weighted combination of the five
syndrome scales correctly classified 56% of re-
ferred children and 74% of control children, with
an overall correct classification rate of 65% and
overall misclassification rate of 35%. An additional
forward stepwise discriminant analysis indicated
that all but the Intrusive syndrome were signifi-
cant (p <.05) predictors in the discriminant func-
tion. The Sluggish Cognitive Tempo syndrome was
the strongest predictor (standardized canonical co-
efficient = .478), with the other three syndromes
contributing about equally to the discriminant func-
tion (standardized canonical coefficients = .340
to .389). The DOF Total Problems-Classroom score
alone showed similar classification rates: 54% of
referred children and 75% of control children cor-
rectly classified, with an overall correct classifi-
cation rate of 65% and overall misclassification
rate of 35%.
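The relation among sensitivity, specificity, and the overall rates can be checked with a few lines of Python. The counts below are not taken from the Manual; they are hypothetical values back-calculated from the rounded percentages for the five syndrome scales, assuming 166 referred children and 166 averaged control records.

```python
def classification_rates(correct_referred, n_referred, correct_control, n_control):
    """Return sensitivity, specificity, overall correct rate, and misclassification rate."""
    sensitivity = correct_referred / n_referred
    specificity = correct_control / n_control
    overall = (correct_referred + correct_control) / (n_referred + n_control)
    return sensitivity, specificity, overall, 1 - overall

# Hypothetical counts consistent with 56% sensitivity and 74% specificity.
sens, spec, overall, misclassified = classification_rates(93, 166, 123, 166)
print([round(100 * rate) for rate in (sens, spec, overall, misclassified)])
```

When the two groups are the same size, the overall correct classification rate is simply the mean of sensitivity and specificity, which is why 56% and 74% correspond to 65% overall.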
The weighted combination of the DSM-oriented
subscales correctly classified 53% of referred chil-
dren and 73% of control children, with an overall
correct classification rate of 63% and overall
misclassification rate of 37%. An additional for-
ward stepwise discriminant analysis indicated that
both the Inattention and Hyperactivity-Impulsiv-
ity subscales were significant (p <.05) predic-
tors, with Hyperactivity-Impulsivity contributing
slightly more (standardized canonical coefficient
= .610) than Inattention (standardized canonical
coefficient = .508). The Attention Deficit/Hyper-
activity Problems scale alone showed the same
overall correct classification rate of 63% and
misclassification rate of 37%.
Table 8-4
Means and Standard Deviations of DOF Raw Scale Scores for Referred and Control Children
                                              Referred(a)      Averaged Controls(b)
DOF Scale                                     Mean     SD       Mean     SD
Sluggish Cognitive Tempo                       1.2    1.3        0.7    0.8
Immature/Withdrawn                             0.8    1.8        0.2    0.4
Attention Problems                             5.6    3.0        4.1    2.6
Intrusive                                      1.9    3.0        1.0    1.4
Oppositional                                   1.9    1.9        0.9    1.8
Total Problems-Classroom                      13.6    8.8        8.0    5.4
DSM-Oriented Scales
Attention Deficit/Hyperactivity Problems       8.2    5.7        5.0    3.6
Inattention subscale                           2.8    2.6        1.5    1.8
Hyperactivity-Impulsivity subscale             5.4    3.7        3.5    2.4
On-task                                        8.0    1.7        8.9    1.6
Recess Observations
Aggressive Behavior                            1.4    1.6        0.5    0.6
Total Problems-Recess                          4.2    3.2        1.3    1.4
Note. For classroom observations: N = 166 referred children ages 6-11 and averaged ratings for 264
matched controls in the same classrooms. For recess observations: N = 124 referred children ages 6-12
and averaged ratings for 248 matched controls in the same setting. Problem scale scores were the sums
of averaged item ratings.
(a) Referred children scored significantly (p <.05) higher than control children on all problem scales.
(b) Control children scored significantly (p <.05) higher than referred children on On-task.
The Aggressive Behavior syndrome, based on
recess observations, correctly classified 51% of re-
ferred children and 80% of control children, with
an overall correct classification rate of 65% and
overall misclassification rate of 35%. Total Prob-
lems-Recess alone produced the best classification
rates, correctly classifying 64% of referred chil-
dren and 90% of control children, with an overall
correct classification rate of 77% and overall
misclassification rate of 23%.
SUMMARY
This chapter presented several kinds of evidence
for the validity of the 2009 DOF items and scale
scores. Content validity of the DOF items is based
on their derivation from similar items of the CBCL/
6-18 and TRF, most of which significantly dis-
criminated referred from nonreferred children
(Achenbach & Rescorla, 2001).
Criterion-related validity was supported by the
ability of the DOF items and scale scores to dis-
criminate between matched samples of referred and
nonreferred control children. Referred children
scored significantly higher on 55 of the 88 DOF
items for observations in classrooms and/or recess,
with referral status accounting for 1 to 7% of vari-
ance. Demographic variables of gender and
ethnicity showed small effects on item scores. Re-
ferred children scored significantly higher than
nonreferred children on all DOF problem scales,
accounting for 4 to 26% of variance. DOF Total
Problems accounted for 13% of variance in class-
room observations and 26% of variance in recess
observations. Control children scored significantly
higher on DOF On-task, accounting for 8% of vari-
ance.
A weighted combination of the five DOF syn-
dromes correctly classified 56% of referred chil-
dren (sensitivity) and 74% of nonreferred children
(specificity). A weighted combination of the DSM-
oriented Inattention and Hyperactivity-Impulsiv-
ity subscales showed only slightly lower sensitiv-
Table 8-5
Cross-Validated Percents of Cases Correctly Classified as Referred vs. Control
                                                            Averaged    Overall Correct
Candidate Predictors                            Referred    Controls    Classification
Five syndrome scales                              56%         74%           65%
Total Problems-Classroom                          54%         75%           65%
DSM-oriented Inattention & Hyperactivity-
  Impulsivity subscales                           53%         73%           63%
Attention Deficit/Hyperactivity Problems          54%         72%           63%
Recess Observations
Aggressive Behavior                               51%         80%           65%
Total Problems-Recess                             64%         90%           77%
Note. For classroom observations: N = 166 referred children ages 6-11 and averaged ratings for 264
matched controls in the same classrooms. For recess observations: N = 124 referred children ages 6-12
and averaged ratings for 248 matched controls in the same setting. Scale scores were the sums of
averaged item ratings for referred and control children.
Chapter 9
Answers to Frequently Asked Questions
This chapter answers questions that may arise
about the DOF. The questions are grouped under
headings pertaining to the DOF form and profile,
applications of the DOF, relations to other assess-
ment procedures, coordinating data from multiple
sources, and relations to DSM and special educa-
tion classifications. If you have a question that is
not answered under one heading, look under the
other headings. The Table of Contents and Index
may also help you find answers to questions not
listed in this chapter.
FEATURES OF THE DOF
1. What is the DOF?
Answer: The Direct Observation Form (DOF)
is a standardized form for rating observations of
6- to 11-year-old children in school classrooms, at
recess, and in other group settings. During a 10-
minute period, the observer uses the DOF to write
a narrative description of the child's behavior, af-
fect, and interactions. The observer also rates the
child for being on-task or off-task for 5 seconds at
the end of each 1-minute interval. At the end of
the 10-minute observation, the observer rates the
child on 89 problem items using a 0-1-2-3 scale.
Chapter 2 of this Manual provides detailed instruc-
tions for using the DOF and rating the DOF items.
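The on-task ratings described in this answer can be tallied with a short Python sketch. It assumes that the 0 to 10 On-task score for one observation is the number of 1-minute intervals rated on-task, which is consistent with the 0 to 10 range mentioned in Chapter 8; the function name and the sample ratings are hypothetical.

```python
def on_task_score(interval_ratings):
    """Count the intervals rated on-task in one 10-minute observation (0 to 10)."""
    if len(interval_ratings) != 10:
        raise ValueError("expected one on-task/off-task rating per 1-minute interval")
    return sum(1 for on_task in interval_ratings if on_task)

# Hypothetical observation: on-task in 8 of the 10 intervals.
ratings = [True, True, False, True, True, True, True, False, True, True]
print(on_task_score(ratings))
```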
2. How many 10-minute observations are
necessary to score the DOF?
Answer: The DOF Module for computer scor-
ing requires at least two DOFs (i.e., two 10-minute
observations) and allows up to six DOFs per child
to score one DOF Profile. Because children's be-
havior can vary from one occasion to another, we
recommend 3 to 6 observations of the identified
child on at least two different days. Observers may
also include 1 to 6 DOFs for each of two control
children matched to the identified child. Observa-
tions of control children are recommended but
optional.
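The limits stated in this answer can be written as a simple pre-scoring check. The Python sketch below is hypothetical and is not part of the DOF Module; it only encodes the counts given above (2 to 6 DOFs for the identified child, and 1 to 6 DOFs for each of up to two control children).

```python
def check_dof_counts(identified_dofs, control_dofs=()):
    """Return a list of problems with the number of DOFs; an empty list means OK.

    identified_dofs: number of DOFs completed for the identified child.
    control_dofs: one DOF count per control child (controls are optional).
    """
    problems = []
    if not 2 <= identified_dofs <= 6:
        problems.append("identified child needs 2 to 6 DOFs")
    if len(control_dofs) > 2:
        problems.append("at most two control children per identified child")
    for n in control_dofs:
        if not 1 <= n <= 6:
            problems.append("each control child needs 1 to 6 DOFs")
    return problems

print(check_dof_counts(3, (2, 2)))
print(check_dof_counts(1))
```

The recommendation of 3 to 6 observations on at least two different days is a matter of practice rather than a scoring requirement, so it is not enforced in this sketch.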
3. Why are control children included on
the DOF?
Answer: Observations of control children pro-
vide a standard for evaluating the behavior of the
identified child in relation to peers in the same situ-
ation. Observers do not need to know the names of
control children. Observers should select a control
child of the same gender who is situated far enough
away so as not to influence the behavior of the iden-
tified child, if possible. The DOF Module for com-
puter scoring allows up to two control children per
identified child to score one DOF Profile.
4. What if it is not possible to match the
gender of a control child to the gender
of the identified child?
Answer: Although rare, this may occur in some
settings or special programs where one gender
vastly outnumbers the other. We recommend
matching the gender of control children to the gen-
der of the identified child because the DOF Profile
has separate norms for boys and girls. However, if
the gender of a control child is different from the
gender of the identified child, the DOF Module
for computer scoring weights the item scores of
that control child in order to derive scale scores
that approximate scores for the correct gender. If
the identified child is a boy and the control child is
a girl, then item scores of the control child are ad-
justed upward. If the identified child is a girl and
the control child is a boy, item scores of the con-
trol child are adjusted downward. These adjust-
ments are done only for DOF Profiles for class-
room observations, because there were no signifi-
cant gender differences for recess observations, as
reported in Chapter 8.
5. What is the DOF Profile?
Answer: The DOF Profile is a computer-scored
display of item and scale scores from classroom
observations and/or recess observations. The user
must select one setting (class or recess) for com-
puter-scoring. The DOF can only be scored by com-
puter because of the complexity of averaging item
scores across multiple observation sessions. As
discussed in Chapter 3, the DOF Profile for class-
room observations displays averaged item scores
plus raw scores, T scores, and percentiles for five
empirically based syndrome scales, a DSM-oriented
Attention Deficit/Hyperactivity Problems scale with
Inattention and Hyperactivity-Impulsivity subscales,
Total Problems, and an On-task score.
The DOF Profile for recess observations displays
averaged item scores plus raw scores, T scores and
percentiles for an empirically based Aggressive
Behavior syndrome scale and Total Problems. Pro-
files for both settings also display averaged item
scores for Other Problems not scored on the syn-
drome scales. The DOF Profile has separate norms
for boys and girls ages 6 to 11.
6. What are the DOF syndrome scales?
Answer: As detailed in Chapter 6, the DOF syn-
drome scales were derived by factor analyzing av-
eraged scores for the DOF items to identify pat-
terns of co-occurring problems. Each of the five
syndrome scales consists of a set of problem items
that were found to co-occur. The 0-1-2-3 ratings
on each problem item are averaged across multiple
10-minute observations. A child's total score on a
syndrome scale is the sum of the averaged ratings
on the items comprising that scale. For total scores
on each syndrome scale, the DOF Profile indi-
cates standard scores (T scores) and percentiles
based on a normative sample of boys and girls ages
6-11. Classroom observations are scored on the
following five syndrome scales: Sluggish Cogni-
tive Tempo, Immature/Withdrawn, Attention Prob-
lems, Intrusive, and Oppositional. For recess ob-
servations, the one syndrome is designated as Ag-
gressive Behavior.
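The scoring arithmetic described in this answer can be sketched in a few lines of Python. This is only an illustration, not the code used by the DOF Module. The function names and the example item set are hypothetical (the actual item assignments for each scale are given in Chapter 6), and the sketch assumes that each DOF is stored as a dictionary of item number to 0-1-2-3 rating, with unrated items counted as 0.

```python
from statistics import mean

def averaged_item_scores(dofs):
    """Average each item's 0-1-2-3 rating across several 10-minute DOFs.

    `dofs` is a list of dicts mapping item number -> rating; an item missing
    from a dict is treated as 0 (an assumption of this sketch).
    """
    items = sorted({item for dof in dofs for item in dof})
    return {item: mean(dof.get(item, 0) for dof in dofs) for item in items}

def scale_raw_score(averaged, scale_items):
    """Sum the averaged ratings of the items that make up one scale."""
    return sum(averaged.get(item, 0) for item in scale_items)

# Hypothetical item set, for illustration only (not a published DOF scale).
EXAMPLE_SCALE_ITEMS = [7, 9, 13]

observations = [
    {7: 2, 9: 1, 13: 0},   # first 10-minute observation
    {7: 1, 9: 1, 13: 3},   # second observation
    {7: 3, 9: 1},          # third observation; item 13 not rated, counted as 0
]
averaged = averaged_item_scores(observations)
print(scale_raw_score(averaged, EXAMPLE_SCALE_ITEMS))  # 2 + 1 + 1 = 4
```

Averaging first and summing second keeps the raw scale score on the same metric regardless of how many observations were completed for the child.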
7. What are the Other Problems?
Answer: The Other Problems are problem
items that were not associated strongly enough to qualify
for any of the syndromes derived from factor analy-
ses. Therefore, they are not included in the syn-
drome scales. However, each of the Other Prob-
lems items may be important in its own right.
There is a different set of Other Problems for
DOF Profiles based on classroom observations
versus recess observations. The relevant Other
Problems item set is included along with items
from the syndrome scales when computing scores
for Total Problems-Classroom and Total Problems-
Recess.
8. Why doesn't the DOF Profile display
percentiles or T scores for the Other
Problems?
Answer: The Other Problems do not consti-
tute a separate scale. They are merely the items
that did not qualify for the syndrome scales. There
are thus no specific associations among them to
warrant treating them as a separate scale. How-
ever, each of these problems may be important, and
they are all included in computing Total Problems
scores.
9. How is the open-ended item 89
figured into the scale scores?
Answer: If the observer enters any problems in
item 89, the highest rating that the observer gave
to any of these problems (i.e., 1, 2, or 3) is added
to the Total Problems score.
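As a minimal sketch of this rule for a single DOF, the function below returns the amount that item 89 contributes to Total Problems. The function name is hypothetical, and the sketch assumes that the ratings written in under item 89 are available as a simple list; how the DOF Module stores these entries is not described here.

```python
def item_89_contribution(ratings):
    """Return the amount item 89 adds to Total Problems on one DOF.

    `ratings` holds the 1, 2, or 3 rating given to each problem written in
    under item 89; an empty list means nothing was entered.
    """
    for r in ratings:
        if r not in (1, 2, 3):
            raise ValueError(f"item 89 ratings must be 1, 2, or 3, got {r!r}")
    # Only the highest rating counts, however many problems were entered.
    return max(ratings, default=0)

print(item_89_contribution([1, 3, 2]))
print(item_89_contribution([]))
```

Because only the highest rating is used, entering several additional problems under item 89 cannot add more than 3 points on any one DOF.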
10. What are the DSM-oriented Attention
Deficit/Hyperactivity Problems scale
and its Inattention and Hyperactivity-
Impulsivity subscales?
Answer: The DSM-oriented Attention Deficit/
Hyperactivity Problems scale and its Inattention
and Hyperactivity-Impulsivity subscales consist of
items that are consistent with the DSM-IV and
DSM-IV-TR diagnostic categories of Attention
Deficit/Hyperactivity Disorder (ADHD). The In-
attention subscale has 10 items and the Hyperac-
tivity-Impulsivity subscale has 13 items. The At-
tention Deficit/Hyperactivity Problems total score
is the sum of ratings on all 23 items. Twelve of the
23 items were similar to CBCL/6-18 and/or TRF
items that an international panel of experts identi-
fied as being very consistent with the DSM-IV
symptoms of ADHD, as explained in Chapter 6.
Eleven additional items were added to the DOF to
tap ADHD symptoms that were not covered by the
other items.
11. Should raw scores, percentiles, or T
scores be used to report results for
DOF scales?
Answer: Percentiles and T scores are usually
preferable to raw scale scores for reporting find-
ings for individual children, because they indicate
degrees of deviance on each scale in comparison
with the normative sample for the child's gender.
However, for statistical analyses of scale scores,
raw scale scores should be used because the T
scores on all scales, except Total Problems-Classroom,
are truncated at 50, as explained in Chapter
6. If boys and girls are combined in the same sta-
tistical analyses, it may be useful to assign stan-
dard scores separately for each gender so that scores
for each gender have the same mean and the same
standard deviation. On the other hand, if the sta-
tistical analyses are intended to test gender differ-
ences in raw scale scores, then the scores should
not be standardized by gender.
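One way to assign standard scores separately for each gender is sketched below. This is an illustration under stated assumptions, not a procedure prescribed by this Manual: it uses z scores (mean 0, standard deviation 1) as the standard score, computes the sample standard deviation, and takes a hypothetical list of (gender, raw score) pairs as input.

```python
from statistics import mean, stdev

def standardize_within_gender(records):
    """Return z scores computed separately for each gender.

    `records` is a list of (gender, raw_score) tuples. Each gender needs at
    least two scores that are not all identical, or the standard deviation
    is undefined or zero.
    """
    by_gender = {}
    for gender, raw in records:
        by_gender.setdefault(gender, []).append(raw)
    stats = {g: (mean(v), stdev(v)) for g, v in by_gender.items()}
    return [(g, (raw - stats[g][0]) / stats[g][1]) for g, raw in records]

# Made-up raw scale scores, for illustration only.
sample = [("boy", 4.0), ("boy", 6.0), ("boy", 8.0),
          ("girl", 1.0), ("girl", 2.0), ("girl", 3.0)]
for gender, z in standardize_within_gender(sample):
    print(gender, round(z, 2))
```

After this transformation both genders have the same mean and standard deviation, so gender differences in raw scores no longer contribute to the analysis. For that reason the step should be skipped when gender differences are themselves the question.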
12. How should high scores on the DOF
problem scales be interpreted?
Answer: The DOF Profile shows the letter B
next to T scores that fall in the borderline clinical
range and the letter C next to T scores that fall in
the clinical range for each of the problem scales.
Scores in the borderline range warrant concern but
are not as clearly deviant as those in the clinical
range. For the syndrome scales and DSM-oriented
scales, T scores >69 (>97th percentile) are considered
to be in the clinical range, while T scores of
65 to 69 (93rd to 97th percentiles) are considered
to be in the borderline range. For Total Problems,
T scores >63 (>90th percentile) are in the clinical
range, while T scores of 60 to 63 (84th to 90th
percentiles) are in the borderline range. For certain
purposes, such as screening to identify children
who are at risk for problems, users may choose
to use lower cutpoints on the problem scales than
those that demarcate the borderline or clinical
range.
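The cutpoints given in this answer can be expressed as a short function. The sketch below is illustrative only; the function name and the `total_problems` parameter are hypothetical, and the returned letters simply mirror the B and C flags that the DOF Profile prints next to T scores.

```python
def range_flag(t_score, total_problems=False):
    """Return 'C' (clinical), 'B' (borderline), or '' for a problem-scale T score.

    Cutpoints follow the text: syndrome and DSM-oriented scales use 65-69 for
    borderline and >69 for clinical; Total Problems uses 60-63 and >63.
    """
    borderline_min, clinical_above = (60, 63) if total_problems else (65, 69)
    if t_score > clinical_above:
        return "C"
    if t_score >= borderline_min:
        return "B"
    return ""

for t in (58, 62, 66, 71):
    print(t, repr(range_flag(t)), repr(range_flag(t, total_problems=True)))
```

A user who prefers lower cutpoints for screening could change the two threshold values without altering the rest of the logic.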
13. Should extremely low scores on the
DOF problem scales be considered
deviant?
Answer: No. Extremely low scores on the prob-
lem scales merely reflect an absence of problems
observed for a particular time frame and setting.
Because children may manifest problems that are
concentrated in particular areas, it is not unusual
for profiles to have high scores on some scales but
low scores on other scales. Low scores on the prob-
lem scales do not necessarily mean that problems
are absent in other contexts, such as the home or
other settings in school.
14. How should DOF On-task scores be
interpreted?
Answer: As explained in Chapter 2, observers
rate on-task behavior by marking boxes for on-task
or off-task that represent the last 5 seconds of each
1-minute interval over each 10-minute observation
period. Total On-task scores can thus range from 0
to 10 for each observation. The DOF Module for
computer scoring averages On-task scores across
multiple observation sessions. Low On-task scores
warrant clinical concern, in contrast to high scores
for the problem scales. The DOF Profile displays
mean raw scores, T scores, and percentiles for On-task.
T scores <31 (<3rd percentile) are considered
to be in the clinical range, while T scores of 31 to
35 (3rd to 7th percentiles) are in the borderline range.
The On-task mean raw score can also be translated
into a percentage of on-task behavior, as done in
the DOF Narrative Report.
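The On-task arithmetic described in this answer can be sketched as follows. The function names and the list-of-booleans representation are assumptions made for illustration; they are not taken from the DOF Module. The T score itself comes from the normative tables and is not computed here.

```python
from statistics import mean

def on_task_score(interval_marks):
    """Count on-task marks for one 10-minute observation (range 0-10).

    `interval_marks` holds ten booleans, one per 1-minute interval, True if
    the child was on task during the last 5 seconds of that interval.
    """
    if len(interval_marks) != 10:
        raise ValueError("expected one mark for each of 10 intervals")
    return sum(bool(m) for m in interval_marks)

def mean_on_task(sessions):
    """Average the 0-10 On-task scores across observation sessions."""
    return mean(on_task_score(s) for s in sessions)

def on_task_percentage(mean_raw):
    """Express a mean raw On-task score (0-10) as a percentage."""
    return mean_raw * 10.0

def classify_on_task_t_score(t_score):
    """Return 'clinical', 'borderline', or None; LOW scores are the concern."""
    if t_score < 31:
        return "clinical"
    if t_score <= 35:
        return "borderline"
    return None

sessions = [
    [True] * 8 + [False] * 2,   # on task in 8 of 10 intervals
    [True] * 6 + [False] * 4,   # on task in 6 of 10 intervals
]
mean_raw = mean_on_task(sessions)
print(mean_raw, on_task_percentage(mean_raw))
```

Note that the comparison runs in the opposite direction from the problem scales: here it is low T scores that fall in the borderline and clinical ranges.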
15. How are clinical interpretations of
the DOF Profile made?
Answer: The DOF is designed to provide a standardized
description of a child's behavioral and
emotional problems observed in school classrooms,
at recess, or in other comparable group settings.
The T scores and percentiles for the problem scales
and On-task provide a basis for comparing an in-
dividual child to normative samples of peers of the
same gender. The scale scores on the DOF Profile
can also be compared with analogous scale scores
on CBCL/6-18, TRF, YSR, SCICA, and TOF pro-
files to identify similarities and differences between
problems reported by different informants in dif-
ferent situations. Information from all available
sources should then be integrated to form a comprehensive
picture of the child's functioning, as illustrated
in case examples in Chapter 5.
APPLICATIONS OF THE DOF
1. Who should complete the DOF?
Answer: The DOF can be completed by anyone
who has sufficient understanding of the required
observation and rating procedures, as described
in Chapter 2. Observers can be teachers'
aides or other school paraprofessionals, under-
graduate or graduate students, and research assis-
tants, as well as professionals in education, school
psychology, clinical psychology, and related disci-
plines. Paraprofessionals and students should use
the DOF under the supervision of a qualified pro-
fessional who has knowledge of the theory and
methodology of standardized assessment. Chapter
4 provides guidelines for training DOF observers
and conducting school observations.
2. When should the DOF be completed?
Answer: The DOF should be completed imme-
diately after the 10-minute observation for which
it was used. Observers should complete a separate
DOF for each 10-minute observation.
3. Can the DOF be used below age 6 or
above age 11?
Answer: The 2009 version of the DOF was
normed for ages 6 to 11. The DOF may be appro-
priate for younger children in group settings, such
as Kindergarten or preschool, and children older
than age 11. However, the farther the departure
from the 6-11-year-old norms, the less appropriate
the percentiles and T scores may be for interpreting
a child's DOF Profile. Researchers who plan
to analyze only DOF raw scores (not T scores) may
choose to use the DOF for observations of children
outside of the 6-11 age range.
4. Can the DOF be used to assess child-
ren who have physical or mental dis-
abilities?
Answer: The DOF provides a standardized de-
scription of observed behavior. If a child has a
physical or mental disability, then observed behav-
ior must be interpreted with this in mind. How-
ever, children with physical and mental disabili-
ties were excluded from the DOF normative
sample. T scores and percentiles on the DOF Pro-
file therefore provide comparisons only to peers
without disabilities.
RELATIONS TO OTHER ASSESSMENT
PROCEDURES
1. Can other procedures for assessing
behavioral and emotional problems be
used with the DOF?
Answer: The DOF obtains samples of behav-
ioral and emotional problems observed during
multiple 10-minute observations of children in
group settings. As explained in Chapter 2, users
can include up to six DOFs in one observation
set for each identified child and up to six DOFs
for each of two control children matched to the
identified child. Scores from one observation set
can then be compared with scores from another
observation set for the same identified child. For
example, you might choose to compare DOF scores
for observations completed at the beginning of the
school year (e.g., observation set = Fall 2009) and
a second set of observations completed at the end
of the school year (e.g., observation set = Spring
2010). Or you might compare sets of observations
done before and after an intervention. By compar-
ing DOF scores obtained from observation sets for
different time periods, users can distinguish be-
tween problems that are quite consistent across
time versus those that are more variable. Users may
also choose to compare observation sets for differ-
ent situations, such as math class versus reading
class. In addition, the DOF scores can be compared
with the scores on analogous scales of other
ASEBA forms, which include counterparts of many
DOF items and scales. Assessment data obtained
from interviews of children, parents, and teachers,
medical exams, cognitive and achievement tests,
and behavioral and family assessment can also be
compared with DOF data to provide a comprehen-
sive basis for assessment, as discussed in Chap-
ters 1 and 5.
2. How do DOF scales compare with
scales scored from other ASEBA
forms?
Answer: Although the scales of other ASEBA
forms were derived independently from item pools,
samples of participants, and raters that differ from
those of the DOF, the following DOF scales have
counterparts on profiles scored from most ASEBA
forms for children and youth: Sluggish Cognitive
Tempo (similar to 2007 CBCL/6-18 and TRF Slug-
gish Cognitive Tempo), Immature/Withdrawn
(similar to CBCL/6-18 and TRF Withdrawn/De-
pressed), Attention Problems (similar to CBCL/6-
18 and TRF Attention Problems), Oppositional
(similar to TOF Oppositional and CBCL/6-18 and
TRF Aggressive Behavior), and the DSM-oriented
Attention Deficit/Hyperactivity Problems scale and
subscales (similar to the CBCL/6-18, TRF, and TOF
Attention Deficit/Hyperactivity Problems scale and
Inattention and Hyperactivity-Impulsivity subscales).
DOF Total Problems scores
can also be compared to Total Problems on other
ASEBA forms.
3. What if there are differences between
a child's pattern of problems on the
DOF Profile versus the child's
patterns of problems on other ASEBA
profiles?
Answer: Discrepancies between findings from
different assessment procedures can be as informative
as similarities. For example, if the DOF was
used to obtain observations of a child in a particular
school classroom, then DOF scale scores could
be compared with analogous scale scores on the TRF
completed by the child's teacher in the same classroom
and on the CBCL/6-18 completed by one or both
parents, to see if the child's observed behavior
was different from behavior reported by the teacher
or the parents. If a child has more than one teacher,
observations with the DOF could be done in both
teachers' classrooms and then scored on separate
DOF Profiles. If the DOF scores from the two class-
room settings differed on certain scales, then fur-
ther observations and interviews with the teachers
would be appropriate to determine why the child
may behave differently in the two classrooms.
Comparisons of DOF scores with test session ob-
servations scored on the TOF and interviewer rat-
ings scored on the SCICA may also help to docu-
ment discrepancies and consistencies between
problems observed in group settings, such as school
classrooms, versus elsewhere.
RELATIONS TO DSM AND SPECIAL
EDUCATION CLASSIFICATIONS
1. How can the DOF contribute to an
ADHD diagnosis and other DSM
diagnoses?
Answer: Because the DSM criteria for behav-
ioral and emotional disorders are not defined in
terms of specific assessment procedures, scores on
the DOF items and scales may be combined with
other kinds of data in judging whether the criteria
for DSM diagnoses are met. The 23 items of the
DOF DSM-oriented Attention Deficit/Hyperactiv-
ity Problems scale have fairly clear counterparts
among the symptom criteria for ADHD as defined
by DSM-IV (American Psychiatric Association,
1994) and DSM-IV-TR (American Psychiatric
Association, 2000). High scores on the DOF At-
tention Problems syndrome may also suggest that
DSM criteria for ADHD should be considered.
High scores on the DOF Oppositional syndrome
may suggest that DSM criteria for Oppositional
Defiant Disorder should be considered. At the same
time, to formulate DSM diagnoses, DOF results
must be combined with other assessment data, in-
cluding parent reports, teacher reports, and test
results as appropriate. The CBCL/6-18, TRF, and
TOF also have an Attention Problems syndrome
and a DSM-oriented Attention Deficit/Hyperactiv-
ity Problems scale, as well as Inattention and Hy-
peractivity-Impulsivity subscales, that can contrib-
ute useful information for making DSM diagnoses
of ADHD.
2. How can the DOF be used in
determining eligibility for special
education according to disability
categories, such as those defined by
the 2004 Individuals with Disabilities
Education Improvement Act (IDEA
2004)?
Answer: Categories of educational disabilities
are not defined in terms of specific tests and other
assessment procedures. However, IDEA 2004 does
require direct observations of children in school
as part of a comprehensive assessment for deter-
mining eligibility for special education services.
The DOF provides a structured format for conduct-
ing observations and produces a standardized pro-
file that documents the results of the observations.
The DOF, along with other ASEBA forms, can thus
provide important quantitative data for judging
whether children have the kinds of problems for
which particular special education services are in-
tended, as discussed in Chapter 5.
Abramowitz, M., & Stegun, I.A. (1968). Handbook
of mathematical functions. Washington, DC: National
Bureau of Standards.
Achenbach, T. M. (1966). The classification of
children's psychiatric symptoms: A factor-analytic
study. Psychological Monographs, 80, (No. 615).
Achenbach, T. M. (1981). The Direct Observation
Form of the Child Behavior Checklist (rev. ed.)
Burlington, VT: University of Vermont, Department
of Psychiatry.
Achenbach, T. M. (1986). The Direct Observation
Form of the Child Behavior Checklist (rev. ed.)
Burlington, VT: University of Vermont, Department
of Psychiatry.
Achenbach, T.M. (1991a). Manual for the Child Behavior
Checklist/4-18 and 1991 Profile. Burlington,
VT: University of Vermont, Department of Psychia-
try.
Achenbach, T.M. (1991b). Manual for the Teacher's
Report Form and 1991 Profile. Burlington, VT: University
of Vermont, Department of Psychiatry.
Achenbach, T.M. (1991c). Manual for the Youth Self-
Report and 1991 Profile. Burlington, VT: University
of Vermont, Department of Psychiatry.
Achenbach, T. M., & Edelbrock, C. (1983). Manual
for the Child Behavior Checklist/4-18 and Revised
Child Behavior Profile. Burlington, VT: University
of Vermont, Department of Psychiatry.
Achenbach, T. M., & Edelbrock, C. (1986). Manual
for the Teacher's Report Form and Teacher Version
of the Child Behavior Profile. Burlington, VT: Uni-
versity of Vermont, Department of Psychiatry.
Achenbach, T. M., & Edelbrock, C. (1987). Manual
for the Youth Self-Report and Profile. Burlington, VT:
University of Vermont, Department of Psychiatry.
Achenbach, T. M., & Lewis, M. (1971). A proposed
model for clinical research and its application to en-
copresis and enuresis. Journal of the American Acad-
emy of Child Psychiatry, 10, 535-554.
Achenbach, T. M., & McConaughy, S. H. (1997).
Empirically based assessment of child and adoles-
cent psychopathology: Practical applications. Thou-
sand Oaks, CA: Sage.
Achenbach, T. M., & Rescorla, L. A. (2000). Manual
for the ASEBA Preschool Forms & Profiles.
Burlington, VT: University of Vermont, Department
of Psychiatry.
Achenbach, T. M., & Rescorla, L. A. (2001). Manual
for the ASEBA School-Age Forms & Profiles.
Burlington, VT: University of Vermont, Research
Center for Children, Youth, & Families.
American Educational Research Association, American
Psychological Association, & National Council
on Measurement in Education. (1999). Standards
for educational and psychological testing. Washington,
D.C.: American Educational Research Association.
American Psychiatric Association. (1994). Diagnos-
tic and statistical manual of mental disorders-fourth
edition. Washington, DC: Author.
American Psychiatric Association. (2000). Diagnos-
tic and statistical manual of mental disorders-fourth
edition-text revision. Washington, DC: Author.
Barkley, R. A. (2006). Attention deficit hyperactivity
disorder: A handbook for diagnosis and treatment
(3rd ed.). New York: Guilford Press.
References
Browne, M. W., & Cudeck, R. (1993). Alternative
ways of assessing model fit. In K. A. Bollen, & J. S.
Long (Eds.), Testing structural equation models (pp.
136-162). Newbury Park, CA: Sage.
Cohen, J. (1988). Statistical power analysis for the
behavioral sciences (2nd ed.). New York: Academic
Press.
Chafouleas, S. M., Christ, T. J., Riley-Tillman, T. C.,
Briesch, A. M., & Chanese, J. A. M. (2007).
Generalizability and dependability of direct behav-
ior ratings to assess social behavior of preschoolers.
School Psychology Review, 36, 63-79.
Crocker, L., & Algina, J. (1986). Introduction to clas-
sical and modern test theory. New York: Holt,
Rinehart, Winston.
Cronbach, L.J. (1951). Coefficient alpha and the in-
ternal structure of tests. Psychometrika, 16, 297-334.
DuPaul, G.J., & Stoner, G. (2003). ADHD in the
schools (2nd ed.). New York: Guilford Press.
Gove, P. (Ed.). (1971). Webster's third new international
dictionary of the English language. Springfield,
MA: Merriam.
Guze, S. (1978). Validating criteria for psychiatric di-
agnosis: The Washington University approach. In
M.S. Akiskal & W.L. Webb (Eds.), Psychiatric diag-
nosis: Exploration of biological predictors (pp. 49-
59). New York: Spectrum.
Hintze, J. M. (2005). Psychometrics of direct obser-
vation. School Psychology Review, 34, 507-519.
Hintze, J. M., & Matthews, W. J. (2004). The
generalizability of systematic direct observations
across time and settings: A preliminary investigation
of the psychometrics of behavioral observation.
Frick, P. J., Lahey, B. B., Applegate, B., Kerdyck, L.,
Ollendick, T., Hynd, G. W., et al. (1994). DSM-IV
field trials for the disruptive behavior disorders:
Symptom utility estimates. Journal of the American
Academy of Child and Adolescent Psychiatry, 33, 529-
539.
Individuals with Disabilities Education Improvement
Act of 2004. Public law. No. 108-446, 118 Stat. 2647
(2004). [Amending 20 U.S.C. 1400 et seq.]
Joint Committee on Testing Practices. (2004). Code
of fair testing practices in education. Washington,
D.C.: American Psychological Association.
Kaufman, A.S., & Kaufman, N.L. (1983). Kaufman
Assessment Battery for Children. Circle Pines, MN:
American Guidance Service.
Leff, S. S., & Lakin, R. (2005). Playground-based
observational systems: A review and implications for
practitioners and researchers. School Psychology
Review, 34, 475-489.
Loehlin, J. C. (1998). Latent variable models: An
introduction to factor, path, and structural analysis
(3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
McConaughy, S.H. (2005). Clinical interviews for
children and adolescents: Assessment to intervention.
New York: Guilford Press.
McConaughy, S.H., & Achenbach, T.M. (2003). Di-
rect Observation Form (Research Ed.). Burlington,
VT: University of Vermont, Research Center for Chil-
dren, Youth, & Families.
McConaughy, S. H., & Achenbach, T. M. (2001).
Manual for the Semistructured Clinical Interview for
Children and Adolescents-Second Edition.
Burlington, VT: University of Vermont, Research
Center for Children, Youth, & Families.
McConaughy, S.H., & Achenbach, T.M. (2004).
Manual for the Test Observation Form for Ages 2-
18. Burlington, VT: University of Vermont, Research
Center for Children, Youth, & Families.
McConaughy, S. H., Achenbach, T. M. & Gent, C. L.
(1988). Multiaxial empirically-based assessment:
Parent, teacher, observational, cognitive, and person-
ality correlates of Child Behavior Profiles for 6-11
year-old boys. Journal of Abnormal Child Psychol-
ogy, 16, 485-509.
McConaughy, S. H., Kay, P. J., & Fitzgerald, M.
(1999). The Achieving, Behaving, Caring Project for
preventing ED: Two-year outcomes. Journal of Emo-
tional and Behavioral Disorders, 7, 224-239.
McConaughy, S. H., Kay, P. J., & Fitzgerald, M.
(1998). Preventing SED through parent-teacher action
research and social skills instruction: First-year out-
comes. Journal of Emotional and Behavioral Disor-
ders, 6, 81-93.
McConaughy, S.H., Mattison, R.E., & Peterson, R.L.
(1994). Behavioral/emotional problems of children
with serious emotional disturbance and learning dis-
abilities. School Psychology Review, 23, 81-98.
McConaughy, S.H., & Ritter, D. (2008). Best prac-
tices in multimethod assessment of emotional and
behavioral disorders. In A. Thomas & J. Grimes
(Eds.), Best practices in school psychology-V, Vol-
ume 3, (pp. 697-716), Bethesda, MD: National As-
sociation of School Psychologists.
McConaughy, S.H., & Skiba, R. (1993). Comorbidity
of externalizing and internalizing problems. School
Psychology Review, 22, 419-434.
Muthén, L.K., & Muthén, B.O. (2001). Mplus User's
Guide (Version 2). Los Angeles, CA: Muthén &
Muthén.
Naglieri, J. A., & Das, J. P. (1997). Cognitive Assess-
ment System. Itasca, IL: Riverside Publishing.
Petersen, N. S., Kolen, M. J., & Hoover, H. D. (1993).
Scaling, norming, and equating. In R.L. Linn (Ed.),
Educational measurement (3rd ed., pp. 221-262).
Washington, D.C.: American Council on Education.
Reed, M. L., & Edelbrock, C. (1983). Reliability and
validity of the Direct Observation Form of the Child
Behavior Checklist. Journal of Abnormal Child Psy-
chology, 11, 521-530.
Rehabilitation Act of 1973, Section 504. (1973). 29
U.S.C. 706, 1996; 504 [ 30 C.F.R Part 104].
Roid, G. H. (2003). Stanford-Binet Intelligence Scales,
Fifth Edition. Itasca, IL: Riverside Publishing.
Sakoda, J. M., Cohen, B. H., & Beall, G. (1954). Test
of significance for a series of statistical tests. Psy-
chological Bulletin, 51, 172-175.
Sattler, J. M. (2008). Assessment of children: Cognitive
applications (5th ed.). La Mesa, CA: Author.
Sattler, J. M., & Hoge, R. D. (2006). Assessment of
children: Behavioral, social, and clinical foundations
(5th ed.). La Mesa, CA: Jerome Sattler, Publisher, Inc.
Shapiro, E. S., & Heick, P. (2004). School psycholo-
gist assessment practices in the evaluation of students
referred for social/behavioral/emotional problems.
Psychology in the Schools, 41, 551-561.
Shapiro, E. S., & Kratochwill, T. R. (Eds.) (2000).
Introduction: Conducting a multidimensional behav-
ioral assessment. In E.S. Shapiro & T.R. Kratochwill
(Eds.), Conducting school-based assessments of child
and adolescent behavior (pp. 1-20). New York:
Guilford.
Skansgaard, E. P., & Burns, G. L. (1998). Compari-
son of DSM-IV ADHD combined and predominantly
inattention types: Correspondence between teacher
ratings and direct observations of inattentive, hyper-
activity/impulsivity, slow cognitive tempo, opposi-
tional defiant, and overt conduct disorder symptoms.
Child & Family Behavior Therapy, 20, 1-14.
SPSS. (2007). SPSS Base 15.1 User's Guide. Chicago,
IL: SPSS.
Tilly, W.D. (2008). The evolution of school psychol-
ogy to science-based practice: Problem solving and
the three-tiered model. In A. Thomas & J. Grimes
(Eds.), Best practices in school psychology-V, Vol-
ume 1, (pp. 17-36). Bethesda, MD: National Asso-
ciation of School Psychologists.
Volpe, R.J., & McConaughy, S.H. (2005). (Guest Edi-
tors). Systematic direct observational assessment of
student behavior: Its use and interpretation in mul-
tiple settings: An introduction to the Mini-series,
Volpe, R. J., DiPerna, J. C., Hintze, J. M., & Shapiro,
E. S. (2005). Observing students in classroom set-
tings: A review of seven coding systems. School Psy-
chology Review, 34, 454-474.
Wechsler, D. (2002). Wechsler Individual Achieve-
ment Tests-Second Edition. San Antonio, TX: Psy-
chological Corporation.
Wechsler, D.C. (2003). Wechsler Intelligence Scale
for Children-Fourth Edition. San Antonio, TX: Psy-
chological Corporation.
Wilson, M.S., & Reschly, D.J. (1996). Assessment in
school psychology training and practice. School Psy-
chology Review, 25, 9-23.
Woodcock, R.W., McGrew, K., & Mather, N. (2001).
Woodcock-Johnson III. Itasca, IL: Riverside Publish-
ing.
Appendix B
APPENDIX D
ITEMS COMPRISING THE 2009 DOF AND THE 1986 DOF
2009 DOF Items 1986 DOF Items
1. Acts too young for age 1. Acts too young for age
2. Makes odd noises 2. Makes odd noises
3. Argues 3. Argues
4. Cheats 4. Cheats
5. Defiant or talks back to staff 5. Defiant or talks back to staff
6. Brags, boasts 6. Bragging, boasting
7. Doesn't concentrate or doesn't pay attention for long 7. Doesn't concentrate or doesn't pay attention for long
8. Difficulty waiting turn in activities or tasks 8. Can't get mind off certain thoughts; obsessions
(describe):
9. Doesn't sit still, restless, or hyperactive 9. Doesn't sit still, restless, or hyperactive
10. Clings to adults or too dependent 10. Clings to adults or too dependent
11. Confused or seems to be in a fog 11. Confused or seems to be in a fog
12. Cries 12. Cries
13. Fidgets, including with objects 13. Fidgets, including with objects
14. Cruel, bullies, or mean to others 14. Cruelty, bullying, or meanness
15. Daydreams or gets lost in thoughts 15. Daydreams or gets lost in thoughts
16. Difficulty following directions 16. Deliberately harms self
17. Tries to get attention of staff 17. Tries to get attention of staff
18. Destroys own things 18. Destroys own things
19. Destroys property belonging to others 19. Destroys property belonging to others
20. Disobedient 20. Disobedient
21. Disturbs other children 21. Disturbs other children
22. Doesn't seem to feel guilty after misbehaving 22. Doesn't seem to feel guilty after misbehaving
23. Doesn't seem to listen to what is being said 23. Shows jealousy
24. Eats, drinks, chews, or mouths things that are not 24. Eats, drinks, chews, or mouths things that are not
food, excluding junk foods (describe): food, excluding tobacco and junk foods (describe):
25. Difficulty organizing activities or tasks 25. Shows fear of specific situations or stimuli (describe):
26. Fails to give close attention to details 26. Says no one likes him/her
27. Forgetful in activities or tasks 27. Says others are out to get him/her
28. Out of seat 28. Expresses feelings of worthlessness or inferiority
29. Gets hurt, accident prone 29. Gets hurt, accident prone
30. Gets in physical fights 30. Gets in physical fights
31. Gets teased 31. Gets teased
32. Interrupts 32. Hears things that aren't there (describe):
33. Impulsive or acts without thinking, including calling 33. Impulsive or acts without thinking, including calling
out in class out in class
34. Physically isolates self from others 34. Physically isolates self from others
35. Lies 35. Lying
36. Bites fingernails 36. Bites fingernails
37. Nervous, highstrung, or tense 37. Nervous, highstrung, or tense
38. Nervous movements, twitching, tics, or other unusual 38. Nervous movements, twitching, tics or other unusual
movements (describe): movements (describe):
39. Loses things 39. Overconforms to rules
40. Too fearful or anxious 40. Too fearful or anxious
41. Physically attacks people 41. Physically attacks people
Note. Bold font in the first column shows new items added to the 2009 DOF. Bold italic font in the second column shows
1986 DOF items that were not included on the 2009 DOF.
a Items 62 through 89 on the 2009 DOF have counterparts on the 1986 DOF, but the item numbers were changed, as can be
seen by comparing item numbers across columns.
42. Picks or scratches nose, skin, or other parts of body 42. Picks or scratches nose, skin, or other parts of body
(describe): (describe):
43. Runs about or climbs excessively 43. Falls asleep
44. Apathetic, unmotivated, or won't try 44. Apathetic, unmotivated, or won't try
45. Responds before instructions are completed 45. Refuses to talk
46. Disrupts group activities 46. Disrupts group activities
47. Screams 47. Screams
48. Secretive, keeps things to self, including refusal to 48. Secretive, keeps things to self, including refusal to
show things to teacher show things to teacher
49. Avoids or is reluctant to do tasks that require 49. Sees things that aren't there (describe):
sustained mental effort
50. Self-conscious or easily embarrassed 50. Self-conscious or easily embarrassed
51. Slow to respond verbally 51. Sexual activity (describe):
52. Shows off, clowns, or acts silly 52. Shows off, clowns, or acts silly
53. Shy or timid 53. Shy or timid
54. Explosive or unpredictable behavior 54. Explosive or unpredictable behavior
55. Demands must be met immediately, easily frustrated 55. Demands must be met immediately, easily frustrated
56. Easily distracted by external stimuli 56. Easily distracted by external stimuli
57. Stares blankly 57. Stares blankly
58. Speech problem (describe) 58. Acts like feelings are hurt when criticized
59. Wants to quit or does quit tasks 59. Steals
60. Yawns 60. Stores up things he/she doesn't need, except hobby
items such as marbles (describe):
61. Strange behavior (describe): 61. Strange behavior (describe):
62. Strange ideas (describe):
ᵃ
64. Sudden changes in mood or feelings
63. Sulksᵃ 65. Sulks
66. Suspicious
ᵃ
68. Talks about killing self
65. Talks too muchᵃ 69. Talks too much
66. Teasesᵃ 70. Teases
ᵃ
72. Verbal expressions of preoccupation with sex
ᵃ
ᵃ
70. Underactive, slow moving, tired, or lacks energyᵃ 75. Underactive, slow moving, or lacks energy
71. Unhappy, sad, or depressedᵃ 76. Unhappy, sad, or depressed
72. Unusually loudᵃ 77. Unusually loud
ᵃ
74. Whining tone of voiceᵃ 79. Whining tone of voice
75. Withdrawn, doesn't get involved with othersᵃ 80. Withdrawn, doesn't get involved with others
81. Worries
ᵃ
77. Fails to express self clearlyᵃ 83. Fails to express self clearly
78. Impatientᵃ 84. Impatient
79. Tattlesᵃ 85. Tattles
80. Repeats behavior over & over; compulsions (describe):ᵃ 86. Repeats behavior over & over; compulsions (describe):
81. Easily led by peersᵃ 87. Easily led by peers
82. Clumsy, poor motor controlᵃ 88. Clumsy, poor motor control
83. Doesn't get along with peersᵃ 89. Doesn't get along with peers
ᵃ
ᵃ
86. Bossyᵃ 92. Bossy
93. Plays with younger children
87. Complainsᵃ 94. Complains
ᵃ
96. Acts like poor loser
89. Other problems not listed above:ᵃ 97. Other problems (specify):
Index
A
Abramowitz, M., 84, 114
Achenbach, T.M., 2-3, 30, 56, 58, 62, 71-73, 78,
81, 86, 97-98, 114-116
ADHD, 61, 65, 73, 82, 109
Aggressive Behavior, 1, 33, 37, 79, 81
Algina, J., 84, 115
Alpha, 94-96
American Educational Research Association, 114
American Psychiatric Association, 1, 30, 58, 73, 81, 114
Applegate, B., 115
Attention Deficit/Hyperactivity Problems, 1, 23, 30-32, 81-83, 109
Attention Problems, 23, 77, 79
B
Barkley, R.A., 58, 61, 114
Beall, G., 85, 116
Borderline range, 26-28, 30-31, 37, 88, 110
Briesch, A. M., 91, 115
Browne, M. W., 76, 115
Burns, G. L., 72, 78, 116
C
Case Management, 59, 67, 70
CBCL/6-18, 60-63, 65, 68, 70-71, 78-79, 81, 97
Chafouleas, S. M., 91, 115
Chanese, J. A. M., 91, 115
Christ, T. J., 91, 115
Classroom observations, 23-24, 33-35, 66-67, 77,
82, 84-85, 87, 93, 95-96, 99-100, 104, 106, 107
Clinical interpretations, 110
Clinical range, 26-28, 30-31, 37, 88, 110
Cohen, B. H., 85, 116
Cohen, J., 91, 115
Computer-scoring program, 23
Content validity, 97
Continuous recording methods, 2
Control children, 1, 5, 10, 11, 23, 25, 45
Criterion-related validity, 98
Crocker, L., 84, 115
Cronbach, L.J., 91, 94, 96, 115
Cudeck, R., 76, 115
D
Das, J. P., 2, 116
Diagnosis, 112
DiPerna, J. C., 1, 117
Disabilities, 111
DOF profile, 108
DSM, 1, 30, 58, 78, 81-82, 109, 112
DSM-oriented scales, 31, 81, 85
DuPaul, G. J., 61, 115
E
Edelbrock, C., 71-72, 97, 114, 116
Emotional disturbance, 61, 63
Ethnicity, 84, 99, 100, 103-104
F
Factor analyses, 74, 75, 76, 79
Fitzgerald, M., 72, 115, 116
Frick, P. J., 78, 115
Functional behavioral assessment, 60, 69
G
Gender, 100, 103, 108
Gent, C. L., 72, 115
Gove, P., 73, 115
Guidelines for rating problem items, 15
Guze, S., 58, 115
H
Heick, P., 1, 116
Hintze, J. M., 1, 46, 47, 55, 91, 115, 117
Hoge, R. D., 1, 58, 116
Hoover, H. D., 86, 116
Hynd, G. W., 115
Hyperactivity-Impulsivity, 1, 30-32, 81-83, 109
I
ID number, 10
Identified child, 1, 5, 10, 11, 23, 25, 44
Immature/Withdrawn, 23, 77-78, 82
Inattention, 1, 30-32, 81-83, 109
Individualized Education Program (IEP), 59
Individuals with Disabilities Education Improvement Act, 115
Inter-observer agreement, 46, 48-50, 52-53
Inter-rater reliability, 51, 91-93
Internal consistency, 91, 94
Intrusive, 23, 77
K
Kaufman, A.S., 3, 115
Kaufman, N.L., 3, 115
Kay, P. J., 72, 116
Kerdyck, L., 115
Kolen, M. J., 86, 116
Kratochwill, T. R., 58, 116
L
Lahey, B. B., 115
Lakin, R., 1, 115
Learning disabilities, 62
Leff, S. S., 1, 115
Lewis, M., 97, 114
Loehlin, J. C., 76, 115
Low scores, 110
M
Mather, N., 3, 117
Matthews, W. J., 91, 115
Mattison, R.E., 62, 116
McConaughy, S. H., 2-3, 30, 58, 60, 62, 72-73, 98,
114-116
McGrew, K., 3, 117
Mean T scores, 88
Multiaxial assessment, 2-3
Multidisciplinary team (MDT), 59
Multisource data, 59
Muthén, B.O., 74, 116
Muthén, L.K., 74, 116
N
Naglieri, J. A., 2, 116
Narrative report, 33, 35, 38, 40
Normal range, 26, 28, 30, 31, 37, 88
Normalized T scores, 82-84
Normative samples, 82, 84-85
O
Observation set, 11, 23, 25
Observer's notes, 12, 14
Ollendick, T., 115
On-task, 12-14, 23, 30, 47-50, 87-88, 110
Oppositional, 23, 78-79
Other problems, 23, 28, 33, 37, 79-80, 109
Outcome evaluation, 59, 67, 70
P
Percent agreement index, 46
Percentiles, 25, 28, 30-31, 33, 37, 110
Petersen, N. S., 62, 86, 116
Peterson, R. L., 62, 116
Problem items, 13, 15, 47, 52-54
Profile, 23
R
Recess observations, 36, 38-40, 67, 79, 81, 84-85,
87, 93, 96, 99-100, 102, 104, 106, 107
Reed, M. L., 72, 116
Referral status, 99-100, 103-107
Rehabilitation Act of 1973, 116
Reliability, 91
Reschly, D.J., 1, 117
Rescorla, L. A., 2-3, 30, 56, 58, 72-73, 78, 81, 86,
97-98, 114
Response-to-Intervention (RTI), 60, 67
Riley-Tillman, T. C., 91, 115
Ritter, D., 60, 62, 116
Roid, G. H., 116
S
Sakoda, J. M., 85, 95, 99, 116
Sattler, J. M., 1, 55, 58, 116
School psychologist, 59-60
SCICA, 58, 63, 79, 97
Section 504 Accommodations, 61, 67
Setting, 12
Shapiro, E. S., 1, 58, 116-117
Skansgaard, E. P., 72, 78, 116
Skiba, R., 116
Sluggish Cognitive Tempo, 23, 76-78, 105
Special education, 60, 112
SPSS, 74, 116
Stegun, I.A., 84, 114
Stoner, G., 61, 115
Syndromes, 73, 77, 109
T
T Scores, 25, 28, 30-31, 33, 37, 83, 85-87, 110
Test-retest reliability, 91, 93, 95
Three-tiered model, 60, 69
Tilly, W.D., 60, 116
Time sampling, 2
TOF, 58-59, 61, 63, 66, 78-79, 97
Total problems, 23, 28, 33, 37, 87
Training observers, 41
TRF, 60-63, 66, 68, 70-71, 78-79, 81, 97
V
Validity, 96
Volpe, R.J., 1, 2, 58, 116-117
W
Wechsler, D., 2, 65, 116-117
Wilson, M.S., 1, 117
Woodcock, R.W., 2, 117
Y
YSR, 63, 81, 97