
Series 2

DATA EVALUATION AND METHODS RESEARCH


Number 15

Evaluation of Psychological Measures Used in the Health Examination Survey of Children, Ages 6-11

A critical review of literature pertaining to the psychological measures used in Cycle II, with recommendations concerning validity, reliability, and applicability to the Survey data.

DHEW Publication No. (HRA) 75-1295

U.S. DEPARTMENT OF HEALTH, EDUCATION, AND WELFARE


Public Health Service

Health Resources Administration


National Center for Health Statistics

Rockville, Maryland
Vital and Health Statistics-Series 2-No. 15
First issued in the Public Health Service Publication No. 1000
March 1966
NATIONAL CENTER FOR HEALTH STATISTICS

EDWARD B. PERRIN, Ph. D., Director

PHILIP S. LAWRENCE, Sc.D., Deputy Director

JACOB J. FELDMAN, Ph.D., Acting Associate Director for Analysis
GAIL F. FISHER, Associate Director for the Cooperative Health Statistics System
ELIJAH L. WHITE, Associate Director for Data Systems
IWAO M. MORIYAMA, Ph.D., Associate Director for International Statistics
EDWARD E. MINTY, Associate Director for Management
ROBERT A. ISRAEL, Associate Director for Operations
QUENTIN R. REMEIN, Associate Director for Program Development
PHILIP S. LAWRENCE, Sc.D., Acting Associate Director for Research
ALICE HAYWOOD, Information Officer

Library of Congress Catalog Card Number 65-62272


FOREWORD
The practice of comparing one individual with another is as old as recorded history. Man's earliest writings are replete with statements indicating that he has long viewed his fellow man in terms of whether or not he measured up to an expected ideal. Similarly, the performance of a man has traditionally been described in terms of how it compares with that of another man. However, subjecting these known differences to the scientific method of inquiry is a recent development.

In the area of individual differences in behavior and psychological characteristics, research has progressed from the simple to the complex. The first studies dealt with the simple functions of speed of reaction time. Today, studies are aimed at measuring individual differences in the complex functions of motivation, ego-integration, and cognition.

Progress in developing a technology for measuring behavior has progressed in a similar manner. Instruments are available which, most scientists will agree, accurately measure the speed with which an individual taps his finger in response to a given signal. Scientists do not agree, however, on the adequacy of the equipment used to measure individual differences in intelligence. Moreover, there will even be some disagreement over the use of the word "intelligence" to describe certain aspects of behavior.

Because of the present state of the art of psychological measurement, studies such as those conducted by the Health Examination Survey encounter difficult problems in attempting to estimate the prevalence of various mental health factors in the population.

The Health Examination Survey is part of the U.S. National Health Survey, authorized by Congress in 1956 to collect information about the Nation's health. Data are collected by direct examinations of individual persons chosen to constitute a probability sample of some segment of the total population of the United States.

The first sample represented the adult population aged 18 through 79 years. Since the study was primarily concerned with the prevalence of chronic physical disease, the examination did not include psychological measurements. The second sample consisted of noninstitutionalized children ages 6 through 11, among whom the incidence of chronic disease is insignificant. The important health factors in this group are found in those functions which result in growth and development. These, then, were the factors to be studied.

Many authorities in the field of growth and development contributed to the planning phase of the Survey. Although they generally agreed on what factors should be measured, they could not agree on how the measurements should be obtained. They did conclude that present instruments were inadequate but that these were the only tools available.

The tests which are discussed in the following report were those selected for use by the Health Examination Survey. In choosing these instruments, primary consideration was given to those which best met the following criteria:

1. They were capable of yielding data in those areas considered most important to the study of growth and development.
2. They would produce data in a form which would be meaningful to the individuals responsible for children's health.
3. They were suitable for use in a survey operation where examiners change frequently, where only 1 hour is available to conduct the examination, and where examining conditions are less than optimal.

The selected instruments are not ideal, but they are felt to be the best compromise offered by the present state of the art of measurement.

How much was compromised? What can be said about the growth and development of children from the data obtained by the use of these instruments?

Through a contractual arrangement with Dr. Sells, the first step has been taken in answering these questions.

Lois R. Chatham, Ph.D.
Psychological Advisor
Division of Health Examination Statistics
CONTENTS

Foreword

Introduction

I. The Wechsler Intelligence Scale for Children, the Vocabulary and Block Design Subtests
    Description of the WISC
    Research on Short Forms of the WISC
    Reliability and Stability
    Validity
    Factors Affecting WISC Scores
        Anxiety
        Sex Differences
        Qualitative Differences by Level
        Developmental Factors
    Special Groups
        Reading Disability
        Auditory Disability
        Visually Handicapped
        Stutterers
        Cerebral Palsy
        Organic Impairment of Central Nervous System
        Gifted
        Mentally Retarded and Defective
        Bilingual
        Negro
        Socioeconomic Status
    Comparison of WISC and Stanford-Binet IQs
    Summary and Conclusions
    Bibliography

II. The Wide Range Achievement Test, the Oral Reading and Arithmetic Subtests
    Evaluative Criteria
    1946 Edition of WRAT
    Research on the 1946 WRAT
        Reading
        Arithmetic
    1963 Edition of WRAT
        Validity and Norms
        Comparison of the Two Editions
        Validation of 1963 Edition
        Validity Variances
        Validity Data in 1963 Manual
        Grade Equivalents
        Standard Scores
        Percentiles
    Summary and Conclusions
    Bibliography

III. The Goodenough Draw-A-Man Test
    Background and Development
        Rationale
        Point-Scoring System
        Standardization
        Perspective
    Evaluation of Intelligence by Human Figure Drawings
        Effective Range
        Relation to Artistic Ability
        Perturbing Factors
        Culture
        Sex Differences
    Personality Study by Children's Drawings
    Research on the Goodenough Test
        Reliability Studies
        Correlations With Other Tests
    The Harris Revision of the Goodenough Test
        Comparison of Goodenough and Goodenough-Harris Scores
        Recommendation
    Summary and Conclusions
    Bibliography

IV. The Thematic Apperception Test
    Review of the Literature on the TAT
        Overview
        Research Demonstrating Developmental Factors
        Other Relevant Research
    Prospects for Developing an Objective Scoring Key for the Survey's TAT
    Bibliography

V. Total Psychological Test Battery

VI. Cross-Disciplinary Analyses
    Data Available
    Analyses Indicated
        Growth Indexes
        Other Factors Related to Test Scores

Acknowledgments

Glossary of Abbreviations



IN THIS REPORT the psychological procedures used in the Health Examination Survey conducted between June 1963 and December 1965 for children ages 6 through 11 are critically evaluated.

In his analysis, the author combines his own professional competence with the information obtained in an extensive survey of literature pertaining to the four procedures used: the Wechsler Intelligence Scale for Children, the Wide Range Achievement Test, a modification of the Draw-A-Man Test, and the Thematic Apperception Test. The result is an evaluation of the instruments which is made in terms of their validity, reliability, and applicability for use in the Health Examination Survey.

Finally, the author points out the strengths and weaknesses of each procedure and makes recommendations concerning the eventual use of data obtained in the Survey.

SYMBOLS

Data not available                           ---
Category not applicable                      ...
Quantity zero                                -
Quantity more than 0 but less than 0.05      0.0
Figure does not meet standards of
  reliability or precision                   *

EVALUATION OF

PSYCHOLOGICAL MEASURES
USED IN THE HEALTH EXAMINATION SURVEY
OF CHILDREN AGES 6-11
S. B. Sells, Ph.D., Institute of Behavioral Research, Texas Christian University

INTRODUCTION
This report is the outcome of a contract with the National Center for Health Statistics. The purpose of the contract was to obtain an objective critical evaluation of the psychological procedures chosen for use in the Health Examination Survey of children ages 6 through 11. The objectives may be summarized as follows:

1. To prepare a critical review concerning the development and use of the psychological procedures used in Cycle II based on available literature and unpublished reports (theses, dissertations, and others). These measures include the Vocabulary and Block Design subtests of the Wechsler Intelligence Scale for Children, the Oral Reading and Arithmetic subtests of the Wide Range Achievement Test (1963 edition), the Draw-A-Man Test, and cards 1, 2, 5, 8BM, and 16 of the Thematic Apperception Test.
2. To make recommendations concerning the appropriate inferences which can be made concerning individual growth and development based on scores derived from the test battery described above.
3. To recommend what research must be done if the objectives of the Health Examination Survey are to be accomplished.
4. To make original recommendations concerning the types of cross-disciplinary analyses that can be performed on data obtained in the Health Examination Survey of children.

An extensive survey of the literature was made, but only the most relevant material was included in this final report. Literature was considered relevant if it was either empirical research or a review which included or made reference to the tests used in the Survey. Empirical studies which were conducted on samples of U.S. children ages 6 to 12 years were given preference. A few important reports which did not meet these criteria were included because of their methodological features or their significant content. Unpublished master's theses and dissertations were obtained, as extensively as possible, by interlibrary loan. Information was sought and, with some success, obtained from the publishers and selected users of the reviewed tests.

One empirical study was carried out under this contract. Its results are included in the section on the Goodenough Draw-A-Man Test. The study was stimulated by a recent publication by Dale B. Harris entitled Children's Drawings as Measures of Intellectual Maturity. This text is basically a revision of the 1926 book by Florence L. Goodenough entitled Measurement of Intelligence by Drawings. In his publication, Harris includes new point-score scales and modernized norms for scoring drawings of the human figure.

The text of this report is divided into six sections. Sections I-IV present critical discussions of various tests used by the Health Examination Survey. The tests are discussed in the following order:

I. The Wechsler Intelligence Scale for Children, Vocabulary and Block Design subtests
II. The Wide Range Achievement Test, the Oral Reading and Arithmetic subtests
III. The Goodenough Draw-A-Man Test
IV. The Thematic Apperception Test

Section V briefly discusses some of the issues which arise when these tests are used as a battery. Finally, section VI considers the cross-disciplinary relationships between psychological and nonpsychological measures.

Each research study or review referred to in this report is identified by a number placed in parentheses immediately following the cited reference. Bibliographies following each of the first four sections of the report contain all references cited in the respective sections.

Research studies which were abstracted as part of the literature-review portion of this contract are also included in the four bibliographies. The actual abstracts of the reviewed literature appear as appendixes to the report. For convenience, numbers which identify the abstracts correspond to the number given when the reference is cited in the text of the report.

These abstracts have been deposited as document number 8486 with the Library of Congress. A copy may be secured by sending the document number and $28.80 for photoprints or $3.20 for 35mm. microfilm to the American Documentation Institute Auxiliary Publication Project, Photoduplication Service, Library of Congress, Washington, D.C., 20541. Advance payment is required. Checks or money orders should be made payable to Chief, Photoduplication Service, Library of Congress.

I. THE WECHSLER INTELLIGENCE SCALE FOR CHILDREN,


THE VOCABULARY AND BLOCK DESIGN SUBTESTS

This section reviews the measurement characteristics of the Vocabulary (Voc.) and Block Design (BD) subtests of the Wechsler Intelligence Scale for Children (WISC), both as a separate unit and as a WISC short form. It also reviews behavioral correlates of intelligence as reported in the literature and critically evaluates the appropriateness of their use in Cycle II of the Health Examination Survey.

The selection of the Vocabulary and Block Design subtests for use as part of the psychological test battery for Cycle II, in effect, treats these subtests as a short form of the WISC. In addition to providing an estimate of the WISC score, the two subtests may be interpreted separately, in combination with other test scores, or in conjunction with other Survey data. Combinations of these measures with other data obtained in the Survey are discussed in section II.

DESCRIPTION OF THE WISC

The WISC, which was published in 1949, extended the well-known Wechsler intelligence scales for adolescents and adults into the childhood range of 5 to 15 years. During the decade and a half since its publication the WISC has been the subject of extensive investigation and has achieved wide school and clinic use where individual measures of intelligence are desired.

The WISC is patterned after the Wechsler-Bellevue Intelligence Scale both in the structure of the subtests and the scales and in the use of the deviation intelligence quotient. The test consists of 12 subtests (6 Verbal and 6 Performance), of which 2 (Digit Span of the Verbal Scale and Mazes of the Performance Scale) are supplementary and not routinely used. The 5 subtests comprising the Verbal Scale are as follows:
Information, Comprehension, Arithmetic, Similarities, and Vocabulary. The 5 Performance Scale subtests are Picture Completion, Picture Arrangement, Block Design, Object Assembly, and Coding (Digit Symbols).

An important innovation in the Wechsler intelligence tests is the use of the deviation IQ. This device supplants the mental age concept and evaluates the performance of each individual on the basis of the distribution of scores of a representative sample of his own chronological age. In the standardization of the WISC, Wechsler kept the standard deviation of intelligence quotients constant from year to year, with the result that a child's obtained IQ does not vary unless his actual test performance as compared with his peers varies.

Raw scores for each subtest are converted to scaled scores which have a mean of 10 and standard deviation of 3 for each age level. The sum of five scaled scores for the Verbal Series constitutes the Verbal Scale score (VS), and similarly the Performance Scale score (PS) is the sum of the five Performance Series scaled scores. The Full Scale score (FS) is the sum of the Verbal Scale and the Performance Scale. Deviation intelligence quotients have been derived by a similar conversion process for VS, PS, and FS. The IQ scales at each age have a mean of 100 and standard deviation of 15.

The standardization of the WISC is reported in Wechsler's manual (101), and the standardization sample is summarized in terms of age, sex, geographic representation, urban-rural composition, and composition by socioeconomic status (reflected by occupation of fathers). The WISC was standardized on a total sample of 2,200 cases, including 100 white boys and 100 white girls at each age from 5 to 15 years. The proportion of urban children in the sample was slightly higher than in comparable United States population statistics.

Reviewers have commented very favorably on the WISC as a test of superior quality (102-104), but, as in all areas of mental measurement, imperfections have been noted and users have attempted to employ it for purposes for which it was not specifically designed. In general, the deviation IQ has been accepted as an improvement over the IQ computed by dividing mental age by chronological age. Except for a slight bias for urban and small-town areas (as opposed to rural areas) for a native white population, the sampling basis of the WISC has been regarded as good.

Maxwell (106), and also Wilson (139), has criticized the linearity of the transformation of raw scores to scaled scores, which may be a problem when sampling extreme cases and widely varying regional, ethnic, and linguistic groups. Hite (112) reported that the WISC lacks items of middle-range difficulty at all age levels and is too difficult for young children, particularly those in the age range 5 to 6 years. In the studies reviewed, WISC Full Scale IQs have indeed tended to be lower than comparable Stanford-Binet IQs. This is especially true at the lower age levels. McCandless (103) noted that girls tend to test lower than boys on the WISC, but support for this generalization is equivocal in the present review.

In evaluating the utility of the Vocabulary and Block Design short form of the WISC for the Survey it is appropriate to consider shortcomings of these tests in relation to alternatives that might have been considered, given the constraints of testing time available in the Survey schedule and the general problems of a national survey. It may be noted that although the WISC norms are inappropriate in varying degrees for Negro, bilingual and foreign-born, illiterate, retarded, defective, rural, and other special groups for which the test was not designed, there is no adequate measure that can be applied to all. On the other hand, because of the extensive research on the WISC, reported below, it may be possible to estimate errors in the Vocabulary and Block Design subtests and in the scores derived from them for various components of the Survey sample. In addition, relationships of these variables to the Goodenough Draw-A-Man Test offer further opportunities for compensatory analysis.

RESEARCH ON SHORT FORMS OF THE WISC

Several investigators have combined two or more subtests in order to develop an efficient short form of the WISC that correlates well with the Full Scale and produces comparable means and standard deviations (175-179, 231, and 235). Of these, only one article, by Simpson and Bridges (177), reported favorable results with the combination of Vocabulary and Block Design. They used
a sample of 120 children over the age range of 65 to 192 months.

Finley and Thompson (231) developed for a sample of 309 mentally retarded persons a short form with five subtests, including Block Design, which correlated 0.89 with FS IQ. Significantly, their report included correlations of 0.55 and 0.45, respectively, for Voc. and BD with FS IQ, while the correlation of Voc. and BD was only 0.1. Further, estimation of mean FS IQ by proration of the sum of Voc. and BD, as reported by these authors, approximated the actual FS IQ quite closely.

Schwartz and Levitt (235) also reported a short form of the WISC for educable retarded children, consisting of six subtests including Voc. and BD, which correlated 0.95 with FS IQ. However, their best combination of five subtests, which reduced the correlation to 0.92, eliminated Block Design. Osborne and Allen (239), on the other hand, cross-validated two triads of WISC subtests including Voc. and BD, one with Picture Completion and one with Picture Arrangement, using samples of 240 (initial) and 50 (validation) retarded children aged 7 to 14 years, with correlations with FS IQ of 0.88 to 0.90.

At the same time, Hite (112) has confirmed Wechsler's data (101) indicating that Vocabulary and Block Design are the most reliable subtests in the WISC battery. Hagen (109) and Cohen (111) in the United States and Gault (110) in Australia have reported that both of these subtests are highly loaded on the general factor obtained in factor analysis of the WISC over the entire age range of 5 to 15 years. Cohen found that Vocabulary was the strongest single measure of the general factor. Nevertheless, a problem exists in determining the optimal combination of these subtests to estimate the FS IQ and various parameters related to the Survey objectives.

Simpson and Bridges (177) estimated the FS IQ on the basis of a simple sum of the scaled scores of Voc. and BD and reported a conversion table for this purpose. Inasmuch as their results have not been replicated, so far as is known, cross-validation on a substantial sample should be considered before this table is adopted. The importance of this recommendation is illustrated by some computations based on the Finley and Thompson data (231). The sum of mean Voc. and BD scaled scores, 11, multiplied by 5 to prorate the FS score, gives a WISC Full Scale IQ of 70 (as compared with the actual mean of 68), while the score of 11 in the Simpson and Bridges tables yields an FS IQ of 77. Further, in view of Maxwell's criticism of the transformation of raw scores to scaled scores (106), it may be advisable also to explore empirically the alternative of predicting the FS IQ from raw scores.

In reviewing the WISC literature every effort was made to focus on the Voc. and BD subtests, and considerable data have been assembled. Nevertheless, the major portion of the information referred to in this report is based on the full test, and assumptions of equivalence of short form scores to the Full Scale must be made in generalizing the results reported. As indicated above, this assumption is not entirely inappropriate, but caution is certainly indicated.

RELIABILITY AND STABILITY

Wechsler's manual (101, p. 13) reported corrected split-half reliability coefficients of 0.77, 0.91, and 0.90, respectively, for Vocabulary, and 0.84, 0.87, and 0.88, respectively, for Block Design for samples of 200 children at each of the following age levels: 7 1/2, 10 1/2, and 13 1/2 years. The corresponding FS reliabilities were 0.92, 0.95, and 0.94, respectively. As noted above, these two subtests were the most reliable of all the WISC subtests. These results for Voc. and BD have been confirmed by Hite (112) for children in the age range of 5 to 7 years.

Stability of the WISC on retest has also been found satisfactory by Gehman and Matyas (113) over a 4-year period (age 11 years at initial test), by Reger (115), who tested a sample at ages 10, 11, and 12 years, and by Whatley and Plant (116), who used a 17-month interval. In these studies, retest correlations were generally of the order of the corrected split-half reliabilities. These and related data are summarized in table 1.

VALIDITY

Despite the fact that Wechsler developed the WISC in protest against the measurement concept of mental age (and the IQ based on it) implicit in the Stanford-Binet test, and despite the additional

Table 1. Studies reporting reliability coefficients of the WISC

Investigator                        Year  Subjects(a)                           Age range     Number       Voc.   BD     VS     PS     FS     Type of coefficient
                                                                                              (Σ M F)

Throne, Schulman, and Kaspar (227)  196   Retarded                              11-0 - 14-1   39 39        0.7$   0.8:   0.9:   0.8$   0.95   Test-retest

Armstrong (175)                     195   Guidance clinic                       5-0 - 14-1    200 100 100  0.94   N.R.   N.R.   N.R.   N.R.   Split-half, Spearman-Brown
                                                                                5-7 years     20 20        0.9:   N.R.   N.R.   N.R.   N.R.
                                                                                5-7 years     20 20        0.9C   N.R.   N.R.   N.R.   N.R.
                                                                                7-9 years     20 20        0.9:   N.R.   N.R.   N.R.   N.R.
                                                                                7-9 years     20 20        0.91   N.R.   N.R.   N.R.   N.R.
                                                                                9-11 years    20 20        0.87   N.R.   N.R.   N.R.   N.R.
                                                                                9-11 years    20 20        0.85   N.R.   N.R.   N.R.   N.R.
                                                                                11-13 years   20 20        0.8E   N.R.   N.R.   N.R.   N.R.
                                                                                11-13 years   20 20        0.88   N.R.   N.R.   N.R.   N.R.
                                                                                13-15 years   20 20        0.90   N.R.   N.R.   N.R.   N.R.
                                                                                13-15 years   20 20        0.96   N.R.   N.R.   N.R.   N.R.

Gehman and Matyas (113)             195   Normals                               11-1          60 29 31     N.R.   N.R.   0.77   0.7L   0.77   Test-retest(b)

Caldwell (252)                      195   Normals (Negro)                       9-7 - 10-6    60 --- ---   0.70   0.89   0.85   0.9C   0.84   Split-half

Jones (154)                         1962  Normals (England)                     ---           240 120 120  ---    ---    ---    ---    ---    Split-half, Richardson
                                                                                7-6 - 8-5     80 40 40     0.70   0.74   0.8f   0.8C   0.89
                                                                                8-6 - 9-5     80 40 40     0.70   0.68   0.87   0.81   0.90
                                                                                9-6 - 10-5    80 40 40     0.70   0.75   0.90   0.85   0.94

Wechsler (101)                      1949  Normals (WISC standardization data)   ---           600 300 300  ---    ---    ---    ---    ---    Split-half, Spearman-Brown
                                                                                7-6           200 100 100  0.77   0.84   0.88   0.86   0.92
                                                                                10-6          200 100 100  0.91   0.87   0.96   0.89   0.95
                                                                                13-6          200 100 100  0.90   0.88   0.96   0.90   0.94

Hite (112)                          195   Normals                               ---           200 117 83   ---    ---    ---    ---    ---    Split-half
                                                                                5-6           50 34 16     0.71   0.77   0.77   0.81   0.90
                                                                                6-6           100 56 44    0.72   0.84   0.89   0.89   0.91
                                                                                7-6           50 27 23     0.76   0.89   0.89   0.86   0.94

Hagen(c) (109)                      195   Normals (WISC standardization data)   ---           400 200 200  ---    ---    ---    ---    ---    Split-half, Spearman-Brown
                                                                                5 years       200 100 100  0.68   0.77   N.R.   N.R.   N.R.
                                                                                15 years      200 100 100  0.91   0.89   N.R.   N.R.   N.R.

(a) Designations of subjects are always white Americans unless otherwise specified.
(b) Time between testings was 49 months.
(c) Data are from the WISC standardization sample, but were not reported in the WISC manual.
NOTES: All correlation coefficients are Pearson product-moment unless otherwise specified.
Σ-Total population; M-male; F-female; Voc.-Vocabulary; BD-Block Design; VS-Verbal Scale; PS-Performance Scale; FS-Full Scale; N.R.-not reported.
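
Several of the coefficients in table 1 are split-half reliabilities corrected by the Spearman-Brown formula, which estimates the reliability of the full-length test from the correlation between its two halves: r_full = 2 r_half / (1 + r_half). The following is a minimal sketch of that standard correction, offered only as an illustration. The function name, the general length factor, and the example half-test correlation are assumptions introduced here and are not taken from any of the studies cited.

```python
def spearman_brown(r_half: float, length_factor: float = 2.0) -> float:
    """Estimate the reliability of a test lengthened by `length_factor`.

    With length_factor=2 this is the usual split-half correction:
    r_full = 2 * r_half / (1 + r_half).
    """
    if not 0.0 <= r_half <= 1.0:
        raise ValueError("r_half must lie between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)


if __name__ == "__main__":
    # Hypothetical half-test correlation, not a value taken from table 1.
    print(round(spearman_brown(0.80), 2))
```

Because the correction always raises a positive half-test correlation, corrected split-half coefficients are not directly comparable with uncorrected ones or with test-retest coefficients obtained over long intervals.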

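The proration discussed under Research on Short Forms of the WISC, in which the sum of the Voc. and BD scaled scores is multiplied by 5 to estimate the sum of the 10 regularly used subtest scaled scores, is simple arithmetic. A minimal sketch follows. The function and parameter names are hypothetical, and the step from the prorated sum to an FS IQ still requires the conversion table in the WISC manual, which is not reproduced here.

```python
def prorate_scaled_sum(subtest_sum: float, n_given: int = 2, n_full: int = 10) -> float:
    """Scale a sum of `n_given` subtest scaled scores up to `n_full` subtests."""
    if n_given <= 0 or n_full < n_given:
        raise ValueError("need 0 < n_given <= n_full")
    return subtest_sum * n_full / n_given


if __name__ == "__main__":
    # Sum of mean Voc. and BD scaled scores quoted in the text for the
    # Finley and Thompson data (231); 11 * 10 / 2 is the "multiplied by 5" step.
    print(prorate_scaled_sum(11))  # 55.0
```

According to the text, the prorated score for this example corresponds to an FS IQ of 70 in the manual's table, against an actual mean of 68.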
Table 2. Studies reporting correlation between the WISC and Stanford-Bineta

_ . =L . -.
Number Correlation
Investigator Year Subjectsb Age range
2 M F Voc. BD Vs Ps FS
.
Nale (216)------------------- 1951 Mental defectives---------- :-10 - 15-11 04 54 50 N.R. N.R. N.R. N.R. 0,91
Stacey and Levin (228)------- 1951 Mental defectives---------- 7-2 - 15-11 70 --- --- N.R. N.R. N.R. N.R. 0.68
Sloan and Schneider (217)---- 1951 Mental defectives---------- N.R. 40 20 20 N.R. N.R. 0.75 0.64 0.76

Orr (188)-------------------- 1950 Retarded------------------- N.R. 10 --- --- N.R. N.R. 0.81 0.49 0.71

Sharp (229)------------------ 1957 Slow learners-------------- 8-O - 16-5 50 --- --- N.R. N.R. 0.62 0.67 0.69

Post (198)------------------- 1952 Stutterers----------------- 5-5 - 15-10 30 27 3 N.R. N.R. 0.80 0.37 0.78
Kent and Davis (207)--------- 1957 Normals and clinic referral:
(England)----------------- 8-12 years :13 L33 80 N.R. N.R. N.R. 0,58 N.R.
Norma is------------------ ----------- 18 59 59 ---- ---- ---- ---- -----
Delinquents -------------- ----------- 55 48 7 ---- ---- ---- ---- -----
Psychiatric outpatients-- ,----------- 40 26 14 ---- ---- ---- ---- -----

Muir (119)---- 1952 Institutional (orphans and various problems)---- 5-0 - 6-11 42 --- --- N.R. N.R. 0.46 0.52 0.62
  5 years 21 --- --- N.R. N.R. 0.65 0.66 0.74
  6 years 21 --- --- N.R. N.R. 0.44 0.39 0.49
Davidson (162)---- 1956 Normals---- 14-0 - 14-3 30 --- --- N.R. N.R. 0.79 0.71 0.83
Kardos (161)---- 1954 Normals---- 11-11 - 13-0 100 50 50 N.R. N.R. 0.87 0.82 0.89
Matyas (d) (114)---- 1954 Normals---- --- 60 29 31 --- --- --- --- ---
  Grade 5---- 11-1 (mean) 60 29 31 N.R. N.R. 0.78 0.46 0.73
  Grade 9 (retest)---- 15-2 (mean) 60 29 31 N.R. N.R. 0.76 0.64 0.77
Raleigh (191)---- 1952 Normals---- 10-8 - 14-9 100 52 48 N.R. N.R. 0.77 0.59 0.80
Schwitzgoebel (189)---- 1952 Normals---- 9-11 - 13-8 100 52 48 N.R. N.R. 0.78 0.61 0.84
Clarke (160)---- 1950 Normals---- 9-7 - 12-9 84 39 45 N.R. N.R. 0.83 0.57 0.79
Frandsen and Higginson (159)---- 1951 Normals---- 9-1 - 10-3 54 --- --- N.R. N.R. 0.71 0.63 0.80
Reidy (171)---- 1952 Normals---- 9-0 - 11-11 60 30 30 N.R. N.R. 0.87 0.69 0.86

Jones (154)---- 1962 Normals (England)---- 8-10 years 240 120 120 N.R. N.R. 0.84 0.59 0.81
  8 years 40 40 N.R. N.R. 0.77 0.48 0.72
  8 years 40 40 N.R. N.R. 0.79 0.46 0.76
  Σ 8 years 80 40 40 N.R. N.R. 0.78 0.47 0.74
  9 years 40 40 N.R. N.R. 0.89 0.65 0.90
  9 years 40 40 N.R. N.R. 0.78 0.58 0.75
  Σ 9 years 80 40 40 N.R. N.R. 0.84 0.61 0.84
  10 years 40 40 N.R. N.R. 0.86 0.64 0.83
  10 years 40 40 N.R. N.R. 0.90 0.67 0.86
  Σ 10 years 80 40 40 N.R. N.R. 0.88 0.66 0.85
Arnold and Wagner (158)---- 1955 Normals---- 8-9 years 50 --- --- N.R. N.R. 0.85 0.75 0.88
Wagner (156)---- 1951 Normals---- 8-9 years 50 --- --- N.R. N.R. 0.77 0.87 0.81
Scott (155)---- 1950 Normals---- 7-7 - 11-1 30 --- --- 0.63 0.60 0.86 0.86 0.92
Beeman (153)---- 1960 Normals---- 7-2 - 11-9 36 --- --- N.R. N.R. 0.64 0.42 0.67
Harlow, Price, Tatham, and Davidson (145)---- 1957 Normals---- --- 60 --- --- --- --- --- --- ---
  6-6 - 6-7 30 --- --- N.R. N.R. 0.64 0.61 0.64
  10-0 - 10-1 30 --- --- N.R. N.R. 0.88 0.52 0.83
Cohen and Collier (124)---- 1952 Normals---- 6-5 - 8-9 51 --- --- N.R. N.R. 0.82 0.80 0.85
Tatham (152)---- 1952 Normals---- 6-5 - 6-7 30 --- --- N.R. N.R. 0.64 0.51 0.64
Mussen, Dean, and Rosenberg (117)---- 1952 Normals---- 6-0 - 13-1 39 --- --- N.R. N.R. 0.83 0.72 0.85

Krugman, Justman, Wrightstone, and Krugman (144)---- 1951 Normals---- --- 222 --- --- --- --- --- --- ---
  6 years 38 --- --- N.R. N.R. 0.73 0.74 0.82
  7 years 43 --- --- N.R. N.R. 0.64 0.49 0.73
  8 years 44 --- --- N.R. N.R. 0.78 0.57 0.82
  9 years 31 --- --- N.R. N.R. 0.83 0.79 0.87
  10 years 29 --- --- N.R. N.R. 0.[?]8 0.54 0.86
  11 years 37 --- --- N.R. N.R. 0.69 0.53 0.76
Pastovic (e) (121)---- 1951 Normals---- --- 100 --- --- --- --- --- --- ---
  5-6 50 --- --- N.R. N.R. 0.63 0.57 0.71
  7-6 50 --- --- N.R. N.R. 0.82 0.71 0.88
Winpenny (105)---- 1951 Normals---- --- 185 --- --- --- --- --- --- ---
  Kindergarten---- 5-4 - 5-8 50 --- --- N.R. N.R. N.R. N.R. 0.71
  Grade 2---- 7-4 - 7-8 50 --- --- N.R. N.R. N.R. N.R. 0.86
  Grade 5---- 9-7 - 12-9 85 --- --- N.R. N.R. N.R. N.R. 0.79
Dunsdon and Roberts (170)---- 1955 Normals (England)---- 5-0 - 14-11 1,947 980 967 --- --- --- --- ---
  980 980 N.R. N.R. N.R. N.R. 0.82
  967 967 N.R. N.R. N.R. N.R. 0.77
[?]oruszak (146)---- 1954 Normals---- 5-14 years 80 40 40 N.R. N.R. 0.87 0.78 0.90
  5-14 years 40 40 N.R. N.R. 0.89 0.72 0.93
  5-14 years 40 40 N.R. N.R. 0.86 0.71 0.93
Holland (149)---- 1953 Normals---- 5-13 years 52 --- --- N.R. N.R. 0.88 0.73 0.[?]7
Weider, Noller, and Schraumm (150)---- 1951 Normals---- [?]-0 - 11-11 106 --- --- N.R. N.R. 0.89 0.77 0.89
  [?]-0 - 7-11 44 --- --- N.R. N.R. 0.82 0.79 0.90
  [?]-0 - 11-11 62 --- --- N.R. N.R. 0.92 0.78 0.90

Kureth, Muhr, and Weisgerber (118)---- 1952 Normals---- 5-6 years 100 --- --- 0.51 0.61 0.75 0.71 0.81
  5 years 50 --- --- 0.42 0.65 0.79 0.73 0.84
  6 years 50 --- --- 0.65 0.55 0.71 0.71 0.79
Rottersman (151)---- 1950 Normals---- 6 years 50 21 29 N.R. N.R. 0.71 0.49 0.71
Triggs and Cartee (148)---- 1953 Normals (S-B, Form M)---- 5 years 46 --- --- N.R. N.R. 0.58 0.48 0.61
Orr (188)---- 1950 Normals---- --- 40 --- --- --- --- --- --- ---
  Grade 1---- N.R. 15 --- --- N.R. N.R. 0.63 0.62 0.77
  Grade 4---- N.R. 14 --- --- N.R. N.R. 0.64 0.65 0.67
  Grade 7---- N.R. 11 --- --- N.R. N.R. 0.88 0.66 0.79
Stanley (157)---- 1955 Normals (from Frandsen and Higginson, 159, above)---- N.R. 50 --- --- N.R. N.R. N.R. N.R. 0.71
Schachter and Apgar (147)---- 1958 Normals, mixed sample---- N.R. 113 61 52 N.R. N.R. 0.64 0.48 0.67
  White---- --- 39 --- --- --- --- --- --- ---
  Negro---- --- 66 --- --- --- --- --- --- ---
  Puerto Rican---- --- 61 --- --- --- --- --- --- ---
  Oriental---- --- 21 --- --- --- --- --- --- ---
Estes, Curtin, DeBurger, and Denny (125)---- 1961 Normals, Grades 1-8---- --- 82 47 35 --- --- --- --- ---
  Form L---- N.R. 82 47 35 N.R. N.R. N.R. N.R. 0.80
  Form L-M---- N.R. 82 47 35 N.R. N.R. N.R. N.R. 0.74
(a) Unless otherwise noted, Stanford-Binet, Form L.
(b) Designation of subjects are always white Americans unless otherwise specified.
(c) Rank difference correlation.
(d) Also reported by Gehman and Matyas in 1956.
(e) Also reported by Pastovic and Guthrie in 1951.
(f) Intraclass correlation.
(g) Average time between S-B and WISC administration was 50.8 months.

NOTES: All correlation coefficients are Pearson product-moment unless otherwise specified.
Σ-Total population; M-male; F-female; Voc.-Vocabulary; BD-Block Design; VS-Verbal Scale; PS-Performance Scale; FS-Full Scale; N.R.-not reported.
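
The notes to table 2 distinguish Pearson product-moment coefficients from rank difference correlations. As a minimal sketch of how the two are computed, the following Python example uses invented scores for eight hypothetical children; the data, the function names, and the variable names are illustrative assumptions and are not drawn from any study in the table.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks from 1 (lowest) upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def rank_difference_rho(x, y):
    """Rank difference (Spearman) correlation: Pearson r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical IQs for eight children (invented for illustration only).
wisc_fs = [92, 105, 88, 117, 101, 96, 110, 124]
binet = [97, 108, 95, 125, 103, 99, 118, 140]

print(round(pearson_r(wisc_fs, binet), 2))
print(round(rank_difference_rho(wisc_fs, binet), 2))
```

Because the rank difference coefficient depends only on the ordering of the children, it is unaffected by differences in score level between two tests, whereas the product-moment coefficient also reflects how nearly linear the relationship is.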

Table 3. Studies reporting correlation between the WISC and other measures

Columns: Investigator (reference); Year; Test or criterion variable; Subjects (a); Age range; Number (Σ, M, F); Correlation (Voc., BD, VS, PS, FS)

Smith (126)---- 1961 Full Range Picture Vocabulary Test---- Normals---- 6-11 - 8-10 100 51 [?] N.R. N.R. 0.6[?] 0.4[?] 0.60
McBrearty (123)---- 1951 Arthur Point Scale of Performance Tests---- Normals---- 10-3 - 12-11 52 22 [?] N.R. N.R. N.R. 0.6[?] 0.71
Cohen and Collier (124)---- 1952 Arthur Point Scale of Performance Tests---- Normals---- 6-5 - 8-9 45 --- --- N.R. N.R. 0.77 0.81 0.80
Winpenny (105)---- 1951 Arthur Point Scale of Performance Tests---- Normals---- 9-7 - 12-9 85 --- --- N.R. N.R. N.R. N.R. 0.70
Armstrong and Hauck (130)---- 1960 Visual Motor Gestalt Test---- Nonorganic child guidance population---- 6-12 years 98 45 4[?] N.R. N.R. -0.22 -0.0[?] -0.23
Winpenny (105)---- 1951 Bernreuter-Winpenn [name truncated in source]---- Normals---- --- --- --- --- --- --- --- --- ---
  Kindergarten---- 5-4 - 5-8 50 --- --- N.R. N.R. N.R. N.R. 0.92
  Grade 2---- 7-4 - 7-8 50 --- --- N.R. N.R. N.R. N.R. 0.92
  Grade 5---- 9-7 - 12-9 85 --- --- N.R. N.R. N.R. N.R. 0.97

Cooper (242)---- 1958 California Achievement Tests---- Bilinguals (Guam), Grade 5---- N.R. 51 --- --- N.R. N.R. [?] 0.5[?] 0.77
Altus (122)---- 1952 California Test of Mental Maturity---- Normals, junior high---- N.R. 5[?] --- --- N.R. N.R. N.R. N.R. 0.81
Altus (134)---- 19[?] California Test of Mental Maturity---- Retarded, elementary school---- N.R. 100 71 2[?] --- --- --- --- ---
  Language---- --- --- --- --- N.R. N.R. 0.71 0.5[?] 0.70
  Non-language---- --- --- --- --- N.R. N.R. 0.6[?] 0.67 0.68
  Total---- --- --- --- --- N.R. N.R. 0.7[?] 0.6[?] 0.77
Cooper (242)---- 1958 California Test of Mental Maturity---- Bilinguals (Guam), Grade 5---- N.R. 51 --- --- N.R. N.R. 0.66 0.6[?] 0.74
Schwitzgoebel (189)---- 1955 California Test of Mental Maturity---- Normals---- 9-11 - 13-8 100 52 48 N.R. N.R. 0.55 0.59 0.75
Barratt (138)---- 1956 Columbia Mental Maturity Scale---- Normals---- 9-2 - 10-1 60 26 34 0.45 0.47 0.56 0.48 0.61
Warren and Collier (224)---- 1960 Columbia Mental Maturity Scale---- Retarded---- 9-30 years 49 --- --- N.R. N.R. N.R. N.R. 0.68
Thompson (193)---- 1961 Gates Advanced Primary Reading Tests---- Normals---- 6-4 - 8-0 105 62 43 --- --- --- --- ---
  Word Recognition---- --- --- --- --- N.R. N.R. 0.58 0.42 0.55
  Paragraph Reading---- --- --- --- --- N.R. N.R. 0.55 0.46 0.56
  Composite Reading---- --- --- --- --- N.R. N.R. 0.57 0.47 0.58
Warren and Collier (224)---- 1960 Goodenough Intelligence Test---- Retarded---- 9-30 years 49 --- --- N.R. N.R. N.R. N.R. 0.43

Armstrong and Hauck (130)---- 1960 Goodenough Intelligence Test---- Child guidance clinic---- 6-12 years 98 45 4[?] N.R. N.R. 0.37 0.51 0.49
Rottersman (151)---- 1950 Goodenough Intelligence Test---- Normals---- 6 years 50 21 29 N.R. N.R. 0.38 0.43 0.47
Kimbrell (136)---- 1960 Grade placement---- Mental defective---- 10.5 - 15.8 62 --- --- N.R. N.R. N.R. N.R. 0.40
Smith (126)---- 1961 Wide Range Achievement Test---- Normals---- 5-11 - 8-10 100 51 49 N.R. N.R. 0.55 [?] 0.61
Delp (135)---- 1953 Kent EGY Test---- Normals---- 6-15 years 74 --- --- N.R. N.R. 0.60 0.55 0.62
Cooper (242)---- 1958 Leiter International Performance Scale---- Bilinguals (Guam), Grade 5---- N.R. 51 --- --- N.R. N.R. 0.73 0.78 0.83
Sharp (229)---- 1957 Leiter International Performance Scale---- Slow learners---- 8-0 - 16-5 50 --- --- N.R. N.R. 0.78 0.80 0.83


Alper (221)---- 1958 Leiter International Performance Scale---- Mental defective---- 7-2 - 17-3 30 15 15 N.R. N.R. 0.40 0.79 0.77
Dunn and Brooks (234)---- 1960 Peabody Picture Vocabulary Test---- Retarded---- N.R. 56 --- --- N.R. N.R. N.R. N.R. 0.61
Kimbrell (136)---- 1960 Peabody Picture Vocabulary Test---- Mental defective---- 10.5 - 15.8 62 --- --- N.R. N.R. N.R. N.R. 0.30
Himelstein and Herndon (137)---- 1962 Peabody Picture Vocabulary Test---- Emotionally disturbed---- 6-2 - 14-8 48 --- --- N.R. N.R. 0.64 0.52 0.63
McBrearty (123)---- 1951 Progressive Achievement Tests---- Normals---- 10-3 - 12-11 52 22 30 N.R. N.R. 0.78 0.50 0.81
Dunsdon and Roberts (170)---- 1955 Mill Hill Vocabulary Scale---- Normals (England)---- 5-0 - 14-11 1,947 980 967 --- --- --- --- ---
  Form A---- --- 980 980 0.83 N.R. N.R. N.R. N.R.
  Form A---- --- 967 967 0.81 N.R. N.R. N.R. N.R.
  Form B---- --- 980 980 0.85 N.R. N.R. N.R. N.R.
  Form B---- --- 967 967 0.82 N.R. N.R. N.R. N.R.
Brown, Hakes, and Malpass (233)---- 1959 Raven Progressive Matrices---- Retarded---- N.R. N.R. --- --- N.R. N.R. N.R. N.R. 0.39-0.49
Malpass, Brown, and Hakes (140)---- 1960 Raven Progressive Matrices---- Retarded---- 11-8 (mean) 104 --- --- N.R. N.R. N.R. N.R. 0.51
Barratt (138)---- 1956 Raven Progressive Matrices---- Normals---- 9-2 - 10-1 60 26 34 0.56 0.60 0.69 0.70 0.75
Wilson (139)---- 1952 Raven Progressive Matrices---- British Columbia---- 5-6 - 13-0 90 --- --- N.R. N.R. N.R. N.R. ---
  Hospitalized American Indians---- --- 30 --- --- --- --- --- --- [?]0.75, 0.27
  [subgroup label not legible]---- --- 30 --- --- --- --- --- --- [?]0.83, 0.42
  High socioeconomic whites---- --- 30 --- --- --- --- --- --- [?]

Martin and Wiechers (142)---- 1954 Coloured Progressive Matrices---- Normals---- 9-0 - 10-0 100 60 40 0.73 0.74 0.84 0.83 0.91
Stacey and Carleton (141)---- 1955 Coloured Progressive Matrices---- Mental defective---- 7-5 - 15-9 150 --- --- N.R. 0.54 0.52 0.55 [column alignment uncertain in source]
  [?] 0.41 0.51 0.55 0.62
Hite (112)---- 1953 SRA Primary Mental Abilities Test---- Normals---- 5-6 years 50 34 16 --- --- --- --- ---
  Verbal---- --- --- --- --- 0.45 0.38 N.R. N.R. N.R.
  Perception---- --- --- --- --- 0.30 0.83 N.R. N.R. N.R.
  Quantitative---- --- --- --- --- 0.35 0.53 N.R. N.R. N.R.
  Space---- --- --- --- --- 0.39 0.68 N.R. N.R. N.R.
Stempel (143)---- 1953 SRA Primary Mental Abilities---- Superior intelligence---- 8-5 - 10-4 50 --- --- --- --- --- --- ---
  Space---- --- --- --- --- N.R. N.R. 0.45 0.34 N.R.
  Number---- --- --- --- --- N.R. N.R. 0.15 0.38 N.R.
  Reasoning---- --- --- --- --- N.R. N.R. 0.63 0.55 N.R.
  Perception---- --- --- --- --- N.R. N.R. 0.18 0.42 N.R.
  Verbal---- --- --- --- --- N.R. N.R. 0.68 0.40 N.R.
  IQ---- --- --- --- --- N.R. N.R. N.R. N.R. 0.68
Jones (154)---- 1962 Teacher ratings---- Normals (England)---- 7-6 - 10-5 240 120 120 N.R. N.R. 0.73 0.57 0.74
  8 years 80 40 40 N.R. N.R. 0.70 0.48 0.70
  9 years 80 40 40 N.R. N.R. 0.71 0.59 0.73
  10 years 80 40 40 N.R. N.R. 0.76 0.62 0.76
Stark (163)---- 1954 The Drawing-Completion Test---- Normals---- 8-4 - 9-10 50 30 20 0.72 0.[?]9 N.R. N.R. 0.79
Bacon (127)---- 1954 Wechsler-Bellevue Intelligence Scale, Form I---- Normals---- 11-9 - 12-3 32 16 16 0.84 0.65 0.86 0.65 0.77
Delattre and Cole (128)---- 1952 Wechsler-Bellevue Intelligence Scale, Form I---- Normals---- 10-5 - 15-7 50 --- --- 0.55 0.49 0.86 0.82 0.87
(a) Designation of subjects are always white Americans unless otherwise specified.
(b) Eta coefficient.
(c) WISC scaled scores.
(d) Partial correlations with chronological age removed.
(e) Raw scores.
(f) Scaled scores.

NOTES: All correlation coefficients are Pearson product-moment unless otherwise specified.
Σ-Total population; M-male; F-female; Voc.-Vocabulary; BD-Block Design; VS-Verbal Scale; PS-Performance Scale; FS-Full Scale; N.R.-not reported.
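
One of the footnotes to table 3 refers to partial correlations with chronological age removed. A minimal sketch of the standard first-order partial correlation formula follows. The three input coefficients are invented for illustration and do not come from any study in the table, and the function and variable names are assumptions.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y with the linear effect of z removed from both."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical values: x = WISC raw score, y = criterion test score,
# z = chronological age. None of these figures is taken from the table.
r_wisc_test = 0.60
r_wisc_age = 0.50
r_test_age = 0.50

print(round(partial_correlation(r_wisc_test, r_wisc_age, r_test_age), 2))
```

When both measures rise with age, as raw scores usually do over a wide age range, removing age in this way lowers the coefficient, which is one reason raw-score and scaled-score correlations in the table are not directly comparable.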

fact that the validity of the WISC must be judged principally in relation to the logic of Wechsler's approach and the adequacy of his development and standardization of the test, a surprisingly large number of papers dealing with the validity of the WISC have used the Stanford-Binet as a criterion. As may be expected, unless one assumes naively that the theoretical objections to mental age scores involve gross discrepancies, which they usually do not, the correlations between WISC Full Scale IQs and Stanford-Binet IQs are generally high, in about the same range as the respective reliabilities of these tests. (See table 2.) There seems to be little doubt that both the WISC and the Stanford-Binet merit their reputations as outstanding individual intelligence tests.

There are, however, differences between the WISC and Stanford-Binet in score levels. As noted above, the WISC IQs tend to be substantially lower than the corresponding Stanford-Binet IQs for the very young and for the gifted (153 and 215), as well as for many samples reported across the normal range (119, 120, 124, 147, 148, 151, 154, 156, 159, and 161). This problem is discussed below.

The WISC has been correlated with a wide range of verbal and performance tests that purport to measure various aspects of intelligence. Correlations with the Wechsler-Bellevue, Form I, have been reported by Bacon (127) for a sample of 36 children in the age range 11 years 9 months to 12 years 3 months and by Delattre (128) for 50 students aged 10-5 to 15-7. Their results for FS were 0.77 and 0.87, respectively, while both correlated 0.86 for VS. For PS their respective correlations were 0.65 and 0.82; for Voc., 0.84 and 0.55. Finally, for BD their results were 0.65 and 0.49. Variations of the magnitude indicated must be expected for small samples from different settings. Dunsdon and Roberts (170) administered four vocabulary tests including the WISC to 2,000 British children and obtained intercorrelations exceeding 0.8 for both sexes.

Table 3 summarizes reported correlation coefficients between WISC scores and other tests of intelligence, mental maturity, and achievement in school subjects, teacher ratings, and related criteria. For the FS IQ these are generally quite high and positive, considering sample size and variation in sample composition and setting. In view of these variations, the specific coefficients are of less interest than the general trend, which supports the validity of the WISC as a general measure of what Wechsler labels the total effective intelligence of the individual (101, pp. 4 and 5).

For the purposes of a national survey, the robustness of the validity data over wide sample fluctuations is very encouraging, as is revealed by its use on samples of varying geographic and ethnic characteristics, of varying abilities ranging from defective to gifted samples, and by its use with special groups such as retarded readers (133), bilinguals (242), stutterers (198), and low school achievers (190).

FACTORS AFFECTING WISC SCORES

Both qualitative and quantitative variations in WISC scores have been reported by various investigators in relation to a wide range of factors. Those discussed in this section are considered relevant to the objectives and problems of the Survey. Where feasible and appropriate, implications and recommendations are noted.

Anxiety

Hafner, Pollie, and Wapner (132) and Carrier, Orton, and Malpass (205) have both reported negative correlations between the WISC FS and the Children's Manifest Anxiety Scale (CMAS), indicating that anxiety, as measured by this scale, tends to interfere with effective WISC performance. Hafner and others found a significant correlation of -0.31 between CMAS and BD. The Carrier study observed the relationship (-0.54) over a range of ability but not among the exceptionally bright. It appears to be most marked in the subnormal; Feldhusen and Klausmeier (167) found the following mean differences in CMAS scores for three groups at different IQ levels: low IQ, 20.2; average, 14.8; and high, 12. These results are not entirely consistent with those of Burns (206), however, who found similar correlations between WISC Vocabulary and California Personality Test measures of Social Adjustment (0.55) and Personal Adjustment (0.45) but obtained nonsignificant coefficients of 0.12 and 0.10, respectively, for Block Design.

Although anxiety and adjustment may be regarded generally as factors that tend to depress WISC (Voc. and BD) scores for some segments of the child population on some occasions, it would seem unwise to attempt any correction for these factors. Presumably, some valid evidence on adjustment will become available from the Thematic Apperception Test (TAT), the School Information Form, and the extensive background and medical information being collected in the Health Examination Survey. However, the relationships are not clearly enough defined for fine quantitative manipulation. One alternative is to regard fluctuations on these variables as a source of error which may possibly be crudely estimated later but is probably well randomized in the total sample. Another is to accept the error pragmatically with the attitude that depressed scores resulting from affective factors probably reflect depressed ability of the individual to function effectively.

Sex Differences

The statement by McCandless (103), cited earlier, that boys do better on the WISC than girls, is not supported by the present review. Data on sex differences are presented in nine studies (130, 146, 154, 169, 175, 192, 194, 196, and 232), and only one (130) reports a significant mean difference favoring boys on FS IQ. However, none of them employed a sampling design encouraging confidence in the group comparisons.

Some correlational differences mentioned by several authors do appear interesting: The correlation of WISC Full Scale IQ with Bender-Gestalt was negative and higher for boys (-0.34, p<0.01) than for girls (-0.09, ns) (130). The correlation of WISC Full Scale IQ with the Ammons Picture Vocabulary Test was 0.71 for boys and 0.45 for girls (169). The correlations of WISC FS and VS IQs with the spelling subtest of the Iowa Test of Basic Skills were higher for boys than for girls. No data were reported in which sex differences favored girls. The absence of sex differences in studies of normal American (146) and English (154) children, deaf American (194) and English (196) children, and retarded American children (232) suggests considerable generality for the negative conclusion.

Qualitative Differences by Level

Gallagher and Lucito (164) found a negative rank order between the mean scores of gifted and retarded children on the WISC. The three highest and three lowest subtests for five comparison groups in their study are shown below. These results agree with others, to be discussed below, which indicate that Block Design scores are least affected by population variations, in contrast with Vocabulary, which is the highest test of the gifted groups and the lowest of the retarded.

Group   Classification   Number of subjects (N)   Three highest subtests                             Three lowest subtests
1       Gifted           50                       Similarities, Information, Vocabulary              Picture Completion, Picture Arrangement, Digit Span
2       Gifted           43                       Vocabulary, Information, Similarities              Picture Completion, Picture Arrangement, Digit Span
3       Average          565                      Arithmetic, Digit Symbol, Picture Arrangement      Block Design, Information, Similarities
4       Retarded         150                      Object Assembly, Picture Completion, Digit Span    Information, Vocabulary, Arithmetic
5       Retarded         52                       Object Assembly, Digit Span, Picture Completion    Vocabulary, Information, Picture Arrangement

Baroff (223) described a WISC profile for a sample of 53 low-IQ patients with a mean FS IQ of 63; Block Design was highest, and Vocabulary ranked 11 out of 12. Although Fisher (225) failed to verify the Baroff patterning, Baroff's results are in agreement with those of Gallagher and Lucito with respect to Vocabulary. Matthews (230) found that nonachievers in school tend to be higher on Block Design than on Vocabulary. Levinson (243 and 244), working with Jewish children in New York, and Altus (240), with Mexican and Anglo-American children in California, both found that monolinguals exceeded bilinguals on Vocabulary, but that the differences on Block Design
were not significant. Burks and Bruce (186) found that poor readers score significantly high on Block Design, and Kallos, Grabow, and Guarino (180) obtained a significant difference between Block Design and Vocabulary, favoring Block Design, for a sample of poor readers.

Results such as these suggest the possibility of investigating a Voc.-BD ratio which may prove to have some diagnostic use, in conjunction with the Goodenough Draw-A-Man Test, the Wide Range Achievement Test (WRAT), the Thematic Apperception Test, and school information, in evaluating various categories of subnormal and deviant performance such as those enumerated above.

On the Vocabulary subtest, Stacey and Portnoy (168) also observed qualitative differences between a borderline group (IQ range 66-79) and a defective group (IQ range 50-65) in conceptual approaches to word definition. Defectives exceeded borderlines significantly in the use of functional definitions, while the borderlines were significantly higher in use of descriptive definitions. Neither group used abstract concepts to more than a slight degree.

Carleton and Stacey (219) made an item analysis of the Vocabulary and Block Design subtests with a sample of 366 low-IQ children (mean FS IQ 67) and found four Voc. items and two BD items displaced. In view of the greater dependence on these two subtests in a short form than is usually required with the full test, consideration might well be given by the Survey staff to a repetition of this study for a substantial sample.

Maxwell (211) observed that the WISC variances for a sample of neurotic children were greater than for a normal sample, which led him to criticize the transformations of raw scores to scaled scores. This point was also made by Wilson (139), whose work was with Indian children. Walker (209), in a highly creative study, enumerated a lengthy list of qualitative variations of WISC responses that appear to have promise for personality diagnosis. Walker's study merits further followup.

Developmental Factors

Klausmeier and Check (166) investigated a number of developmental correlates of the WISC. They reported that children with high intelligence quotients grow taller than those in the average or low range, but that weight is not significantly related to sex or IQ. On strength of grip, they found low-IQ children weaker than those with average or high IQs, the average group weaker than the high-IQ group, and girls weaker than boys. Girls were found to have more permanent teeth and a higher carpal age than boys of the same age. No sex differences or IQ differences were found in relation to emotional adjustment. Girls also exceeded boys on achievement in relation to capacity, integration of self concept, and estimation of own ability. These observations are of interest in suggesting cross-disciplinary analysis of psychological and biomedical data.

SPECIAL GROUPS

The following discussion includes research on the WISC with reference to a number of special groups (those involving various disabilities, afflictions, deviations, social and ethnic characteristics, and other definitive attributes commonly recognized in the literature) for which at least some information has been found. Each of these groups involves some variables which affect WISC scores, and this review might properly have been included in the preceding section. However, most of the research referred to here was organized in terms of samples of persons in various categories rather than by underlying variables. As a result, the organization of the discussion follows the organization of the material reviewed.

Reading Disability

As noted earlier, Kallos and others (180) found that Block Design scores were significantly higher than Vocabulary scores for a reading disability sample of 37 boys aged 9 to 14 years whose IQs ranged from 90 to 109. The elevation of BD was supported by Burks and Bruce (186). Altus (181), Sheldon and Garton (182), and Karlsen (185) published WISC profiles for retarded readers, based on small but similar groups. No consistent pattern is unequivocally shown. Robeck (183) used a more sophisticated method to study subtest patterning of problem readers on the WISC, representing subtest scores as deviations of scaled
scores from the respective age-group means. By this method problem readers were significantly higher than the norms on both Block Design and Vocabulary (as well as on Comprehension, Similarities, and Picture Arrangement) and lower on Digit Span, Arithmetic, Information, and Coding. Rogge (187) reported no significant differences on WISC VS, PS, or FS IQs between a sample of 132 delinquents 14 to 16 years of age and a control sample of good readers.

Correlations of WISC scales with reading tests are generally moderate, in the range of 0.3 to 0.5 (171, 172, and 173). On the other hand, approaches involving score patterns or profiles, such as discussed above, and qualitative analyses of responses, exemplified by the analyses of the understanding of the concept of opposite, by Robinowitz (108) and by Flamand (172), appear to offer greater promise than linear regression methods for the evaluation of reading disability cases. The latter approach does not appear feasible with only Voc. and BD in the battery, but the pattern approach, as discussed above, merits consideration. In the Survey battery the WRAT is, of course, most directly related to estimation of reading disability, but a Voc.-BD ratio may be a useful supplement.

Auditory Disability

Murphy (196) administered the WISC to an equally divided sample of 300 deaf boys and girls in English schools for the deaf. Deaf children did not differ significantly from normal children on the Performance Scale in this study, and there was no meaningful relation between hearing loss and PS. It is of interest, though, that Block Design correlated 0.71 with PS in this sample. In addition, teacher ratings of emotional adjustment correlated 0.76 with PS, suggesting that here also, as in the samples evaluated in relation to the Children's Manifest Anxiety Scale, anxiety may be a deterrent to effective performance.

Graham and Shapiro (195) compared the performance of the deaf and normal children on the WISC with standard and pantomime instructions. Both groups did equally well on PS with pantomime instructions, but the normals were superior with standard instructions. Mean scores on BD were approximately equal under all three conditions. For deaf children, then, the pantomime instructions are appropriate on BD.

Glowatsky (194) found that WISC Performance Scale IQs were comparable with Draw-A-Man Test IQs for a sample of 24 deaf and hard-of-hearing children in Santa Fe. PS scores were substantially higher than VS scores in this group, but bilingualism (noted in 13 cases) was not a factor.

Thompson gave Wepman's Auditory Discrimination Test, the WISC, and other tests of reading and auditory acuity to 105 children, including good and poor readers. She found that a significant and substantial proportion of first graders (71 percent) had inadequate auditory discrimination, but that this number was reduced to 24 percent by the second grade. Auditory Discrimination scores correlated more highly with reading (0.59 to 0.66) than with WISC IQs (0.55 to 0.58). The correlation of Auditory Discrimination with WISC Verbal Scale IQ, the highest correlation reported, was 0.61.

Where hearing disability is noted by audiometer test it would be advantageous to estimate intelligence level by a combination of Draw-A-Man and Block Design scores.

Visually Handicapped

According to a study by Scholl (197), the Block Design test may be administered with normal procedures to the partially blind. For the totally blind only the Vocabulary test would be appropriate in the Survey, and no data are available to evaluate their scores adequately.

Stutterers

Post (198) found no significant differences between the mean scores of 30 stutterers and 30 controls, predominantly boys in the age range of 5-5 to 15-10, on the Stanford-Binet (S-B) and the WISC. The correlation of WISC Full Scale IQ with the S-B was 0.78 for the stutterers. The only difference found between the two groups was in the correlation of WISC Verbal Scale and Performance Scale IQs, which was 0.26 for the stutterers and 0.60 (the same as in Wechsler's standardization sample) for the controls. Both group means were higher on PS than VS.
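
The contrast between the two Verbal-Performance correlations reported by Post (0.26 for the stutterers and 0.60 for the controls) can be put in perspective with Fisher's z transformation, a standard way to compare correlations from two independent samples. The sketch below is illustrative only: it is not a computation reported by Post or by this review, it assumes that each coefficient rests on all 30 children in its group, and the function name is hypothetical.

```python
import math

def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher z test for the difference between two independent correlations.

    Returns the z statistic and the two-sided normal-theory p value.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher transformation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # standard error of z1 - z2
    z = (z1 - z2) / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided

# Coefficients are from the text; the group sizes of 30 are an assumption.
z, p = compare_independent_correlations(0.26, 30, 0.60, 30)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

With groups of this size the standard error of the difference is relatively large, which is in keeping with the earlier remark that sizable variations among coefficients must be expected for small samples.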

Cerebral Palsy

Bortner and Birch (199) studied the administration of the Block Design subtests with twenty-eight 13-year-old cerebral palsied children. They found, as may be expected, that the ability to discriminate block designs in a choice situation may be intact even though motor factors impair reproductive ability.

Organic Impairment of Central Nervous System

Beck and Lam (200) found that WISC Full Scale IQs of diagnosed organics were lower than those of nonorganics, but failed, as others have, to verify Wechsler's subtest diagnostic pattern for organics. Young and Pitts (202) compared the WISC scores of 40 rural juvenile congenital syphilitics (aged 6 to 16 years) with 40 normal controls matched on age, sex, race, region, and father's occupation. The controls were significantly superior on IQs and on Vocabulary, but not on Block Design, where the critical ratio was marginal.

Gifted

In Edmonton, Chalmers (213) administered the WISC to 57 superior children with IQs above 120 (mean FS IQ 128) and found that 11 obtained perfect scores on one or more tests. However, there were no perfect scores on Vocabulary and only one on Block Design. Nevertheless, Chalmers questioned the adequacy of the WISC ceilings for precise measurement in the very high range. Trauba (214), with a similar sample of 71 gifted Kansas children, found that WISC Vocabulary has a correlation of 0.71 with the McCall-Crabbs Standard Test Lessons in Reading. Lucito and Gallagher (215) obtained a mean WISC Full Scale IQ of 141 for a sample of 50 children whose mean S-B IQ was 161. In this group the boys' scores were slightly higher than those of the girls. In agreement with Gallagher and Lucito (164), mentioned earlier, Similarities, Information, and Vocabulary were the three highest tests for boys and girls. Object Assembly, Coding, and Picture Arrangement were lowest for boys, while Digit Span, Picture Arrangement, and Picture Completion were lowest for girls (only partially in agreement with Gallagher and Lucito).
The adequacy of the WISC for precise measurement of the gifted may be questioned, but it is possible that more accurate measurement may be obtained by use of the present short form of Vocabulary and Block Design than with the Full Scale. This is a problem, however, that will require further attention.

Mentally Retarded and Defective

The research on the use of the WISC with retarded and defective groups is very favorable, in contrast with research on its use for the gifted. This is indicated by virtually all the studies reviewed: (a) reliabilities reported: Throne and others (227) obtained retest reliabilities over 3 to 4 months of 0.79 for Vocabulary and 0.82 for Block Design on a sample of 39 retarded boys aged 11 to 14 years; (b) correlations of the WISC with other tests: Stanford-Binet (216, 217, 228, and 229), Leiter International Performance Scale (221 and 229), Wechsler Adult Intelligence Scale (222), Columbia Mental Maturity Scale (224), Goodenough Draw-A-Man Test (224), Progressive Matrices (233), Peabody Picture Vocabulary Test (234), and grade placement (238); (c) patterning studies, mentioned earlier; (d) absence of sex differences (232); and (e) amenability to short forms based on Vocabulary and Block Design, as discussed above. (See "Research on Short Forms of the WISC.") Differences between WISC and Stanford-Binet IQs are smaller in this range than in any other. It appears that estimates of retardation in the population should be justified on the basis of a composite score of Voc. and BD, but the desirability of further research to develop a conversion table to the Full Scale should not be minimized.

Bilingual

The effect of bilingualism appears to be in the direction of lowering the Vocabulary scores; no effects have been reported on Block Design. Altus (240) reported such results for Mexicans in California; Kralovich (241), for children of Slavic origin in New Jersey; and Levinson (243 and 244), for Jewish children in New York. Kralovich reported a correlation of 0.61 between the Verbal and Performance scales of the WISC for 28 monolinguals and -0.04 for 28 bilinguals. Where bilingualism is known to exist, verbal tests may be expected to be invalid measures and greater reliance on performance-type tests such as Block Design and Draw-A-Man is indicated.

Negro

The WISC norms do not apply to Negro children, and research by Young and Bright (251), Caldwell (252), Blakemore (253), and Racheile (254), as well as others, does nothing to alter this fact. Negroes score lower than whites, and it is generally accepted that cultural experience and caste factors not only account for the Negro-white differences, but also render comparable measurement by culture-fair or culture-free methods as difficult as other ethnic comparisons. The sampling designs of the studies cited, which used the WISC, were not adequate to qualify them for any detailed comment on differences found.

Socioeconomic Status

Laird (250) compared children of different socioeconomic status (SES) on the WISC and noted, in common with the general trend in the literature, superior performance at upper levels. Estes (247 and 248) found similar differences at grade 2 but not at grade 5. At both grades the WISC Full Scale IQ was more highly correlated with the Metropolitan Achievement Test for the higher SES sample.

COMPARISON OF WISC
AND STANFORD-BINET IQS

Despite the theoretical objections to the mental age concept, discussed earlier, which led to the adoption of the deviation IQ as a distinctive feature of the Wechsler scales and which set them apart from the venerable Stanford-Binet test, the relation of the WISC to the S-B has been a matter of great interest, as evidenced by the number of papers on this topic in the present review.
The Stanford-Binet is indeed one of the giants among psychological tests, a veritable landmark in the history of psychological measurement, and still enjoys extensive school and clinical use, notwithstanding the fact that its popularity has been somewhat reduced by the success of the relatively recent WISC. Although the standardization of the WISC has been impressive and supported by sophisticated conceptualization, many users have been relieved to find that it is highly correlated with the Stanford-Binet. The correlation is in fact so high (accounting for over 80 percent of common variance) that one wonders about the significance of the theorizing which describes them so differently.
The impression of similarity of measurement results given by the correlations does not, however, stand up when mean scores of different groups are compared. As noted earlier, WISC IQs tend to be lower than Stanford-Binet IQs at the lower age levels and among the gifted. These observations are illustrated by data extracted from the following 12 studies in which comparison means were cited: 119, 120, 124, 147, 148, 151, 153, 156, 159, 161, 215, and 216. Their results are epitomized briefly in the table below. Data from Jones' (154) British study of 240 children in the age range 8 to 10 years are also of interest. For this group the WISC means were, on the average, 7.2 IQ points below the S-B, the WISC always being administered first.

Study                         Group                                       N               Mean S-B     Mean WISC    WISC minus S-B

Normal (White) Samples

Schachter and Apgar (147)¹    S-B at mean age 4-1; WISC at mean age 8-3   113 (61m, 62f)  104.3        98.9         -5.4
Triggs and Cartee (148)       Kindergarten, age 5                         48              124.1        [illegible]  [illegible]
Muhr (119)                    5-year group                                21              [illegible]  [illegible]  -9.3
                              6-year group                                21              102.2        [illegible]  [illegible]
Pastovic and Guthrie (120)    5-year group                                50              [illegible]  [illegible]  [illegible]
                              7-year group                                50              115.1        111.5        -3.6
Rottersman (151)              6-year group                                50              110.2        [illegible]  [illegible]
Cohen and Collier (124)       6- to 9-year group, ages 6-5 to 8-9         53              104.8        99.8         -5.0
Wagner (156)                  8- to 9-year group                          50              104.5        103.3        -1.2
Frandsen and Higginson (159)  9-year group                                50              105.8        [illegible]  [illegible]
Kardos (161)                  13- to 14-year group                        100             113.7        109.4        -4.3

Gifted (White) Samples

Beeman (153)                  Full sample                                 36                                        -15
                              IQ over 130                                                                           -20
                              IQ 120-129                                                                            [illegible]
Lucito and Gallagher (215)                                                50              160.8        141.2        -19.6

Retarded Samples

Nale (216)                    9- to 11-year group                         104             55.4         58.0         +2.6

¹Interval between S-B and WISC administration, 50 months.

NOTE: N-number; m-male; f-female.

Allowing for sampling fluctuations and errors of measurement in routine testing, there nevertheless appears to be a common trend in these reports which can be summarized as follows. The differences between WISC and S-B IQs are greatest among the gifted. In the normal range they are high among the very young, dropping off as age increases, but persisting to some degree throughout the age range 5 to 14 years. The data suggest an upturn after age 9, but this is not certain. No significant differences appear for the subnormal. The schematic chart in figure 1 suggests the nature of the age- and level-related difference functions on the basis of the results cited.

[Schematic chart; horizontal axis: age in years.]
Figure 1. Summary of the amount Stanford-Binet Intelligence Test scores differ from Wechsler Intelligence Test scores.

Unfortunately it is possible only to speculate on the nature of the true curves which those in figure 1 are intended to suggest, and speculation on what they would be for a short form composed only of Vocabulary and Block Design is difficult. Some of the data presented earlier for these subtests suggest that the differences might be smaller, but in the absence of empirical evidence this is only an educated guess.
For the purposes of the Survey there are only two alternatives. One is to carry out some ad hoc research on the short form, as suggested earlier, for the purpose of estimating the Full Scale IQ from Voc. and BD, using the results to conform to Wechsler's norms. The other is to regard the full Survey sample as the unprecedented opportunity to carry out a complete new standardization of the short form on a basis that, in sampling sophistication, far exceeds any work of its kind in the history of testing. There are a number of problems related to the second alternative, including the availability of funds for this purpose. However, if this standardization were accomplished, the new norms for Voc. and BD would be superior to those now available, and the computations of FS IQ based on them would permit more accurate population estimates than any others conceivable for the age range included.

SUMMARY AND CONCLUSIONS

This review is based on 154 published studies, reviews, and unpublished theses and dissertations related to the WISC, interpreted in a frame of reference of measurement theory and psychometric principles. The evidence considered strongly supports the judgment of the Survey staff in the selection of the WISC Vocabulary and Block Design subtests as a short form of the WISC for the national survey, but at the same time it raises questions concerning the acceptance of either the scaled scores of these subtests or of prorated Full Scale Intelligence Quotients based on them without further empirical research. It is the reviewer's considered opinion that, given the alternatives presented, the selection was an eminently wise one. The research recommended reflects principally the nature of the unprecedented testing problems and the generally imprecise nature of psychological measurement.
The most important recommended investigations discussed in this section involve the following steps:
1. Restandardization of the Vocabulary and Block Design tests on the full Survey sample. As part of this study, item difficulties should be checked and a formula or set of formulas should be developed for estimating Full Scale IQs from revised Voc. and BD scaled scores (based on samples of normal, gifted, and retarded groups and if possible several ethnic groups, such as Negroes or Mexicans, to whom the Full Scale has been administered). Consideration should be given to estimation of IQs directly from raw scores by age group.
2. Research on correlates of a Voc.-BD ratio, for use with the WRAT and with the Draw-A-Man Test in the identification of poor readers, bilingual, and verbally impaired children and in estimating IQs of culturally deviant ethnic groups.
3. Cross-disciplinary developmental analyses of Vocabulary, Block Design, and derived scores and of item responses with biomedical data obtained in other sections of the Survey. This area is discussed in detail elsewhere. See Klausmeier and Check (166).

BIBLIOGRAPHY

General References to WISC

101. Wechsler, D.: Wechsler Intelligence Scale for Children. New York. Psychological Corp., 1949.
102. Littell, W. M.: The Wechsler Intelligence Scale for Children, review of a decade of research. Psychological Bull. 57:132-156, 1960.
103. McCandless, B. R.: Review of the WISC, in O. K. Buros, ed., Fourth Mental Measurements Yearbook. Highland Park, N.J. The Gryphon Press, 1953. pp. 480-481.
104. Frost, B. P.: An application of the method of extreme deviations to the Wechsler Intelligence Scale for Children. J. Clin. Psychol. 16:420, 1960.
105. Winpenny, N.: An Investigation of the Use and the Validity of Mental Age Scores on the Wechsler Intelligence Scale for Children. Unpublished master's thesis, Pennsylvania State College, 1951.
106. Maxwell, A. E.: Inadequate reporting of normative test data. J. Clin. Psychol. 17:99-101, 1961.
107. Seashore, H. G.: Differences between verbal and performance IQs on the Wechsler Intelligence Scale for Children. J. Consult. Psychol. 15:62-67, 1951.
108. Robinowitz, R.: Learning the relation of opposition as related to scores on the Wechsler Intelligence Scale for Children. J. Genet. Psychol. 88:25-30, 1956.

Factor Analytic Studies

109. Hagen, E. P.: A Factor Analysis of the Wechsler Intelligence Scale for Children. Unpublished doctoral dissertation, Columbia University, 1952.
110. Gault, U.: Factorial patterns on the Wechsler Intelligence Scales. Aust. J. Psychol. 6:85-90, 1954.
111. Cohen, J.: The factorial structure of the WISC at ages 7-6, 10-6, and 13-6. J. Consult. Psychol. 23:285-299, 1959.

Reliability and Stability

112. Hite, L.: Analysis of Reliability and Validity of the Wechsler Intelligence Scale for Children. Unpublished doctoral dissertation, Western Reserve University, 1953.
113. Gehman, I. H., and Matyas, R. P.: Stability of the WISC and Binet tests. J. Consult. Psychol. 20:150-152, 1956.
114. Matyas, R. P.: A Longitudinal Study of the Revised Stanford-Binet and the WISC. Unpublished master's thesis, Pennsylvania State University, 1954.
115. Reger, R.: Repeated measurements with the WISC. Psychol. Rep. 11:418, 1962.
116. Whatley, R. G., and Plant, W. T.: The stability of WISC IQs for selected children. J. Psychol. 44:165-167, 1957.

Validity

117. Mussen, P., Dean, S., and Rosenberg, M.: Some further evidence on the validity of the WISC. J. Consult. Psychol. 16:410-411, 1952.
118. Kureth, G., Muhr, J. P., and Weisgerber, C. A.: Some data on the validity of the Wechsler Intelligence Scale for Children. Child Development 23:281-287, 1952.
119. Muhr, J. P.: Validity of the Wechsler Intelligence Scale for Children at the Five and Six Year Level. Unpublished master's thesis, University of Detroit, 1952.
120. Pastovic, J. J., and Guthrie, G. M.: Some evidence on the validity of the WISC. J. Consult. Psychol. 15:385-386, 1951.
121. Pastovic, J. J.: A Validation Study of the Wechsler Intelligence Scale for Children at the Lower Age Level. Unpublished master's thesis, Pennsylvania State College, 1951.
122. Altus, G. T.: A note on the validity of the Wechsler Intelligence Scale for Children. J. Consult. Psychol. 16:231, 1952.

Relations with Other Tests: Batteries

123. McBrearty, J. F.: Comparison of the WISC With the Arthur Performance Scale, Form I, and Their Relationship to the Progressive Achievement Test. Unpublished master's thesis, Pennsylvania State College, 1951.
124. Cohen, B. D., and Collier, M. J.: A note on WISC and other tests of children six to eight years old. J. Consult. Psychol. 16:226-227, 1952.
125. Estes, B. W., Curtin, M. E., DeBurger, R. A., and Denny, C.: Relationships between 1960 Stanford-Binet, 1937 Stanford-Binet, WISC, Raven, and Draw-A-Man. J. Consult. Psychol. 25:388-391, 1961.
126. Smith, B. S.: The relative merits of certain verbal and non-verbal tests at the second-grade level. J. Clin. Psychol. 17:53-54, 1961.

Relations with Other Tests: Wechsler-Bellevue

127. Bacon, C. S.: A Comparative Study of the Wechsler-Bellevue Intelligence Scale for Adolescents and Adults, Form I, and the Wechsler Intelligence Scale for Children at the Twelve-Year Level. Unpublished master's thesis, University of North Dakota, 1954.
128. Delattre, L., and Cole, D.: A comparison of the WISC and the Wechsler-Bellevue. J. Consult. Psychol. 16:228-230, 1952.

Relations with Other Tests: Bender-Gestalt Perceptual Tests

129. Koppitz, E. M.: Relationships between the Bender-Gestalt Test and the Wechsler Intelligence Scale for Children. J. Clin. Psychol. 14:413-416, 1958.
130. Armstrong, R. G., and Hauck, P. A.: Correlates of the Bender-Gestalt scores in children. J. Psychol. Stud. 11:153-158, 1960.
131. Goodenough, D. R., and Karp, S. A.: Field dependence and intellectual functioning. J. Abnorm. & Social Psychol. 63:241-246, 1961.

Relations with Other Tests: CMAS

132. Hafner, A. J., Pollie, D. M., and Wapner, I.: The relationship between the CMAS and WISC functioning. J. Clin. Psychol. 16:322-323, 1960.

Relations with Other Tests: Ammons Full Range Picture Vocabulary

133. Smith, L. M., and Fillmore, A. R.: The Ammons FRPV Test and the WISC for remedial reading cases; abstracted, J. Consult. Psychol. 18:332, 1954.

Relations with Other Tests: CTMM

134. Altus, G. T.: Relationships between verbal and non-verbal parts of the CTMM and WISC. J. Consult. Psychol. 19:143-144, 1955.

Relations with Other Tests: Kent EGY

135. Delp, H. A.: Correlations between the Kent EGY and the Wechsler batteries. J. Clin. Psychol. 9:73-75, 1953.

Relations with Other Tests: Peabody Picture Vocabulary Test

136. Kimbrell, D. L.: Comparison of Peabody, WISC, and academic achievement scores among educable mental defectives. Psychol. Rep. 7:502, 1960.
137. Himelstein, P., and Herndon, J. D.: Comparison of the WISC and Peabody Picture Vocabulary Test with emotionally disturbed children. J. Clin. Psychol. 18:82, 1962.

Relations with Other Tests: Raven Progressive Matrices

138. Barratt, E. S.: The relationship of the Progressive Matrices (1938) and the Columbia Mental Maturity Scale to the WISC. J. Consult. Psychol. 20:294-296, 1956.
139. Wilson, L.: A Comparison of the Raven Progressive Matrices (1947) and the Performance Scale of the Wechsler Intelligence Scale for Children for Assessing the Intelligence of Indian Children. Unpublished master's thesis, University of British Columbia, 1952.
140. Malpass, L. F., Brown, R., and Hade, D.: The utility of the Progressive Matrices (1956 edition) with normal and retarded children. J. Clin. Psychol. 16:350, 1960.
141. Stacey, C. L., and Carleton, F. O.: The relationship between Raven's Colored Progressive Matrices and two tests of general intelligence. J. Clin. Psychol. 11:84-85, 1955.
142. Martin, A. W., and Wiechers, J. E.: Raven's Colored Progressive Matrices and the Wechsler Intelligence Scale for Children. J. Consult. Psychol. 18:143-144, 1954.

Relations with Other Tests: SRA-PMA

143. Stempel, E. F.: The WISC and the SRA Primary Mental Abilities Test. Child Development 24:257-261, 1953.

Relations with Other Tests: Stanford-Binet

144. Krugman, J. I., Justman, J., Wrightstone, J. W., and Krugman, M.: Pupil functioning on the Stanford-Binet and the Wechsler Intelligence Scale for Children. J. Consult. Psychol. 15:475-483, 1951.
145. Harlow, J. E., Jr., Price, A. C., Tatham, L. J., and Davidson, J. F.: Preliminary study of comparison between Wechsler Intelligence Scale for Children and Form L of the Revised Stanford Binet Scale at three age levels. J. Clin. Psychol. 13:72-73, 1957.
146. Boruszak, R. J.: A Comparative Study to Determine the Correlation Between the IQs of the Revised Stanford Binet Scale, Form L, and the IQs of the Wechsler Intelligence Scale for Children. Unpublished master's thesis, Wisconsin State College, 1954.
147. Schachter, F. F., and Apgar, V.: Comparison of preschool Stanford-Binet and school-age WISC IQs. J. Educ. Psychol. 49:320-323, 1958.
148. Triggs, F. O., and Cartee, J. K.: Pre-school pupil performance on the Stanford-Binet and the Wechsler Intelligence Scale for Children. J. Clin. Psychol. 9:27-29, 1953.
149. Holland, G. A.: A comparison of the WISC and Stanford-Binet IQs of normal children. J. Consult. Psychol. 17:147-152, 1953.
150. Weider, A., Noller, P. A., and Schraumm, T. A.: The Wechsler Intelligence Scale for Children and the Revised Stanford-Binet. J. Consult. Psychol. 15:330-333, 1951.
151. Rottersman, L.: A Comparison of the IQ Scores on the New Revised Stanford Binet, Form L, the Wechsler Intelligence Scale for Children, and the Goodenough Draw A Man Test at the Six Year Age Level. Unpublished master's thesis, University of Nebraska, 1950.
152. Tatham, L. J.: Statistical Comparison of the Revised Stanford-Binet Intelligence Test Form L With the Wechsler Intelligence Scale for Children Using the Six and One-Half Year Level. Unpublished master's thesis, University of Florida, 1952.
153. Beeman, G.: A comparative study of the WISC and Stanford-Binet with a group of more able and gifted 7-11 year old students. Calif. J. Educ. Res. 11:77, 1960.
154. Jones, S.: The Wechsler Intelligence Scale for Children applied to a sample of London primary school children. Br. J. Educ. Psychol. 32(2):119-133, 1962.
155. Scott, G. R.: A Comparison Between the Wechsler Intelligence Scale for Children and the Revised Stanford-Binet Scales. Unpublished master's thesis, Southern Methodist University, 1950.
156. Wagner, W. K.: A Comparison of Stanford-Binet Mental Ages and Scaled Scores on the Wechsler Intelligence Scale for Children for Fifty Bowling Green Pupils. Unpublished master's thesis, Bowling Green State University, 1951.
157. Stanley, J. C.: Statistical analysis of scores from counterbalanced tests. J. Exp. Educ. 23:187-207, 1955.
158. Arnold, F. C., and Wagner, W. K.: A comparison of Wechsler Children's Scale and Stanford-Binet scores for eight- and nine-year-olds. J. Exp. Educ. 24:91-94, 1955.
159. Frandsen, A. N., and Higginson, J. B.: The Stanford-Binet and the Wechsler Intelligence Scale for Children. J. Consult. Psychol. 15:236-238, 1951.

160. Clarke, F. R.: A Comparative Study of the Wechsler Intelligence Scale for Children and the Revised Stanford Binet Intelligence Scale, Form L, in Relation to the Scholastic Achievement of a 5th Grade Population. Unpublished master's thesis, Pennsylvania State College, 1950.
161. Kardos, M. S.: A Comparative Study of the Performance of Twelve-Year-Old Children on the WISC and the Revised Stanford-Binet, Form L, and the Relationship of Both to the California Achievement Tests. Unpublished master's thesis, Marywood College, 1954.
162. Davidson, J. F.: A Preliminary Study in Statistical Comparison of the Revised Stanford-Binet Intelligence Test Form L With the Wechsler Intelligence Scale for Children Using the Fourteen Year Level. Unpublished master's thesis, University of Florida, 1954.

Relations with Other Tests: Wartegg Drawing Completion Test

163. Stark, R.: A Comparison of Intelligence Test Scores on the Wechsler Intelligence Scale for Children and the Wartegg Drawing Completion Test with School Achievement of Elementary School Children. Unpublished master's thesis, University of Detroit, 1954.

WISC: Response Patterns of Gifted, Average, and Retarded

164. Gallagher, J. J., and Lucito, L. L.: Intellectual patterns of gifted compared with average, and retarded. Except. Children 27:479-482, 1961.
165. Klausmeier, H. J., and Feldhusen, J. F.: Retention in arithmetic among children of low, average, and high intelligence at 117 months of age. J. Educ. Psychol. 50:88-92, 1959.
166. Klausmeier, H. J., and Check, J.: Relationships among physical, mental, achievement, and personality measures in children of low, average, and high intelligence at 113 months of age. Am. J. Ment. Deficiency 63:1059-1068, 1959.
167. Feldhusen, J. F., and Klausmeier, H. J.: Anxiety, intelligence, and achievement in children of low, average, and high intelligence. Child Development 33:403-409, 1962.

WISC: Vocabulary, Language Skills, Reading

168. Stacey, C. L., and Portnoy, B.: A study of the differential responses on the vocabulary subtest of the Wechsler Intelligence Scale for Children. J. Clin. Psychol. 6:401-403, 1950.
169. Winitz, H.: A Comparative Study of Certain Language Skills in Male and Female Kindergarten Children. Unpublished doctoral dissertation, State University of Iowa, 1959.
170. Dunsdon, M. I., and Roberts, J. A. F.: A study of the performance of 2,000 children on four vocabulary tests. Br. J. Statist. Psychol. 8:3-15, 1955.
171. Reidy, M. E.: A Validity Study of the Wechsler-Bellevue Intelligence Scale for Children and Its Relationship to Reading and Arithmetic. Unpublished master's thesis, Catholic University of America, 1952.
172. Flamand, R. K.: The Relationship Between Various Measures of Vocabulary and Performance in Beginning Reading. Unpublished doctoral dissertation, Temple University, 1961.
173. Triggs, F. O., Cartee, J. K., Binks, V., Foster, D., and Adams, N. A.: The relationship between specific reading skills and general ability at the elementary and junior-senior high school levels. Educ. Psychol. Measur. 14:176-185, 1954.
174. Fitzgerald, L. A.: Some Effects of Reading Ability on Group Intelligence Test Scores in the Intermediate Grades. Unpublished doctoral dissertation, State University of Iowa, 1960; abstracted, Diss. Abstr. 21:1844, 1961.

WISC: Short Forms

175. Armstrong, R. G.: A reliability study of a short form of the WISC vocabulary subtest. J. Clin. Psychol. 11:413-414, 1955.
176. Throne, J. M.: A Short Form of the Wechsler-Bellevue Intelligence Test for Children. Unpublished master's thesis, University of Florida, 1951.
177. Simpson, W. H., and Bridges, C. C., Jr.: A short form of the Wechsler Intelligence Scale for Children. J. Clin. Psychol. 15:424, 1959.
178. Carleton, F. O., and Stacey, C. L.: Evaluation of selected short forms of the Wechsler Intelligence Scale for Children. J. Clin. Psychol. 10:258-261, 1954.
179. Yalowitz, J. M., and Armstrong, R. G.: Validity of short forms of the Wechsler Intelligence Scale for Children (WISC). J. Clin. Psychol. 11:275-277, 1955.

WISC: Reading Disability

180. Kallos, G. L., Grabow, J. M., and Guarino, E. A.: The WISC profile of disabled readers. Personnel Guid. J. 39:476-478, 1961.
181. Altus, G. T.: A WISC profile for retarded readers. J. Consult. Psychol. 20:155-156, 1956.
182. Sheldon, M. S., and Garton, J.: A note on a WISC profile for retarded readers. Alberta J. Educ. Res. 5:264-267, 1959.
183. Robeck, M. C.: Subtest patterning of problem readers on WISC. Calif. J. Educ. Res. 11:110-115, 1960.
184. Abrams, J. C.: A Study of Certain Personality Characteristics of Non-Readers and Achieving Readers. Unpublished doctoral dissertation, Temple University, 1955.
185. Karlsen, B.: A Comparison of Some Educational and Psychological Characteristics of Successful and Unsuccessful Readers at the Elementary School Level. Unpublished doctoral dissertation, University of Minnesota, 1954.
186. Burks, H. F., and Bruce, P.: The characteristics of poor and good readers as disclosed by the Wechsler Intelligence Scale for Children. J. Educ. Psychol. 46:488-493, 1955.
187. Ro~e, H. J.: A Study of the Relationships of Reading Achievement to Certain Other Factors in a Population of Delinquent Boys. Unpublished doctoral dissertation, University of Minnesota, 1959.

WISC: School Achievement

188. Orr, K. N.: The Wechsler Intelligence Scale for Children as a Predictor of School Success. Unpublished master's thesis, Indiana State Teachers College, 1950.
189. Schwitzgoebel, R. R.: The Predictive Value of Some Relationships Between the Wechsler Intelligence Scale for Children and Academic Achievement in Fifth Grade. Unpublished doctoral dissertation, University of Wisconsin, 1952.
190. Barratt, E. S., and Baumgarten, D. L.: The relationship of the WISC and Stanford-Binet to school achievement. J. Consult. Psychol. 21:144, 1957.
191. Raleigh, W. H.: A Study of the Relationships of Academic Achievement in Sixth Grade With the Wechsler Intelligence Scale for Children and Other Variables. Unpublished doctoral dissertation, Indiana University, 1952.
192. Stroud, J. B., Blommers, P., and Lauber, M.: Correlation of WISC and achievement tests. J. Educ. Psychol. 48:18-26, 1957.

WISC: Auditory Disability, Visual Handicap, Stuttering, Cerebral Palsy, Brain Damage

193. Thompson, B. B.: The Relation of Auditory Discrimination and Intelligence Test Scores to Success in Primary Reading. Unpublished doctoral dissertation, Indiana University, 1961.
194. Glowatsky, E.: The verbal element in the intelligence scores of congenitally deaf and hard of hearing children. Amer. Ann. Deaf 98:328-335, 1953.
195. Graham, E. E., and Shapiro, E.: Use of the Performance Scale of the Wechsler Intelligence Scale for Children with the deaf child. J. Consult. Psychol. 17:396-398, 1953.
196. Murphy, L. J.: Tests of abilities and attainments, pupils in schools for the deaf aged six to ten, in A. W. G. Ewing, ed., Educational Guidance and the Deaf Child. Manchester, England. Manchester University Press, 1957. pp. 213-251.
197. Scholl, G.: Intelligence tests for visually handicapped children. Except. Children 20:116-120, 1953.
198. Post, D. P.: A Comparative Study of the Revised Stanford Binet and the Wechsler Intelligence Scale for Children Administered to a Group of Thirty Stutterers. Unpublished master's thesis, University of Southern California, 1952.
199. Bortner, M., and Birch, H. G.: Perceptual and perceptual-motor dissociation in cerebral palsied children. J. Nerv. Ment. Dis. 134:103-108, 1962.
200. Beck, H. S., and Lam, R. L.: Use of the WISC in predicting organicity. J. Clin. Psychol. 11:154-157, 1955.
201. Kilman, B. A., and Fisher, G. M.: An evaluation of the Finley-Thompson abbreviated form of the WISC for undifferentiated, brain damaged and functional retardates. Am. J. Ment. Deficiency 64:742-746, 1960.
202. Young, F. M., and Pitts, V. A.: The performance of congenital syphilitics on the Wechsler Intelligence Scale for Children. J. Consult. Psychol. 15:239-249, 1951.
203. Rowley, V. N.: Analysis of the WISC performance of brain damaged and emotionally disturbed children. J. Consult. Psychol. 25:553, 1961.

WISC: Personality Measures (Normal), Discipline, Delinquency

204. Gourevitch, V., and Feffer, M. H.: A study of motivational development. J. Genet. Psychol. 100:361-375, 1962.
205. Carrier, N. A., Orton, K. D., and Malpass, L. F.: Responses of bright, normal, and EMH children to an orally-administered manifest anxiety scale. J. Educ. Psychol. 53:271-274, 1962.
206. Burns, L.: A Correlation of Scores on the Wechsler Intelligence Scale for Children and the California Test of Personality Obtained by a Group of 5th Graders. Unpublished master's thesis, Pennsylvania State College, 1954.
207. Kent, N., and Davis, D. R.: Discipline in the home and intellectual development. Brit. J. Med. Psychol. 30:27-33, 1957.
208. Wall, H. R.: A Differential Analysis of Some Intellective and Affective Characteristics of Peer Accepted and Rejected Pre-Adolescent Children. Unpublished doctoral dissertation, University of Kansas, 1960.
209. Walker, H. A.: The Wechsler Intelligence Scale for Children as a Diagnostic Device. Unpublished master's thesis, Utah State Agricultural College, 1956.
210. Schonborn, R.: A comparative study of the differences between adolescent and child male enuretics and non-enuretics as shown by an intelligence test. Psychol. Newsletter 6:1-9, 1954.
211. Maxwell, A. E.: Discrepancies in the variances of test results for normal and neurotic children. Br. J. Statist. Psychol. 13:165-172, 1960.
212. Richardson, H. M., and Surko, E. F.: WISC scores and status in reading and arithmetic of delinquent children. J. Genet. Psychol. 89:251-262, 1956.

WISC: Gifted

213. Chalmers, J. M.: An Analysis of Results Obtained on the Wechsler Intelligence Scale for Children by Mentally Superior Subjects. Unpublished master's thesis, University of Alberta, 1953.
214. Trauba, R. G.: A Study of the Aspects of Differentiation of Abilities in Interpretation of Reading With a Group of Gifted Children. Unpublished doctoral dissertation, University of Kansas, 1959.
215. Lucito, L., and Gallagher, J.: Intellectual patterns of highly gifted children on the WISC. Peabody J. Educ. 38:131-136, 1960.

WISC: Mental Defectives

216. Nale, S.: The Childrens-Wechsler and the Binet on 104 mental defectives at the Polk State School. Am. J. Ment. Deficiency 56:419-423, 1951.
217. Sloan, W., and Schneider, B.: A study of the Wechsler Intelligence Scale for Children with mental defectives. Am. J. Ment. Deficiency 55:573-575, 1951.

218. Atchison, C. O.: Use of the Wechsler Intelligence Scale for Children with eighty mentally defective Negro children. Am.J.Ment.Deficiency 60:378-379, 1955.

219. Carleton, F. O., and Stacey, C. L.: An item analysis of the Wechsler Intelligence Scale for Children. J.Clin.Psychol. 11:149-154, 1955.

220. Newman, J. R., and Loos, F. M.: Differences between verbal and performance IQ's with mentally defective children on the Wechsler Intelligence Scale for Children. J.Consult.Psychol. 19:16, 1955.

221. Alper, A. E.: A comparison of the WISC and the Arthur adaptation of the Leiter International Performance Scale with mental defectives. Am.J.Ment.Deficiency 63:312-316, 1958.

222. Fleming, J. W.: The Relationships Among Psychometric, Experimental, and Observational Measures of Learning Ability in Institutionalized Endogenous Mentally Retarded Persons. Unpublished doctoral dissertation, University of Colorado, 1959.

223. Baroff, G. S.: WISC patterning in endogenous mental deficiency. Am.J.Ment.Deficiency 64:482-485, 1959.

224. Warren, S. A., and Collier, H. L.: Suitability of the Columbia Mental Maturity Scale for mentally retarded institutionalized females. Am.J.Ment.Deficiency 64:916-920, 1960.

225. Fisher, G. M.: A cross-validation of Baroff's WISC patterning in endogenous mental deficiency. Am.J.Ment.Deficiency 65:349-350, 1960.

226. Baumeister, A., and Bartlett, C. J.: Further factorial investigations of WISC performance of mental defectives. Am.J.Ment.Deficiency 67:257-261, 1962.

227. Throne, F. M., Schulman, J. L., and Kasper, J. C.: Reliability and stability of the Wechsler Intelligence Scale for Children for a group of mentally retarded boys. Am.J.Ment.Deficiency 67:455-457, 1962.

WISC: Mentally Retarded

228. Stacey, C. L., and Levin, J.: Correlation analysis of scores of subnormal subjects on the Stanford-Binet and Wechsler Intelligence Scale for Children. Am.J.Ment.Deficiency 55:590-597, 1951.

229. Sharp, H. C.: A comparison of slow learners' scores on three individual intelligence scales. J.Clin.Psychol. 13:372-374, 1957.

230. Matthews, C. G.: Differential Performances of Non-Achieving Children on the Wechsler Intelligence Scale. Unpublished doctoral dissertation, Purdue University, 1958.

231. Finley, C. J., and Thompson, J.: An abbreviated Wechsler Intelligence Scale for Children for use with educable mentally retarded. Am.J.Ment.Deficiency 63:473-480, 1958.

232. Finley, C., and Thompson, J.: Sex differences in intelligence of educable mentally retarded children. Calif.J.Educ.Res. 10:167-170, 1959.

233. Brown, R., Hakes, D., and Malpass, L.: The utility of the Progressive Matrices Test (1956 revision); abstracted, Am.Psychologist 14:341, 1959.

234. Dunn, L. M., and Brooks, S. T.: Peabody Picture Vocabulary Test performance of educable mentally retarded children. Train.Sch.Bull. 57:35-40, 1960.

235. Schwartz, L., and Levitt, E.: Short forms of the Wechsler Intelligence Scale for Children in the educable, non-institutionalized mentally retarded. J.Educ.Psychol. 51:187-190, 1960.

236. Salvati, S. R.: A Comparison of WISC IQ's and Altitude Scores as Predictors of Learning Ability of Mentally Retarded Subjects. Unpublished doctoral dissertation, New York University, 1960; abstracted, Diss.Abstr. 21:2370, 1961.

237. Baumeister, A. A.: The Dimensions of Abilities in Retardates as Measured by the Wechsler Intelligence Scale for Children. Unpublished doctoral dissertation, George Peabody College for Teachers, 1961.

238. Thompson, J. M., and Finley, C. J.: The validation of an abbreviated Wechsler Intelligence Scale for Children for use with the educable mentally retarded. Educ.Psychol.Measur. 22:539-542, 1962.

239. Osborne, R. T., and Allen, J.: Validity of short forms of the WISC for mental retardates. Psychol.Rep. 11:167-170, 1962.

WISC: Bilingualism

240. Altus, G. T.: WISC patterns of a selective sample of bilingual school children. J.Genet.Psychol. 83:241-248, 1953.

241. Kralovich, A. M.: The Effect of Bilingualism on Intelligence Test Scores as Measured by the Wechsler Intelligence Scale for Children. Unpublished master's thesis, Fordham University, 1954.

242. Cooper, J. G.: Predicting school achievement for bilingual pupils. J.Educ.Psychol. 49:31-36, 1958.

243. Levinson, B. M.: A comparison of the performance of bilingual and monolingual native born Jewish preschool children of traditional parentage on four intelligence tests. J.Clin.Psychol. 15:74-76, 1959.

244. Levinson, B. M.: A comparative study of the verbal and performance ability of monolingual and bilingual native born Jewish preschool children of traditional parentage. J.Genet.Psychol. 97:93-112, 1960.

WISC: Cultural Variations

245. Levinson, B. M.: Traditional Jewish cultural values and performance on the Wechsler tests. J.Educ.Psychol. 50:177-181, 1959.

246. Levinson, B. M.: Subcultural variations in verbal and performance ability at the elementary school level. J.Genet.Psychol. 97:149-160, 1960.

WISC: Socioeconomic Status

247. Estes, B. W.: Influence of socioeconomic status on Wechsler Intelligence Scale for Children, an exploratory study. J.Consult.Psychol. 17:58-62, 1953.

248. Estes, B. W.: Influence of socioeconomic status on Wechsler Intelligence Scale for Children, addendum. J.Consult.Psychol. 19:225-226, 1955.
249. Roy, I., and Cohen, N.: Some psychometric variables relative to change in sociometric status; abstracted, Am.Psychologist 10:328, 1955.

250. Laird, D. S.: The performance of two groups of eleven-year-old boys on the Wechsler Intelligence Scale for Children. J.Educ.Res. 51:101-107, 1957.

WISC: Negro Samples, Negro-White Comparisons

251. Young, F. M., and Bright, H. H.: Results of testing 81 Negro rural juveniles with the Wechsler Intelligence Scale for Children. J.Soc.Psychol. 39:219-226, 1954.

252. Caldwell, M. B.: An Analysis of Responses of a Southern Urban Negro Population to Items on the Wechsler Intelligence Scale for Children. Unpublished doctoral dissertation, Pennsylvania State University, 1954.

253. Blakemore, J. R.: A Comparison of Scores of Negro and White Children on the Wechsler Intelligence Scale for Children. Unpublished master's thesis, College of the Pacific, 1952.

254. Racheile, L. D.: A Comparative Analysis of Ten Year Old Negro and White Performance on the Wechsler Intelligence Scale for Children. Unpublished doctoral dissertation, University of Denver, 1953.

II. THE WIDE RANGE ACHIEVEMENT TEST,
THE ORAL READING AND ARITHMETIC SUBTESTS

The requirement of the Survey for an individually administered, brief, well-standardized, reliable, valid, and flexible school achievement test was filled by the selection of the Reading and Arithmetic subtests of the 1963 revision of the Wide Range Achievement Test. The 1963 WRAT, by J. F. Jastak, replaces the original 1946 edition by Jastak and S. W. Bijou and appears to be quite similar to the original in design and item content, except that the new edition is divided, for the convenience of users, into two levels (Level I covers ages 5 to 12 years; Level II, 12 years through adulthood), in contrast with the broad sweep of the original, from kindergarten through adulthood.

The principal difference between the two editions appears to be in the method of standardization. The 1946 norms were computed to conform to those of the New Stanford Achievement Test (Reading, to New Stanford Word and Paragraph Reading, and Arithmetic Computation, to New Stanford Arithmetic Computation), whereas the 1963 norms, in each age bracket, depend on "probability samplings based on IQ's . . . that would correspond to the achievement of mentally average groups with representative dispersions of scores above and below the mean" (301).

The purpose of this section is both to review the literature on the WRAT and to evaluate it in relation to its suitability for the objectives of the Survey. Unfortunately this must be done almost entirely on the basis of the tests, manuals, and research available on the 1946 edition, which is itself extremely limited. Appropriate data for critical evaluation of the 1963 edition are almost totally lacking. Although released for sale in 1963, the test manual for this edition was still incomplete in June 1964 (301), and no independent data on validity have been found.

EVALUATIVE CRITERIA

Measurement experts believe that in addition to the standard questions concerning such issues as reliability, validity, representativeness of standardization sample, and agreement of norms with criterion levels, some problems are inherent in the wide-range type of design. These are stated forthrightly by Chauncey and Dobbin (310), in a discussion of various defects of tests:

    The wide-range test . . . is the too-short test in disguise. There are only a few of them around. They are promoted as being suitable measures of ability (or achievement) for people of many ages, from third grade through second year of college, for example. Since only a small part of any such test can be material suitable in difficulty for one individual, the effective part of the test may amount to no more than half a dozen questions, making it a very short test, indeed.

These remarks, by the president and one of the project directors of the Educational Testing Service, in a book written expressly to defend educational testing at a time when it is under
attack from many sources, command attention and concern by users of wide-range tests such as the WRAT. The particular implication of the critique is that reliabilities, validities, and score levels must be evaluated at every level covered (or at least at every level at which the test is used) and that broad-band coefficients of reliability and concurrent validity are likely to be misleading.

The problem of selecting a suitable achievement test for the Survey is highly complex. Time restrictions favor short forms and short-cut methods (such as the wide-range approach), provided that they meet reasonable standards of acceptability. However, it is just as true in testing as in all other areas that you cannot get more out than you put in. Compromises with reality in testing often mean less reliable measures and less adequate coverage of appropriate universes of content; sometimes they mean penalties in relation to validity and consequent generalizability of measures.

The application of these points to the WRAT is considered as judicially as possible in this review, and the reality demands are weighed against possible shortcomings of this wide-range test in relation to alternatives available in the situation. A brief review of the 1946 edition and the general conceptualization of the WRAT is followed by a review of the 1963 edition used in Cycle II.

1946 EDITION OF WRAT

The conceptualization and rationale of this test (302) could not help but appeal to clinical psychologists in schools and mental health services. Jastak made an extremely strong case for the clinical use of his test, and it is not surprising that the WRAT has enjoyed considerable popularity in clinical circles despite psychometrician prejudice against wide-range tests.

Jastak's arguments are briefly as follows:

1. A thorough psychological examination should include tests of school fundamentals as well as intelligence tests. Intelligence tests account for only a portion of the variance in school achievement, and failure in school and life adjustment may result from factors other than low intelligence.

2. Reliable (and valid) school tests should be used to assess discrepancies between intellectual capacity and performance in basic school subjects as well as discrepancies in the organization of learning abilities. Wide range discrepancies in school achievement are the rule rather than the exception, and their discovery is important for the understanding of personality and school performance problems and for the institution of proper remedial programs.

3. Clinically recognized discrepancy patterns in children are illustrated by the tendency of neurotic and disorganized children to be more proficient in reading than in arithmetic. In addition, if neurotic tendencies and special reading handicaps occur together the child may function far below the level of his true capacity in all school subjects. Of course, failure in reading and in arithmetic may also reflect unrelated processes.

Jastak's criteria of a satisfactory school achievement test for (individual) clinical use are (a) low cost, (b) individual standardization, (c) ease and economy of administration, (d) suitability of contents, (e) relevance of the functions studied, and (f) comparability of results over the entire range of the skills in question. It is apparent that these criteria do in effect exclude such standard school achievement batteries as the Stanford, Iowa, Cooperative, and other well-known and highly respected batteries that are designed for group administration within a narrow grade range and cover a large universe of content, requiring considerable time to administer and score. These criteria certainly appear to be tailor-made for the Survey (as well as for clinical practice). However, in view of the testing conditions for individually selected members of the national sample, the question is, how well are they implemented in the WRAT?

Jastak's views on test content are of particular interest. The WRAT focuses entirely on three basic school study skills (reading, spelling, and arithmetic) around which most school studies revolve. The range of the subtests for each is indeed wide, from kindergarten to college.

The test content is concerned principally with mastery of the mechanics of the subject
rather than with comprehension. Thus the reading test is in effect a test of reading as a motor skill; the spelling test focuses on words without sentence contexts; and the arithmetic test involves number facility with minimal dependence on reading.

This emphasis is a reflection of the authors' conception of the WRAT as an adjunct to tests of intelligence and behavior adjustment. Information concerning the subject's ability to comprehend can be obtained from intelligence tests, but accurate measurement of mechanics in the basic tools chosen is essential because of the dependence of most other studies on them. Further, it is argued that correct answers can often be given in conventional reading, arithmetic, and other subject-matter achievement tests on the basis of general knowledge and intellectual ability, even when mastery of mechanics is poor; thus, important diagnostic cues are overlooked.

Although the WRAT Reading and Arithmetic tests were reported to correlate satisfactorily with other achievement tests, their limitations of content and intended use were clearly outlined in the manual.

As stated above, the 1946 edition of the WRAT was standardized by anchoring the WRAT norms to those of corresponding subtests of the New Stanford Achievement Test. The standardization sample consisted of the scores of 4,052 students for Spelling and Arithmetic (about 1,500 were individually tested; the remainder were tested in groups) and 1,429 students, individually tested, for Reading. Reliability coefficients (retest) were reported as 0.95 for Reading (N=110) and 0.90 for Arithmetic (N=120). The Reading section of the New Stanford Achievement Test was reported to have correlated 0.81 with Paragraph and Word Reading; the Arithmetic section of the Stanford test correlated 0.91 with Arithmetic Computation.

The detailed composition of the various samples was not reported in the 1946 manual, and the validation data were not specified by age level as would be required to conform with the evaluative criteria discussed above. This was not exceptional in 1946, however, when the professional demands for rigorous reporting of critical information by test publishers were less stringent than they are today.

Nevertheless, despite the absence of comprehensive statistical information, the WRAT became a favorite of a large number of clinicians, and its use was extensive in the United States and abroad within a short time of its publication. It may appear surprising that so popular a test generated so little research. However, it appears that the principal use of the test was by clinicians whose attitudes toward tests are usually validated more by clinical experience than by statistics and whose opportunities and motivations to conduct and publish research are generally limited.

RESEARCH ON THE 1946 WRAT

It is noteworthy that only seven research reports have been found dealing with the 1946 edition and that of these seven, two were unpublished mimeographed papers (303 and 306) furnished by Dr. Jastak. Reliability coefficients and correlations of the WRAT with other tests, abstracted from these reports and the two test manuals (301 and 302), are reported in tables 4 and 5.

Reading

Hopkins, Dobson, and Oldridge (304) quoted Sundberg (312), in a 1961 paper, to the effect that although the WRAT was the second most popular achievement test in clinics, Sundberg could not find a single empirical study of it. They administered the Reading subtest to 502 children in grades 1 to 5 and correlated the scores with teacher ratings and scores on the California Reading Test (CRT). The correlations with teacher ratings were high for grades 1 to 5: 0.79, 0.74, 0.86, 0.86, and 0.85, respectively (table 5). The correlations with the total score of the California Reading Test were 0.86 for grade 3 and 0.71 for grade 5. The mean grade placements on the WRAT, for the five grades in order, were 1.4, 2.4, 3.5, 4.1, and 4.7.

Wagner and McCoy (303) reported correlations of the WRAT Reading subtest with the Sangren-Woody Silent Reading Test (grade level) for two samples, one of 29 fifth graders and the other of 57 primary school juvenile offenders. The correlations were 0.78 and 0.74. In the first sample, the WRAT Reading correlated 0.78 both with teacher ratings and with rank order of midterm grades. The correlation with the Stanford Reading Test, in the second sample, was 0.80.
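The earlier caution that broad-band coefficients are likely to be misleading can be illustrated with a small sketch. The scores below are invented for illustration only (they are not WRAT data), and the helper name `pearson_r` is an assumption of this example. Pooling several grades stretches the score range, so the pooled correlation can far exceed the correlation found within any single grade; this is why per-grade coefficients, such as those reported by Hopkins, Dobson, and Oldridge, are the more informative figures.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical (test score, criterion score) pairs for three grades.
# Within each grade the relationship is only moderate.
grades = {
    1: [(10, 12), (11, 10), (12, 13), (13, 11), (14, 14)],
    3: [(30, 32), (31, 30), (32, 33), (33, 31), (34, 34)],
    5: [(50, 52), (51, 50), (52, 53), (53, 51), (54, 54)],
}

# Correlation computed separately within each grade.
within = {g: pearson_r(*zip(*pairs)) for g, pairs in grades.items()}

# Correlation computed once over all grades pooled together.
pooled_pairs = [p for pairs in grades.values() for p in pairs]
pooled = pearson_r(*zip(*pooled_pairs))

for g, r in within.items():
    print(f"grade {g}: r = {r:.2f}")
print(f"pooled:  r = {pooled:.2f}")
```

In this contrived case every within-grade coefficient is moderate while the pooled coefficient is close to 1, purely because the grades differ in mean level.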
Table 4. Studies reporting reliability coefficients of the WRAT

                                                                                                      Reading              Arithmetic
Investigator              Year  Subjects                    Age range         Type of coefficient     Number  Coefficient  Number  Coefficient

Jastak and Bijou (302)    1946  Normals (a)                 N.R.              Test-retest             110     0.95         120     0.90

Jastak (301)              1963  N.R.                                          Split-half
                                                            Level II:
                                                              20+ years                               200     0.99         200     0.97
                                                              18-19 years                             200     0.98         200     0.97
                                                              16-17 years                             200     0.99         200     0.95
                                                              15 years                                200     0.99         200     0.97
                                                              14 years                                200     0.99         200     0.96
                                                              13 years                                200     0.99         200     0.96
                                                              12 years                                200     0.99         200     0.94
                                                            Level I:
                                                              11 years                                200     0.99         200     0.95
                                                              10 years                                200     0.99         200     0.95
                                                              9 years                                 200     0.99         200     0.94
                                                              8 years                                 200     0.99         200     0.95
                                                              7 years                                 200     0.99         200     0.96
                                                              6 years                                 200     0.99         200     0.96
                                                              5 years                                 200     0.98         200     0.97

                                Standardization population                    Form I with Form II
                                                              14-0 to 14-11                           89      0.88         87      0.86
                                                              13-0 to 13-11                           224     0.90         194     0.87
                                                              12-6 to 12-11                           180     0.94         165     0.85
                                                              12-0 to 12-5                            179     0.92         164     0.86
                                                              11-6 to 11-11                           252     0.91         225     0.85
                                                              11-0 to 11-5                            197     0.91         191     0.82
                                                              10-6 to 10-11                           214     0.93         195     0.89
                                                              10-0 to 10-5                            207     0.90         190     0.84
                                                              9-6 to 9-11                             165     0.91         160     0.79
                                                              9-0 to 9-5                              81      0.90         78      0.88

(a) Level of subjects and time interval between tests not reported.

NOTES: All correlation coefficients are Pearson Product-Moment unless otherwise specified.
N.R. = not reported.
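Table 4 lists split-half coefficients, and the criticism of wide-range tests turns on how reliability depends on test length. A minimal sketch of the standard Spearman-Brown formula follows. The function name, the 60-item test, and the coefficients used are illustrative assumptions, not values taken from the WRAT manuals, and the formula itself assumes that added or removed items are parallel to the rest.

```python
def spearman_brown(r, k):
    """Predicted reliability when a test of reliability r is changed
    in length by a factor k (k = 2 doubles it, k = 0.5 halves it)."""
    return k * r / (1 + (k - 1) * r)

# Stepping a half-test correlation up to full-test reliability (k = 2).
half_test_r = 0.90                       # hypothetical half-test correlation
full_test_r = spearman_brown(half_test_r, 2)

# Shrinking a test: suppose a hypothetical 60-item test has reliability
# 0.95, but only 6 items are of suitable difficulty for a given child.
effective_r = spearman_brown(0.95, 6 / 60)

print(f"full-test reliability from halves: {full_test_r:.3f}")
print(f"reliability of the 6 effective items: {effective_r:.3f}")
```

Under these assumptions, a test that looks highly reliable over its whole range is considerably less so over the handful of items that actually discriminate for one individual, which is the substance of the "too-short test in disguise" objection.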
Table 5. Studies reporting correlation between the WRAT and other measures

WRAT Reading Test

Investigator                          Year  Test or criterion variable                      Subjects (a)             Age range       Total  M    F    Correlation

Smith (126)                           1961  Full Range Picture Vocabulary Test              Normals, Grade 2         6-11 to 8-10    100    51   49   0.42

Hopkins, Dobson, and Oldridge (304)   1962  California Achievement Test:                    Normals                  N.R.            257    ---  ---
                                              Reading Vocabulary                              Grade 3                N.R.            171    ---  ---  0.83
                                                                                              Grade 5                N.R.            86     ---  ---  0.67
                                              Reading Comprehension                           Grade 3                N.R.            171    ---  ---  0.84
                                                                                              Grade 5                N.R.            86     ---  ---  0.67
                                              Total Reading                                   Grade 3                N.R.            171    ---  ---  0.86
                                                                                              Grade 5                N.R.            86     ---  ---  0.71

Smith (126)                           1961  California Test of Mental Maturity              Normals, Grade 2         N.R.            100    51   49   0.47

Lawson and Avila (305)                1962  Gray Standardized Oral Reading Paragraphs Test  Mental defective         16-45 years     30     19   11   0.94 (b)

Reger (307)                           1962  Metropolitan Achievement Tests, Reading         Retarded boys            9-9 to 14-6     25     ---  ---  0.76

Wagner and McCoy (303)                N.R.  Midterm grades                                  Normals, Grade 5         N.R.            29     ---  ---  0.78 (rank order)

Jastak and Bijou (302)                1946  Stanford Achievement Test, Reading:             Normals, Grades 7 and 8  N.R.            189    ---  ---
                                              Word Meaning                                                           N.R.            189    ---  ---  0.84
                                              Paragraph Meaning                                                      N.R.            189    ---  ---  0.81

Wagner and McCoy (303)                N.R.  Sangren-Woody Reading                                                    N.R.            86     ---  ---
                                                                                            Normals, Grade 5         N.R.            29     ---  ---  0.78
                                                                                            Juvenile offenders       N.R.            57     ---  ---  0.74
                                            Stanford Reading Tests                          Juvenile offenders       N.R.            47     ---  ---  0.80
                                            Teacher rating of reading ability               Normals, Grade 5         N.R.            29     ---  ---  0.78

Hopkins, Dobson, and Oldridge (304)   1962  Teacher rating of reading ability               Normals                                  502    ---  ---
                                                                                              Grade 1                N.R.            90     ---  ---  0.79
                                                                                              Grade 2                N.R.            106    ---  ---  0.74
                                                                                              Grade 3                N.R.            171    ---  ---  0.86
                                                                                              Grade 4                N.R.            49     ---  ---  0.86
                                                                                              Grade 5                N.R.            86     ---  ---  0.85

Smith (126)                           1961  Wechsler Intelligence Scale for Children:       Normals, Grade 2         N.R.            100    51   49
                                              Verbal Score                                                                                  ---  ---  0.55
                                              Performance Score                                                                             ---  ---  0.47
                                              Full Score                                                                                    ---  ---  0.61

See footnotes at end of table.


WRAT Arithmetic Test

Investigator                          Year  Test or criterion variable                      Subjects (a)             Age range       Total  M    F    Correlation

Holowinsky (309)                      1961  California Reading Test                         Normals and retarded     12-17 years     600    ---  ---  0.61

Murphy (306)                          N.R.  First-quarter grades                            Normals                                  241    ---  ---
                                                                                              Grade 5                N.R.            135    ---  ---  0.64
                                                                                              Grade 6                N.R.            106    ---  ---  0.56

Holowinsky (309)                      1961  Grade placement                                 Normals and retarded     12-17 years     600    ---  ---  0.31

Reger (307)                           1962  Metropolitan Achievement Tests, Arithmetic      Retarded boys            9-9 to 14-6     25     ---  ---  0.87 (b)

Jastak and Bijou (302)                1946  Stanford Achievement Tests, Arithmetic          Normals, Grades 7 and 8  N.R.            140    ---  ---  0.91
                                              Computation

Holowinsky (309)                      1961  Otis Quick Scoring Mental Ability Tests         Normals, retarded        12-17 years     600    ---  ---  0.30
                                                                                                                     12-13 years     N.R.   ---  ---  0.59
                                                                                                                     13-14 years     N.R.   ---  ---  0.39
                                                                                                                     14-15 years     N.R.   ---  ---  0.54
                                                                                                                     15-16 years     N.R.   ---  ---  0.02
                                                                                                                     16-17 years     N.R.   ---  ---  0.09

Murphy (306)                          N.R.  Stanford Achievement Tests, Arithmetic,         Normals                                  241    ---  ---
                                              and school grades
                                                                                              Grade 5                N.R.            135    ---  ---  0.59
                                                                                              Grade 6                N.R.            106    ---  ---  0.35

                                            Stanford Achievement Tests, Arithmetic,         Normals                                  241    ---  ---
                                              and school grades
                                                                                              Grade 5                N.R.            135    ---  ---  0.75 (multiple r)
                                                                                              Grade 6                N.R.            106    ---  ---  0.70 (multiple r)

(a) Subjects are white Americans unless otherwise specified.
(b) Spurious correlation with age for small N.

NOTES: All correlation coefficients are Pearson Product-Moment unless otherwise specified.
Total = total population; M = male; F = female; N.R. = not reported; r = correlation.
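Several coefficients in table 5 rest on very small samples. As a rough guide to how much weight such figures can bear, the sketch below computes an approximate 95 percent confidence interval using the Fisher z transformation. The function name `fisher_ci` is an assumption of this example, and the method presumes roughly bivariate-normal scores from a random sample, which the studies reviewed may not satisfy.

```python
from math import atanh, tanh, sqrt

def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a Pearson r based on n pairs,
    using the Fisher z transformation (requires n > 3).
    z_crit = 1.96 gives a nominal 95 percent interval."""
    z = atanh(r)                 # transform r to the z scale
    se = 1 / sqrt(n - 3)         # standard error on the z scale
    return tanh(z - z_crit * se), tanh(z + z_crit * se)

# Coefficients and sample sizes as listed in table 5.
small = fisher_ci(0.78, 29)      # Wagner and McCoy, grade 5 sample
large = fisher_ci(0.61, 600)     # Holowinsky, total sample

print(f"r = 0.78, N = 29:  {small[0]:.2f} to {small[1]:.2f}")
print(f"r = 0.61, N = 600: {large[0]:.2f} to {large[1]:.2f}")
```

The interval for the sample of 29 is several times wider than that for the sample of 600, which is one reason the small-sample coefficients can be read only as tentative support.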

The report by Lawson and Avila (305) of a correlation of 0.94 between the WRAT Reading subtest and the Gray Oral Reading Test, administered to a sample of retarded adults ranging widely in age and IQ, is probably inflated because of the nature of the sample. Similarly, Reger's (307) sample of 25 emotionally disturbed, retarded boys (age range 9-9 to 14-6) is also quite a diverse population. Reger reported a correlation of 0.76 between the WRAT Reading subtest and the Metropolitan Achievement Test.

Holowinsky (309) had an apparently well-designed sample of 600, including 75 children at each age from 12 to 16 years. Each group was divided into three categories on the basis of IQ scores. The categories were as follows: 80-89 IQ, 90-99 IQ, and 100-109 IQ. For the total sample of 600 children, the California Reading Test correlated 0.61 with the WRAT Arithmetic subtest. Students of lower intellectual ability tended to show better achievement in arithmetic than in reading. For the total sample of 600 children the WRAT had a correlation of 0.31 with grade placement.

These limited results tend to support the claims for the WRAT with regard to concurrent validity both with other reading tests and with grade placement. The evidence is far from sufficient to permit definitive evaluation, and the lack of information on many points is obvious. However, no contrary evidence was found, and as far as these papers are concerned, the report for the WRAT Reading subtest is favorable.

Arithmetic

The most adequate independent study of the WRAT Arithmetic subtest is that of Murphy (306), who tested 135 fifth and sixth graders (with average IQ of 114) with the WRAT and the Stanford Achievement Test (SAT). The correlation of the two tests was 0.59 for grade 5 and 0.35 for grade 6. The correlations between Arithmetic grades and the WRAT were 0.64 for grade 5 and 0.56 for grade 6. Correlations between the SAT and Arithmetic grades were 0.68 for grade 5 and 0.59 for grade 6. In Reger's sample, noted above (307), the WRAT Arithmetic test had a correlation of 0.87 with the Metropolitan Achievement Test. Holowinsky's study mentions a correlation of 0.59 between the IQ scores of 12-year-olds and the WRAT Arithmetic subtest, as compared with 0.71 for the Reading subtest.

These results are less satisfactory than those for Reading in the respect that the correlations reported compare less favorably with those mentioned in the manual. This type of cross-validation is imperative and demonstrates the importance of independent reports to supplement the data provided in a test manual. To Dr. Jastak's credit, however, it should be noted that the Murphy report, in which the lower correlations appear, is an unpublished paper which he, Dr. Jastak, furnished unsolicited for this review. These studies are insufficient for an evaluation of the WRAT Arithmetic subtest, to be sure. As the only information available, they leave the case for the Arithmetic test without strong independent support.

1963 EDITION OF WRAT

Two major changes appear in the 1963 edition. One is the division of the test into two levels. Level I covers the age range of 5 to 12 years; Level II covers the age range 12 years through adulthood. It is pointed out in the mimeographed manual for this edition that this change not only has reduced the time of test administration, but also has increased the number of items at each level, thereby increasing the already high reliability of the test. Indeed, the test has been lengthened, and the reliabilities have been listed for samples of 200 each for ages 5 through 11 years (Level I). For Reading, all, with the exception of 5 years of age, correlate 0.99. (Age 5 correlates 0.98.) Similarly computed reliabilities for Arithmetic are listed at or above 0.94, with the highest correlation, 0.97, occurring at 5 years of age. Since these coefficients are based on correlations between two forms of the test, they are considered by the authors to be inflated. The text of the reliability section of the manual (301, p. 47) states that the reliability coefficients are more likely within the range 0.90 to 0.95 with a mean of 0.92. At this level, they do not seem perceptibly higher than the reliabilities reported in the 1946 manual.

The second major change is in method of standardization. The 1963 manual (301) describes
the development of norms and the normative popu- disclosed that specific methods to identify, in
lation sample as follows: individual cases, the size of the independent and
The revised WRAT was administered to separate variances will have to be developed.
school children and adults in a number of Since this is somewhat of a novel and pioneering
states: Delaware, Pennsylvania, New Jersey, Maryland, Florida, Washington,
and California. No attempt was made to obtain a representative national
sampling. Nor is such a sampling considered essential for proper
standardization. (italics added)

The groups of children were selected from schools of known socioeconomic
levels. The IQ's of the children were also known from group tests such as
the Lorge-Thorndike, the Kuhlmann-Anderson, and the California Mental
Maturity Test, administered at the schools. Many of the cases (over 1,000)
in the standardization group had been given individual tests such as the
Stanford-Binet, Wechsler Intelligence Scale for Children, and others. In
each age bracket, probability samplings based on IQ's were studied to
develop WRAT norms that would correspond to the achievement of mentally
average groups with representative dispersions of scores above and below
the mean. (italics added)

From the standpoint of the Health Examination Survey, with particular
reference to Cycle II (children aged 6-11 years), the first of the two
mentioned changes is an advantage. The age range of Level I fits the age
range of Cycle II perfectly, and the increased length of the test and more
extensive reliability studies reported support the claim of excellent
reliability. The second change, in standardization and norm development,
does, however, present a potential problem which is accentuated by the
absence of validity data. This is discussed below.

Validity and Norms

Although published in 1963, the validity section of the revised WRAT was
not available for review until late in June 1964. The delay was explained
by the author of the test as occasioned by comparison of the WRAT with a
number of other tests in order to determine the meaning and diagnostic
value of the three subtests in relation to other abilities. In addition,
his letter

venture, it takes more time than routine manual preparation. The latter
quotation is discussed separately below.

The basis for the present evaluation is, then, a comparison of the content
and structure of the 1946 and 1963 editions of the WRAT, supplemented by
the limited independent literature on the 1946 edition, reviewed above,
and the limited data on the 1963 edition provided in the manual furnished
by the author. No independent studies of the 1963 edition were available.

Comparison of the Two Editions

Examination of the two booklets indicates close similarity in item
content, format, administration, and scoring. The Reading test for Level
I, in the revised edition, contains 55 words that were in the 1946
edition, and their rank order of sequential position in the two editions
is about 0.99. It is presumed that the 20 new words were empirically
calibrated to fit into the previously established word order. The
arithmetic items of the new test are of the same general type as in the
earlier test, although the format is slightly different and the number of
items is increased.

In view of this similarity, it appears reasonable to expect that the
network of correlations of the revised test with other measures would be
approximately the same as that reported for the 1946 edition. In fact, the
correlations might even be slightly higher as a result of the greater
length of the revision. To the extent that concurrent validity could be
accepted for the 1946 edition, therefore, there is no reason to doubt that
it will be upheld with the 1963 edition. Although the data are quite
inadequate, tentative acceptance on this point appears warranted, based on
the authors' reputations and the statements in the manual. However, this
is only part of the problem.

Validation of 1963 Edition

It is equally important to be able to meaningfully interpret the grade
ratings, standard scores, and percentiles in relation to individual age and
grade placement and in relation to population parameters. In the absence
of empirical information on this issue, nothing definite can be concluded.
It is appropriate to raise some questions which have been generated by
statements made in the 1963 manual.

In the first place, the reviewer would take issue with the test author's
statement that a representative national sampling is not essential for
proper standardization. A national sample is certainly necessary if
national norms are to be promulgated. Although the 1946 edition was
developed on a restricted (as opposed to national) sample, its norms were
presumably keyed to the grade norms of the New Stanford Achievement Test,
for which a more extensive base existed. Even though regional, ethnic, and
other perturbing effects were not known, it was at least possible to
invoke the Stanford norms in interpreting grade levels. With the 1963
edition, however, no such anchoring process was followed. The only
indications concerning age-grade levels are, in fact, disquieting.

The manual goes on to say that intelligence quotients of a number of group
and individual tests (which are generally known to vary in level among
themselves) were used to select samples in each age bracket "that would
correspond to the achievement of mentally average groups with
representative dispersions of scores above and below the mean." (italics
added) It would indeed be remarkable if such a procedure could produce a
standard reference sample of known characteristics for normative purposes.
Therefore it is doubtful that the resulting norms could have dependable
accuracy for individual assessment or for analysis of groups in the manner
required for the national sample of the Health Examination Survey. Perhaps
the test author's current concern with comparisons with other tests,
referred to above, reflects realization of this problem.

Furthermore, in view of the professed clinical purposes of the WRAT, it is
surprising that the standardization research is confined to mentally
average groups, and that no studies were undertaken of such groups as
gifted pupils, students retarded in reading, arithmetic, and other school
subjects, disturbed children, and subnormal children.

For the purposes of a national survey, problems of ethnic and regional
variations in test performance are important, as are other sources of
perturbation attributable to deviations of ability, personality, and
physical and social factors. The absence of such data for the 1963 WRAT is
certainly not the sole responsibility of the author-publisher; ordinarily
test producers do not assume responsibility for all possible research of
interest to all possible users. If a test attracts interest, information
about it in various situations gradually accumulates in the literature.
However, in the present case it appears fair to say that the author's
confidence in his test led him to publish the revision before he had
completed his own research and before research on it by any users could be
reported. The test was issued without a formal designation of the norms as
tentative and without any qualifications.

Validity variances

Instead, the 1963 manual (301, p. 2) concludes its introductory section
with the following paragraph:

In addition to the three operational aspects (of mechanics and
comprehension in relation to each skill test) the basic skills have
several unique validities which will be explained later by reference to
appropriate research. The validity variances will not only support the
empirical distinctness of mechanics and comprehension, but will provide
the degrees to which each is important in learning to read, spell and
figure and the impact the relationship between them has on the total
learning process.

The burden of proof is on the author. The development of such an analytic
scheme for interpretation of test scores is indeed both novel and
ambitious and deserves all the time required to complete it. It seems
regrettable, however, that the test was released before critical users
could evaluate not only these devices, but even the grade ratings,
percentiles, and standard scores included in the manual.

Validity Data in 1963 Manual

The section of the manual entitled Validity of the WRAT (301, p. 51),
contains a table of means and standard deviations of raw scores for
the Reading, Spelling, and Arithmetic subtests, which indicates
considerable need for refinement of the tests in order to produce an even
progression of scores from grade to grade. The difficulties are
considerable at some levels (8.0 to 8.5, 9.5 to 10.0, and 10.5 to 11.0, on
the Reading test, for example), to say nothing of the fact that the basic
difficulties reported about the standardization sample are not only not
clarified, but are not even referred to in this section of the manual.

Two paragraphs on the validity of the Reading test (301, p. 50) refer only
to the studies cited above, which involve the 1946 edition of the WRAT. No
validity data on the 1963 edition are presented. Similarly, data are
presented (301, p. 52) on correlations of the WRAT with achievement tests
and on the validity of the Arithmetic subtest, but these are also
identified as relating to the 1946 edition.

Internal consistency data cited by the author (301, p. 53) involve
intercorrelations among the three WRAT subtests and not validity, despite
the author's assertion that criteria of internal consistency, if properly
interpreted, are usually more valid than are external criteria of
comparison. These data are also presented as one method of
cross-validation.

Correlations of the Wide Range Achievement Test with the California Test
of Mental Maturity are given (301, p. 54) for a sample of 74 children
spanning the age range of 5 to 15 years. They range from 0.74 to 0.84 and
may be spuriously high in view of the heterogeneity of the sample.
Similarly structured comparisons with the WISC for 300 boys (aged 5 to 15
years) and 244 girls (aged 5 to 15 years) are reported which indicate
correlations as follows:

Sex and test           Reading    Arithmetic

Boys
  Vocabulary(1)         0.65        0.56
  Block Design          0.41        0.41
Girls
  Vocabulary(1)         0.56        0.56
  Block Design          0.39        0.50

(1) Based on Jastak's short-form revision (311).

In view of the composition of the sample, these are surprisingly low.

The manual also reports (301, p. 55) correlations of WISC Verbal Scale,
Performance Scale, and Full Scale with the WRAT (1963), with samples
covering narrower age ranges of 5 to 7 years and 8 through 11 years. The
results here are the most impressive concurrent validity data in the
manual, although they indicate correlations in the 0.6 to 0.7 range with
intelligence rather than achievement criteria, for which they are
intended.

As stated several times earlier, the accuracy of score levels in the WRAT
norms is regarded as a more pressing problem for empirical demonstration
than the concurrent validity (covariation with related measures) of the
test. On this point the validity section of the manual is silent.

Grade Equivalents

The 1963 manual (301, p. 22) states that grade norms were derived from the
actual mean grade levels of the children in each grade group. Despite
variations in school grade-placement practices over time, grade rating is
characterized as rather stable. The manual further asserts striking
comparability of grade ratings of the old and the new WRAT's through
nearly all educational levels except the upper ranges. Grade ratings below
14 years of age are said to be less arbitrary than grade ratings over 14
years of age. The grade scores are intended to be comparable to mental
ages.

Standard Scores

The WRAT standard scores can be converted from raw scores by age group in
a table provided in the manual. The standard score has a mean of 100 and a
standard deviation of 15 and is intended to be equivalent to an IQ from
the WAIS, WISC, Stanford-Binet (Form L-M) or any of the major intelligence
scales. Although these scales are not comparable themselves (as developed
in some detail in section I of this report), the manual states that the
results from the WRAT test can thus be directly compared with the major
individual intelligence scales.

The standard score is asserted to be the most precise and most meaningful
score. It is the only score that is comparable between subtests and that
provides for uniform differences between scores.

Percentiles

Percentiles are included because of their present popularity and
convenience, but the manual appropriately downgrades them and discourages
their use.

SUMMARY AND CONCLUSIONS

The foregoing review of the WRAT is necessarily incomplete because of lack
of adequate information on which to base a technical evaluation. The test
is well conceptualized and has much face validity, but standardization
information on the 1946 edition was inadequate, and on the 1963 edition it
is thus far insufficient.

Published research on the 1946 WRAT has been extremely limited and fails
to answer most of the questions left unanswered by the author's manual.
Moreover, analysis of the available information on the 1963 edition raises
doubts about normative score levels.

The selection of the WRAT over other available school achievement tests
may be defended on the grounds of administrative expediency and
suitability of the material for the purposes of the Survey, in spite of
the fact that inadequate data exist to support the author's claims of
validity. It is possible that such data may be produced, and every effort
should be made to obtain them. However, unless these results are
convincing (and reason to doubt that they will be has been expressed), it
is recommended that serious consideration be given to carrying out a
complete restandardization of the Reading and Arithmetic subtests on the
entire national sample. Unless this is done, projections of estimates to
the population may be seriously in error.
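
The dependence of standard scores on the reference sample can be
illustrated with a brief computational sketch. It is not taken from the
WRAT manual: it assumes a simple linear (z-score) conversion to a scale
with a mean of 100 and a standard deviation of 15, whereas the manual's
conversion table may have been constructed differently. The function name
and both sets of raw scores are hypothetical and serve only to show that
one raw score can receive different standard scores under a restricted
norm group and under a more widely dispersed one.

```python
from statistics import mean, pstdev


def standard_score(raw, reference_raw_scores, target_mean=100.0, target_sd=15.0):
    """Linear standard score of `raw` relative to a reference (norm) sample.

    Assumption: a plain z-score rescaling; this is an illustration, not the
    procedure documented for any published test.
    """
    ref_mean = mean(reference_raw_scores)
    ref_sd = pstdev(reference_raw_scores)  # population SD of the norm group
    if ref_sd == 0:
        raise ValueError("reference sample has no spread")
    z = (raw - ref_mean) / ref_sd
    return target_mean + target_sd * z


# Hypothetical raw scores for one age group (invented for illustration).
restricted_norm_group = [38, 40, 42, 44, 46]  # narrow dispersion
national_norm_group = [30, 36, 42, 48, 54]    # same mean, wider dispersion

same_raw = 48
print(round(standard_score(same_raw, restricted_norm_group), 1))
print(round(standard_score(same_raw, national_norm_group), 1))
```

In this illustration the identical raw score is rated considerably higher
against the narrow norm group than against the wider one, which is the
kind of discrepancy a restandardization on the national sample would be
intended to remove.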

BIBLIOGRAPHY

Research References and Manuals

301. Jastak, J. F.: Wide Range Achievement Test, rev. ed. Wilmington, Del.
     Guidance Associates, 1963.
302. Jastak, J. F., and Bijou, S. W.: The Wide Range Achievement Test.
     Wilmington, Del. C. L. Story Co., 1946.
303. Wagner, R. F., and McCoy, F.: Two validity studies of the Wide Range
     Achievement Reading Test. Personal communication.
304. Hopkins, K. D., Dobson, J. C., and Oldridge, O. A.: The concurrent and
     congruent validities of the Wide Range Achievement Test.
     Educ.Psychol.Measur. 22:791-793, 1962.
305. Lawson, J. R., and Avila, D.: Comparison of Wide Range Achievement
     Test and Gray Oral Reading paragraphs reading scores of mentally
     retarded adults. Percept.Mot.Skills 14:474, 1962.
306. Murphy, G. M.: An investigation of the utility of mathematics sub-test
     from the Wide Range Achievement Test, as applied to intermediate
     level groups. Personal communication.
307. Reger, R.: Brief tests of intelligence and academic achievement.
     Psychol.Rep. 11:82, 1962.
308. Warren, S. A.: Academic achievement of trainable pupils with five or
     more years of schooling. Train.Sch.Bull. 60:75-88, 1963.
309. Holowinsky, I.: The relationship between intelligence (80-110 I.Q.)
     and achievement in basic educational skills. Train.Sch.Bull.
     58:14-22, 1961.

Other References

310. Chauncey, H., and Dobbin, J. E.: Testing, Its Place in Education
     Today. New York. Harper and Row, 1963.
311. Jastak, J. F., and Jastak, S. R.: Short forms of the WAIS and WISC
     vocabulary subtests. J.Clin.Psychol. 20:167-199, 1964.
312. Sundberg, N. D.: The practice of psychological testing in clinical
     services in the United States. Am.Psychologist 16:79-83, 1961.

III. THE GOODENOUGH DRAW-A-MAN TEST

BACKGROUND AND DEVELOPMENT

A comprehensive historical survey of the study of children's drawings
appeared recently in an important new book by Dale B. Harris (522), a
former colleague of Florence Goodenough and apparent successor to her in
the leadership role in the measurement of children's intelligence by point
scales based on drawings of the human figure. The present review does not
duplicate Harris' scholarly survey, but focuses more specifically on the
problems of the Goodenough Test as used in the Health Examination Survey.

The first formal intelligence test based on the analysis of children's
drawings was published by Florence Goodenough (595) in 1926, but the
literature on this subject goes back at least to 1885 (595, ch. I). Some
of the early papers are summarized in this study, but the major emphasis
has been placed on recent critical research on the Draw-A-Man Test and its
variants. Nevertheless, it is of interest that in 1893 Herrick (501)
demonstrated the developmental significance of profile drawings and that
in the same year Barnes (502) recognized that drawings are used by young
children as a means of expressing their ideas. Meanwhile, Lukens (503), in
1896, outlined many details of human figure drawings which were later
incorporated in the point-scoring systems of Goodenough (595) and of
Harris (522).

The Goodenough Test is referred to in this discussion as the Draw-A-Man
Test although the specific instructions in Cycle II of the Survey are to
"make a picture of a person." However, the instructions go on to state
that when a bust picture has been drawn intentionally, the child is given
another sheet of paper with the instruction "Now make a picture of a whole
person." Only one picture is used.

Rationale

In this procedure emphasis is placed on the representation of details in
the drawing to measure conceptual maturity. Drawing technique is
minimized, and distortions potentially usable as cues for personality
evaluation are not scored. Recent drawing tests focused on personality
study have used two or more drawings. For example, Machover (596)
instructs the subject to draw a person and then to draw a person of the
sex opposite to the one previously drawn, while Buck (594) uses drawings
of a house, a tree, and a person. In general, the cues and signs
interpreted in personality study of drawings are different from those
employed for the measurement of intelligence.

Point-Scoring System

The point system developed by Goodenough (595) for drawings which can be
recognized as attempts to represent the human figure, no matter how crude,
involves the presence or absence of 51 detailed points, which are listed
as follows:

1-4a   Head, legs, arms, trunk present
4b     Length of trunk greater than breadth
4c     Shoulders definitely indicated
5a     Attachment of arms and legs
5b     Legs attached to trunk; arms attached to trunk at correct point
6a     Neck present
6b     Outline of neck continuous with that of the head, of trunk, or both
7a-c   Eyes, nose, mouth present
7d     Both nose and mouth shown in two dimensions; two lips shown
7e     Nostrils shown
8a     Hair shown
8b     Hair on more than circumference of head; nontransparent
9a     Clothing present
9b     At least two clothing items nontransparent
9c     Entire drawing free from transparencies of any sort; sleeves and
       trousers shown
9d     At least four clothing items definitely indicated
9e     Costume complete without incongruities
10a    Fingers present
10b    Correct number of fingers shown
10c    Detail of fingers correct
10d    Opposition of thumb shown
10e    Hand shown as distinct from fingers or arm
11a    Arm joint shown (elbow, shoulder, or both)
11b    Leg joint shown (knee, hip, or both)
12a-e  Proportion: head, arms, legs, feet, two dimensions
13     Heel shown
14a-f  Motor coordination
         a  Lines reasonably firm and joining usually accurate
         b  Increased firmness of lines and increased accuracy of line
            junctions
         c  Head outline free from unintentional irregularity
         d  Trunk outline free from unintentional irregularity
         e  Arms and legs without irregularities, narrowing at point of
            body junction
         f  Features symmetrical
15a    Ears present
15b    Ears in correct position and proportion
16a-d  Eye detail: brow, lashes, or both shown; pupil shown; proportion;
       glance
17a    Both chin and forehead shown
17b    Projection of chin shown; chin clearly differentiated from lower lip
18a-b  Profile drawings

Standardization

In Goodenough's original research, point scores based on these items were
equated to age norms from which intelligence quotients could be computed
in the same manner as in the Stanford-Binet test. Data on reliability and
validity were reported in the 1926 book (595) and also in a monograph
(504) published the same year. Using a basic standardization sample of
5,627 school children from kindergarten to the sixth grade aged 4 to 12
years, split-half and retest reliabilities were computed. A split-half
reliability of 0.77 (corrected) was found to be constant from 5 to 10
years of age, and a retest reliability coefficient of 0.94 was reported
for 194 first-grade children. Correlations with Stanford-Binet were 0.76
for mental ages and 0.74 for intelligence quotients. The experimental
work, analysis, and reporting which characterized this undertaking would
be regarded as impressive today, and the critical reader of Goodenough's
book can well appreciate Lewis M. Terman's description of it (in the
foreword) as a notable accomplishment.

Perspective

In 1950, a quarter of a century after the publication of her book,
Goodenough collaborated with Dale Harris in a review (510) of the
extensive literature generated by her test. This review was critical of
many studies of graphic expression that lacked quantification, but it
acknowledged the value of drawings used projectively as a source of
diagnostic cues. Goodenough and Harris made special note of some writers'
attempts to interpret discrepancies between the Draw-A-Man Test and the
Stanford-Binet (in which Draw-A-Man IQs are markedly lower) as possible
diagnostic cues of emotional or nervous instability or of brain damage.
They also cautioned about the use of the Draw-A-Man Test in cross-cultural
comparisons, pointing out that the Draw-A-Man is not a culture-free test,
as many users have incorrectly assumed. This point is most dramatically
illustrated by the Near Eastern study of Dennis (555).

In the Fourth Mental Measurement Yearbook, 1953, Stewart (514), while
presenting a very favorable evaluation, suggested that the Goodenough
norms might require revision due to social changes which have occurred
since the original standardization. Such a revision was apparently
justified, and the new Goodenough-Harris Drawing Test (552), published in
1963, fills an important need. This modified procedure consists of three
drawings: a man, a woman, and yourself. Separate point scales are provided
for drawings of men and drawings of women; separate norms are also
provided for drawings made by boys (men) and drawings made by girls
(women).

An empirical study on a sample of 195 drawings taken from the Health
Examination Survey population, in which the Harris scoring and norms were
compared with the original Goodenough scoring and norms, is reported
below. This study supports a recommendation that the Harris revision be
adopted for scoring the Goodenough test in this Survey.

EVALUATION OF INTELLIGENCE BY HUMAN FIGURE DRAWINGS

Effective Range

Barnes' (502) early observation that children draw candidly up to about
14 years of age and then more abstractly is supported by Barnhart (507),
who described three types of drawings: schematic (graphic representation),
predominating in the age range 5 to 9 years; mixed, in the range 8 to 13
years; and visual realistic (abstracted, esthetic, nonspecific as to
factual details), principally in the range 10 to 16 years. This apparently
explains why the point scores cannot be validly extended above 14 years of
age (522).

The increase in point scores with age, up to 14 years of age, apparently
reflects mental maturity and not chronological age. This was noted by
Smith (506) and by McElwee (524), who reported a correlation of 0.72
between the Draw-A-Man and the Stanford-Binet mental ages for a sample of
45 subnormal 14-year-old children. Israelite (562) found a correlation of
0.71 between the Draw-A-Man and the Stanford-Binet for 256 mental
defectives. Others have also successfully tested mentally defective adults
with the Draw-A-Man Test.

Relation to Artistic Ability

An area of special interest in the interpretation of children's drawings
has been the relation between drawing maturity, as reflected in point
score, and artistic ability. Goodenough acknowledged that drawings could
be influenced by special coaching (as can most human responses) but held
that ordinary art instruction in school has little effect on the
Draw-A-Man score. She reported a correlation of 0.44 between the
Draw-A-Man and teacher ratings of drawing ability (504).

Perturbing Factors

Intelligence scores based on drawings are relatively independent of
artistic ability. However, there is evidence that both internal factors,
such as health, emotions, and attitudes, and external environmental
factors affect the drawing content. In the present review, studies have
been found which demonstrate the influence on drawings of factors such as
height and weight (543), sex and body image (512, 537-539, and 541),
physical handicaps (571 and 572), mental age (521), affective states
experienced and experimentally induced (529, 530, and 532),
institutionalization (540), teacher attitude (533), sociometric popularity
(534), social acceptance (531), and social class (536).

Although size of drawings appears to increase with mental age over the
effective range of the Draw-A-Man, size standards have not been
incorporated in any of the published point scores. In general, the factors
examined in the studies cited in the preceding paragraph may be viewed as
minor perturbing influences within a homogeneous cultural framework.
Variability among drawings attributable to perturbing factors of the types
enumerated within the social boundaries of the American culture appears to
have significance for the study of personality and social behavior, but it
does not appear to influence measures of intelligence derived from
children's drawings in the age range 5 to 12 years.

Culture

The factors which influence children's drawings of the human figure most
are those that reflect the effects of a culture's customs and values,
since these determine the way in which children are exposed to different
representations of the human figure in dress, art, photographs, religious
practices, and sex roles and attitudes. Hunkin (554) found the Goodenough
norms inapplicable to Bantu school children, and Dennis (555) attributed
the steady decline in mean Draw-A-Man IQ from 5 to 10 years of age (among
Egyptian and Lebanese children in the Near East) to the Arab culture,
which restricts access to representations of the human figure. Studies of
the Draw-A-Man with children of various American Indian tribes on
reservations (558-560) have produced varying results which may perhaps be
understood only in the context of their respective culture patterns.

On the other hand, Anastasi and DeJesus (556) found sex differences in
agreement with
Harris, discussed below, but found no ethnic differences in a comparison
of Draw-A-Man scores of 50 Puerto Rican children of low socioeconomic
class in New York City with those of Negro and white children of similar
status which were reported by other investigators. Similarly, Levinson
(243) found that the Draw-A-Man, as well as WISC Block Design, is
culturally fair for native-born Jewish bilingual children in New York
City.

The importance of taking into account cultural variations when dealing
with a heterogeneous population such as that sampled by the Health
Examination Survey is illustrated by the following quotations from Harris
(522, pp. 131 and 132). These quotations have been excerpted to illustrate
how the customary dress of Eskimo children affects point scores on
drawings of the human figure.

Eskimo children are less likely to depict the neck, the ears, and to
correctly place the ears. These facts seem to reflect the greater
prevalence of parkas in the Eskimo group's drawings and [this] is thus an
artifact of the drawing situation. Due to the voluminous parka garments,
elbow joints, knee joints and modeling of the hips are less likely [to be]
shown, resulting in greater stiffness of figures portrayed.

Since the Eskimo boot does not have a heel, Eskimo children are less
likely to indicate heels in their drawings. [Several instances], however,
show that when the garb is appropriate, the heel is shown. The children do
have the concept of heels; their drawings are quite appropriate to the
type of figure they are representing at the time. Eskimo children are also
less likely to portray the arm and shoulder performing some type of
movement, probably due to the loose parka, though this is not invariably
the case.

On the other hand, Eskimo children are more likely to portray with
exactness the nostrils, the bridge of the nose, and, when portrayed at
all, the thumb or fingers. The characteristic tendency of the Eskimo
children to show a mittened hand earns for them a greater credit on the
thumb opposition point and on the hand as distinct from fingers or arm in
the age group ten to thirteen inclusive. In this age group also the Eskimo
is more likely to draw the arms down at the side than held out stiffly
from the body. The Eskimo child is more likely to show the feet with a
wide stance, that is, with toes pointing apart, or in perspective in
either full-face or profile drawings. The Eskimo drawings include fewer
transparencies in these age groups, and a larger percentage of them earn
credit for showing a distinct costume, which of course follows from the
tendency to draw the parka, the everyday costume in this part of Alaska.

Aspects of the Eskimo drawings that are distinctive and that are not
apparent in the detailed scoring technique of the Goodenough method
include: a greater emphasis on the eyebrow, on the nostrils and nose (as
indicated above), and on general detail of facial features. There is some
evidence of a general decrease in quality of the drawing in adolescence.
This is not sufficiently great, however, to reveal itself markedly in the
trend of median scores as in the normative group. It is most noticeable in
the increased tendency to draw the facial features and hands sketchily.
Particularly among young Eskimo children there is a very distinct tendency
to draw shorter arms and legs than in the norm group. Here again there is
the possibility that the proportions of the body are distorted somewhat by
so many children depicting the figures in parkas.

Cultural factors influence drawings in many obvious ways such as type of
garb, vehicles, implements, and actions portrayed, but the nature of the
influence on a Goodenough-type point score is subtle, as illustrated in
the preceding quotations from Harris. Because such variations are often
inconsequential within the mainstream of American culture, there has been
a widespread temptation to use the Draw-A-Man as a culture-free
intelligence test. Nevertheless, as Harris properly insisted (522, p.
133), the data . . . suggest that the child's drawing of certain body
features or parts is influenced by garb, and possibly by other conditions
of living that call attention to particular parts or their functions.
Allowance would have to be made, both in scoring and in
the norms, for parts omitted in one of these cultures included in the
present scoring system. Such allowance would have to be worked out
empirically within each culture group. (italics added)

Goodenough and Harris (510), in their 1950 review, affirmed that although
the test may be unsuited to comparing children across cultures, it may
still rank children within a culture according to relative intellectual
maturity. In his 1963 publication (522, p. 133) Harris has further amended
this position to state that for the most valid results, the points of the
scale should be restandardized for every group having a distinctly
different pattern of dress, mode of living, and quality or level of
academic education. In Harris' judgment, "This conclusion virtually rules
out the scale for cross-cultural comparisons; indeed, psychologists
increasingly believe that mean differences among large, representative
samples drawn from varying cultures express the gross differences in
conceptual experience and training these groups have had. Further work, to
determine exactly which aspects of intellectual or conceptual maturity the
drawing task expresses, will be necessary to explain scientifically these
observed cultural differences."

No systematic research such as Harris delineated with respect to Eskimo
children has been done on the detailed effects of microvariations within
the American culture. Yet there is little reason to doubt that subtle
differences between urban and rural, industrial and suburban, warm climate
and cold, eastern and western, and other prominent contrasting situations
within the continental United States (to say nothing of Alaska and Hawaii)
might produce some significant variations. Undoubtedly, some of these
subcultural variations reflect ethnic factors, such as the superstitious
reluctance of some southwestern children of Mexican origin to draw eyes
because of fear of the evil eye.

It is also possible that secular trends, which are revealed in the
comparison of the 1926 and 1963 norms, may be occurring at differential
rates in different localities and segments of the culture and that these
also may subtly affect point scores. For example, the high-fashion
announcements of transparent garments for females not only aroused
different reactions among different segments of the population but also
received widely varying prominence in different localities. Although this
is an extreme example, it is nevertheless possible that some children
might draw the female figure appropriately reflecting a sophisticated
transparent garment and be penalized on the point score for what could be
considered a bright response.

Sex Differences

Both Goodenough (504) and Harris (522) have reported qualitative and
quantitative differences in drawings which are related to the sex of the
person doing the drawing. Harris' more recent work is of greater
relevance. He believes that these sex differences cannot be attributed to
differential selection of boys and girls according to intellect. Harris'
recent data show that sex differences in total point scores appear at an
early age and are considerably greater than those reported by Goodenough.
Harris found that for the drawing of a man, the mean score difference
favors girls by about one-half year of growth at each year of age, while
for the drawing of a woman, this difference is roughly equal to a full
year of growth. The Harris point scale, applied differentially to Man and
Woman drawings by boys and by girls, appears to reduce mean differences.

Sex differences in drawing point scores reflect differences in maturation,
cultural factors (including sex role and awareness), and perhaps some
degree of difference in drawing proficiency. However, it is believed that
these will be minimized by the adoption of the Harris norms and scoring
system and that the remaining residual error probably will be
inconsequential. Without doubt, the error will be smaller than that which
would result from the blanket use of one uniform scoring system for the
entire population.

PERSONALITY STUDY BY CHILDREN'S DRAWINGS

Although personality evaluation is not the primary reason for including
the Draw-A-Man Test in the Survey, a review of the potentialities for such
analysis is relevant. Since this topic has been covered more extensively
by Harris in his recent publication than in this review, the following
38
discussion is organized in relation to Harris Harris also hypothesizes that the male
summary. Below are eight widely accepted but not figure is more culturally stereotyped and
necessarily established generalizations concern- easier to draw than is the female figure.
ing personality measurement by childrens draw- He considers deviates from this norm to
ings. These were evaluated by Harris in his recent be psychologically different IYom non-
book (522, p. 52). As will be noted, several of the deviates. He also feels that the deviation
generalizations are rejected. has different meanings for the two sexes
and has unique, idiosyncratic meanings
1. lhwing interpretation is move valid when
to individuals. Since many deviations from
based on a seyies of a subject~sprotocols the norm occur and since the meaning of
than when based on one dnzwing. Despite
such deviations is as yet unknown, it is
the lack of clear-cut empirical evidence
unlikely that the principle (the figure
on this issue, Harris equates additional
drawn first relates to the image the
pictures as having the effect of increasing
drawer holds of his own sex role) is uni-
the length and therefore the reliability of
versally valid. Therefore, even though
the test. From this logical vie~int, he
about 86 percent of boys and 65 percent
considers it justified.
of girls have been reported to draw their
2. Drawings are most usefil for psychologi-
own sex first, it is not pxsible to for-
cal analysis when teamed with other avail- mulate any reliable interpretation for
able information about the child. This, too, those who do not.
is a logically sound principle, especially
5. A child adopts a schema or style ofdraw-
when it is the content of drawings alone
ing which is peculiar to him and which be-
that is being used for psychological in-
comes highly sigmjlcant psychologically.
terpretation.
Most of the evidence is opposed to this and
3. Free drawings are more meanin@l psy-
suggests rather that developmental pat-
chologically than drawings of assigned
terns do exist among childrens drawings.
topics. This is probably true for certain
6. The manner in which certmn elements are
purpo~es, such as exploration of interests,
povtrayed in drawi~s may be used as
but systematic comparison of individuals,
signs of certain psychological states or
as in a national survey, requires control
conditions in the artist. In agreement with
of the task.
Harris, the present writer regards this
4. When a human figuve drawing is assigned,
statement as one of the eternal, unful-
the sex of the figwe first drawn relates
filled wishful myths of the depth psychol-
to the im~e the drawer holds of his own
ogist. Two particular statements by
sex role. Of the studies summarized in
Harris are relevant to possible further
Appendix III, those most relevant to the
research in this frustrating area. First,
study of children ages 6 to 12 years are
whether or not signs are selected by an
as follows: 512, 537-539, 541, and 542.
empirical or deductive procedure, there
According to Brown and Tolor (541), nor-
is still the question whether form or con-
mal individuals of both sexes tend to draw
tent will provide the cues. Size, quality
their own sex first, while persons with
or texture of line, degree of angularity,
behavior disorders draw the opposite sex
pattern or shape, and placement on the
first. Harris agrees that most children of
page are often thought to be highly signifi-
either sex will draw their own sex first
cant avenues for projecting unconscious
when asked to draw a person. He further
motives or needs. References 512, 521,
elalxmates that as girls grow older there
537, 540, 543, 564, and 566 support this
is an increasing tendency for them to draw
view, but neither form nor content signs
a male figure. This, he feels, reflect~ both
of unequivocal value have thus far been
the cultural preference given to the male
validated. Thus, Harris second state-
role and an increasing dissatisfaction with
ment, that useful and valid signs leading
the female role.
to dependable conclusions are, for the

39
most part, still to be ascertained, dis- Goodenotigh. The reliability of the point scale
poses of this generalization. holds up in the mentally retarded range (523
7. Drawirgs must be interpreted as wholes and 524), and scarer agreement is high (526).
rather thun segmentally or analytically. One problem observed in interscorer com-
This, too, has been a strong sentimental parisons by the reviewer which is mentioned in
favorite, but the evidence is mostly the connection with the Goodenough vs. the Good-
other way, particularly in personality enough-Harris comparison is that while the re-
assessment. In fact, the history of psy- sults of two scorers may show a very high
chometric progress has been away from correlation, there may nevertheless be a constant
global analysis toward specific analysis, difference in score levels between them, reflecting
has favored linear over curvilinear rela- individual idiosyncrasies of their interpretations.
tions, and generally has demonstrated that The safest method of coping with such constant
quantitative procedures are more valid, errors, in a survey in which a number of scorers
even if less spectacular, than those based may be used for different segments of the total
on scorer judgment. sample, would be to have at least two people
score every test and to use the average of the
Harris has cited analytic studies of com-
two for record.
ponent qualities of childrens drawings,
by Martin and Damrin and by Stewart
Correlations With Other Tests
(522, p. 56), which suggest that drawings
are actually appraised in terms of a few
Correlations of the Draw-A-Man with the
general dimensions, although they may be
Stanford-Binet are summarized in table 7, and
rated on a number of specifically defined
its correlations with other tests, in table 8.
elements or qualities. Harris believes
Sire@ tables appear in Harris (522, pp. 96 and
that these studies lend credence to the
97). With few exceptions, correlations of the
belief that broad, dimensional evaluations
Draw-A-Man with the Stanford-Binet (in which
(rather than highly particularistic ones),
coefficients are based on IQs) reported by other
based on such analytic results, may be
investigators have averaged lower than those re-
made more readily and more reliably. He
ported by Goodenough in 1926 (504). The ex-
also believes that they suggest the direc-
ceptions found are Williams (505), Israelite
tion these quantitatively and factorially
(562), White (565), and Ellis (unpublished masters
defined global ratings may take. Their
colloquim paper, University of Minnesota, 11953),
findings in relation to personality quali-
whose data agree substantially with those of
ties, however, are not of such magnitude as
Goodenough.
to support the use of drawings in diagnos-
Unfortunately, most of the publications cited
ing individual cases.
which involve correlations of the Draw-A-Man
8. The use of color in dyawings can be sig-
with the Stanford- Binet and a number of other
nificant for studying personality. This is
tests are based on very small samples (rarely
another popular clinical belief, on which
more than 100), are usually not representative
the empirical evidence is equivocal,
of their respective subuniverses, and do not
always present assurance of testing under standa-
RESEARCH ON THE rd conditions. As a result, the collection of
GOODENOUGH TEST correlation coefficients can only be interpreted
very generally.
Reliability Studies These results indicate a considerable as-
sociation between the Draw-A-Man Test and
Table 6 summarizes the reliability coeffi- general intelligence tests, such as the Stanford-
cients reported for the Draw-A-Man Test in the Binet and the WISC, which measure mental
studies included in this review (523-528). In maturity. The common variance is probably about
general, the reliabilities obtained by independent 50 percent. Maturationally, the original rationale
investigators have confirmed those reported by presented by Goodenoughthat drawing point

Table 6. Studies reporting reliability coefficients of human figure drawing tests

Investigator | Year | Test and scoring method | Subjects (a) | Age range | Number (T M F) | Type of coefficient | Reliability coefficient (r)
Yepsen (523) | 1929 | Goodenough | Feebleminded | 9.0 - 18.2 | 37 37 | |
Brill (525) | 1935 | Goodenough | Feebleminded | N.R. | N.R. --- --- | Test-retest |
 | | | | | 71 71 | Administration 1-2 | 0.77
 | | | | | 65 62 | Administration 2-3 | 0.80
 | | | | | 67 67 | Administration 1-3 | 0.68
Albee and Hamlin (579) | 1949 | Human Figure Drawing, Paired Comparisons | VA Mental Hygiene Clinic. Range: normals to psychotics | N.R. | N.R. --- --- | Interjudge | 0.95
 | | | | | | Spearman-Brown | 0.98
Albee and Hamlin (581) | 1950 | Machover | Neurotic, schizophrenic, normal | N.R. | 72 --- --- | Interjudge | 0.89
Hinrichs (586) | 1935 | Goodenough | Normals | 10-18 years | 81 --- --- | Split-half, Spearman-Brown | 0.88-0.90
Herron (532) | 1957 | Goodenough | Normals, Grades 3 and 4 | 113 months (mean) | 16 16 | Test-retest, group A (b) |
 | | | | | | Administration 1-2 | 0.52
 | | | | | | Administration 2-3 | 0.51
 | | | | | | Administration 1-3 | 0.27
 | | | | | 28 28 | Test-retest, group A (b) |
 | | | | | | Administration 1-2 | 0.79
 | | | | | | Administration 2-3 | 0.69
 | | | | | | Administration 1-3 | 0.85
 | | | | | 24 24 | Test-retest, group B (b) |
 | | | | | | Administration 1-2 | 0.92
 | | | | | | Administration 2-3 | 0.40
 | | | | | | Administration 1-3 | 0.86
 | | | | | 15 15 | Test-retest, group B (b) |
 | | | | | | Administration 1-2 | 0.85
 | | | | | | Administration 2-3 | 0.73
 | | | | | | Administration 1-3 | 0.63
McCurdy (527) | 1947 | Goodenough | Normals | 13.9 months (mean) | 59 59 | Test-retest | 0.69
Buhrer, de Navarro, and Velasco (511) | 1951 | Goodenough | Normals, Spanish-speaking | 7-14 years | 1,936 --- --- | N.R. | 0.97
Frankiel (518) | 1957 | Goodenough and Frankiel | Normals | | 200 100 100 | |
 | | | | 7 years | 100 50 50 | Intrajudge | 0.83
 | | | | 7 years | 100 50 50 | Interjudge | 0.71-0.84
 | | | | 12 years | 100 50 50 | Intrajudge | 0.89
 | | | | 12 years | 100 50 50 | Interjudge | 0.81-0.86
McHugh (508) | 1945 | Goodenough | Normals, preschool | ;2.0 months (mean) | 83 --- --- | Test-retest | 0.46 (IQ), 0.51 (MA)
Goodenough (504) | 1926 | Goodenough | Normals | 4-12 years | 5,627 --- --- | |
 | | | | | | Split-half, Spearman-Brown | 0.77
 | | | | | | Test-retest, Grade 1 only | 0.94
Williams (505) | 1935 | Goodenough | Normals | 3-15 years | 100 50 50 | |
Smith (506) | 1937 | Goodenough | Normals | | 300 --- --- | Test-retest |
 | | | | 6 years | 100 --- --- | | 0.91
 | | | | 7 years | 100 --- --- | | 0.91
 | | | | 8 years | 100 --- --- | | 0.95
 | | | | 9 years | 100 --- --- | | 0.96
 | | | | 10 years | 100 --- --- | | 0.93
 | | | | 11 years | 100 --- --- | | 0.95
 | | | | 12 years | 100 --- --- | | 0.92
 | | | | 13 years | 100 --- --- | | 0.92
 | | | | 14 years | 100 --- --- | | 0.94
 | | | | 15-16 years | 100 --- --- | | 0.84
McCarthy (526) | 1944 | Goodenough | Normals, Grades 3 and 4 | N.R. | 3g6 --- --- | |
 | | | | | | Intrascorer | 0.94
 | | | | | | Interscorer | 0.90
 | | | | | | Test-retest | 0.6g
 | | | | | | Odd-even, Spearman-Brown | 0.89
McHugh (529) | 1952 | Goodenough | Normals, Grade 3 | N.R. | 118 58 60 | |
 | | | | | | Intrajudge | 0.98
 | | | | | | Interjudge | 0.97
Stone (582) | 1952 | Machover | Normals, Grade 6 | N.R. | 492 --- --- | |
 | | | | | | Split-half, first drawing | 0.32
 | | | | | | Split-half, second drawing | 0.76
 | | | | | | Test-retest, drawings 1 and 2, males | 0.56
 | | | | | | Test-retest, drawings 1 and 2, females | 0.39
 | | | | | | Test-retest, drawings 1 and 2, total | 0.50

(a) Designations of subjects are always white Americans unless otherwise specified.
(b) Indicates conditions preceding Draw-A-Man testing.

Group | Initial test | Second test | Third test
A | Satisfying activity | Satisfying activity | Frustrating activity
B | Frustrating activity | Frustrating activity | Satisfying activity

NOTES: Unless otherwise indicated, it is assumed that reliability coefficients were Pearson Product-Moment and were computed from raw scores.
T = total population; M = male; F = female; N.R. = not reported; IQ = intelligence quotient; MA = mental age.
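
Several entries in table 6 are split-half or odd-even coefficients stepped up with the Spearman-Brown formula, and the same formula underlies the argument that additional drawings lengthen the test and so raise its reliability. The following is a minimal sketch of that standard formula; the function name and the input values are illustrative assumptions and are not taken from any of the cited studies.

```python
def spearman_brown(r, k=2.0):
    """Predicted reliability when a test with reliability r is lengthened k-fold.

    k = 2 is the usual split-half correction (half-test correlation to full test).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return k * r / (1.0 + (k - 1.0) * r)


# Hypothetical half-test correlations, chosen only for illustration.
for half_r in (0.50, 0.65, 0.80):
    print(f"half-test r = {half_r:.2f} -> full-test estimate = {spearman_brown(half_r):.3f}")

# Hypothetical case: scoring two drawings instead of one (k = 2) when a
# single drawing has reliability 0.70.
print(f"two drawings: {spearman_brown(0.70, k=2):.3f}")
```

The formula assumes the added material is parallel to the original, which may not hold for a second drawing of a different figure.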
Table 7. Studies reporting correlations between the Goodenough and Stanford-Binet

Investigator | Year | Subjects (a) | Age range | T | M | F | Correlation, IQ | Correlation, MA
McElwee (524) | 1932 | Retarded | 14 years | L5 | --- | --- | N.R. | 0.72
Rohrs and Haworth (569) | 1962 | Retarded | | 46 | 23 | 22 | 0.28 (pMy | N.R.
 | | Familial | 12.57 years (mean) | 20 | 10 | 10 | N.R. | N.R.
 | | Organic | 9.2 years (mean) | 26 | 13 | ~~ | N.R. | N.R.
Birch (550) | 1949 | Retarded | 10-6 - 16-3 | 68 | 43 | 25 | 0.62 | 0.69
Israelite (562) | 1936 | Feebleminded | 6-3 - 40 years | 256 | 162 | 94 | N.R. | 0.71
Johnson, Ellerd, and Lahey (592) | 1950 | State hospital population | 6-9 - 17 years | 209 | --- | --- | 0.48 | N.R.
White (565) | 1945 | | | 141 | --- | --- | |
 | | Feebleminded | 8-0 - 19-4 | 47 | --- | --- | 0.63 |
 | | Epileptic | 8-0 - 19-4 | 47 | --- | --- | 0.52 | ;:$
 | | Normal | 4-8 - 10-6 | 47 | --- | --- | 0.71 | N.R.
Havighurst and Janke (544) | 1944 | Normals | 10 years | 114 | --- | --- | 0.50 | N.R.
Fowler (531) | 1953 | Normals | 9-2 - 12-1 | 41 | 19 | 22 | 0.41 | N.R.
Lessing (551) | 1961 | Normals | 8-9 years | 23 | 21 | 2 | 0.51 | N.R.
McHugh (549) | 1945 | Normals | 64 months (mean) | 90 | 43 | 47 | 0.41 | 0.45
Thompson and Finley (552) | 1963 | Guidance clinic referrals | 5-9 years | 164 | 81 | 83 | 0.67 (:fi | N.R.
Goodenough (504) | 1926 | Normals | 4-12 years | 627 | --- | --- | 0.74 | 0.76
Williams (505) | 1935 | Normals | 3-15 years | 100 | 50 | 50 | 0.65 | 0.80

(a) Designations of subjects are always white Americans unless otherwise specified.

NOTES: Unless otherwise indicated, all correlations are Pearson Product-Moment, with the Stanford-Binet, Form L.
T = total population; M = male; F = female; IQ = intelligence quotient; MA = mental age; N.R. = not reported.

scores largely reflect the ability to form concepts, is supported by the network of correlations compiled from a variety of tests and by studies such as that of McHugh (549), which analyzed Draw-A-Man items. McHugh computed biserial correlations of Goodenough items with the Stanford-Binet and reported positive correlations for 29 items; the remainder were zero or slightly negative. The highest correlations, which support the conceptual interpretation stated, were the following:

    Item                                   Correlation
    2   (legs present)                     0.48
    7a  (eyes present)                     0.47
    9a  (clothing present)                 0.40
    11b (leg joint shown)                  0.35
    12e (proportion, two dimensions)       0.54
    13  (heel shown)                       0.35
Table 8. Studies reporting correlations between the Goodenough and other measures

Investigator | Year | Test or criterion variable | Subjects (a) | Age range | T | M | F | Correlation
Havighurst, Gunther, and Pratt (558) | 1946 | Arthur Point Scale of Performance Tests (IQ) | American Indians | 6-11 years | 294 | --- | --- |
 | | | Zuni | | 42 | --- | --- | 0.10
 | | | Hopi | | 78 | --- | --- | 0.21
 | | | Navaho | | 47 | --- | --- | 0.23
 | | | Sioux | | 53 | --- | --- | 0.33
 | | | Papago | | 74 | --- | --- | 0.64
Albee and Hamlin (579) | 1949 | Clinical ratings of adjustments | VA Mental Hygiene Clinic. Range: normals to psychotics | N.R. | N.R. | --- | --- | 0.62 (rank order); 0.64 (product-moment)
Havighurst and Janke (544) | 1944 | Cornell-Coxe Performance Ability Scale | Normals | 10 years | 114 | --- | --- | 0.63
Havighurst, Gunther, and Pratt (558) | 1946 | Cornell-Coxe Performance Ability Scale | Normals | 6-11 years | 66 | Zt | 3[ | 0.63
Hinrichs (586) | 1935 | Furfey Revised Scale for Measuring Developmental Age in Boys | Delinquents | 9-18 years | 425 | --- | --- | 0.35
Johnson (557) | 1953 | Hoffman Bilingual Schedule | Spanish bilingual (U.S.) | N.R. | 30 | --- | --- | 0.05
Boehncke (546) | 1938 | Leiter International Performance Scale | Normals | 5-12 years | 257 | --- | --- | 0.83
Ansbacher (553) | 1952 | MacQuarrie Test for Mechanical Ability | Normals | 10 years | 100 | --- | --- |
 | | Tracing | | | --- | --- | --- | 0.34
 | | Tapping | | | --- | --- | --- | 0.23
 | | Dotting | | | --- | --- | --- | 0.16
Brenner and Morse (517) | 1956 | Metropolitan Readiness Tests, Number Readiness (IQ) | Normals | 4-7 - 5-11 | 16 | 7 | 9 | 0.5L? (rank order)
Havighurst and Janke (544) | 1944 | Revised Minnesota Paper Form Board Test, Form AR | Normals | 10 years | 110 | --- | --- | 0.48
Brenner and Morse (517) | 1956 | Monroe Visual subtest (IQ) | Normals | 4-7 - 5-11 | 16 | 7 | 9 | 0.64 (rank order)
Hornowski (547) | 1961 | Moray House Picture Intelligence Test | Normals (Scotland) | N.R. | N.R. | --- | --- | 0.34 (M); 0.49 (F)
Johnson (557) | 1953 | Otis Self-Administering Tests of Mental Ability | Spanish bilingual (U.S.) | N.R. | 30 | --- | --- | -0.02
Brenner and Morse (517) | 1956 | Picture Judgment of Maturity (IQ) | Normals | 4-7 - 5-11 | 16 | 7 | 9 | 0.64 (rank order)
 | | Pintner-Cunningham Primary Mental Test (MA) | | | --- | --- | --- | 0.66 (rank order)
Shirley and Goodenough (575) | 1932 | Pintner Non-Language Primary Mental Test (IQ) | Deaf | 5+ years | 229 | --- | --- | 0.33
Norman and Midkiff (559) | 1955 | Progressive Matrices | Normals, American Indian | 6-6 - 15-6 | 96 | --- | --- | 0.24 (IQ); 0.35 (MA)
Harris (548) | 1959 | Progressive Matrices | Normals | 5-1 - 6-1 | 98 | 45 | 53 | 0.22
Johnson (557) | 1953 | Reaction time | Spanish bilingual (U.S.) | N.R. | 30 | --- | --- | 0.43
Brenner and Morse (517) | 1956 | Sangren Information Mental Age | Normals | 4-7 - 5-11 | 16 | 7 | 9 | 0.67 (rank order)
Buhrer, de Navarro, and Velasco (511) | 1951 | School grades | Normals, Spanish-speaking | 7-14 years | 1,936 | --- | --- | -0.04; -0.10; -:.;; (as printed; alignment with the rows below not recoverable)
 | | Mathematics | | | --- | --- | --- |
 | | Language | | | --- | --- | --- |
 | | Language and Mathematics | | | --- | --- | --- |
 | | Drawing | | | --- | --- | --- |
Fowler (531) | 1953 | Social Distance Scale (Fowler) | Normals | 9-2 - 12-1 | 41 | 19 | 22 | 0.40
Shirley and Goodenough (575) | 1932 | Stanford Achievement, Education (quotient) | Deaf | 5+ years | 41 | --- | --- | 0.34
Ansbacher (553) | 1952 | SRA Primary Mental Abilities | Normals | 10 years | 100 | --- | --- |
 | | Word Vocabulary | | | --- | --- | --- | 0.23
 | | Picture Vocabulary | | | --- | --- | --- | 0.19
 | | Total Verbal Meaning | | | --- | --- | --- | 0.26
 | | Space | | | --- | --- | --- | 0.38
 | | Word Grouping | | | --- | --- | --- | 0.28
 | | Figure Grouping | | | --- | --- | --- | 0.34
 | | Total Reasoning | | | --- | --- | --- | 0.40
 | | Perception | | | --- | --- | --- | 0.37
 | | Number | | | --- | --- | --- | 0.24
 | | Total Nonreading | | | --- | --- | --- | 0.45
 | | Total Score | | | --- | --- | --- | 0.41
 | | S+R+P | | | --- | --- | --- | 0.48
Harris (548) | 1959 | SRA Primary Mental Abilities | Normals | 5-1 - 6-1 | 98 | 45 | 53 |
 | | Verbal | | | --- | --- | --- | 0.50
 | | Perception | | | --- | --- | --- | 0.44
 | | Quantitative | | | --- | --- | --- | 0.54
 | | Motor | | | --- | --- | --- | 0.40
 | | Space | | | --- | --- | --- | 0.51
Brenner and Morse (517) | 1956 | Teacher rank of school readiness | Normals | 4-7 - 5-11 | 16 | 7 | 9 | 0.69 (rho)
Britton (536) | 1954 | Warner's Index of Status Characteristics | Normals | 9-11 years | 232 | 102 | 130 | 0.11
Hanvik (593) | 1953 | WISC Full Scale (IQ) | Psychiatric patients | 5-12 years | 25 | --- | --- | 0.18 (rank order)
Rohrs and Haworth (569) | 1962 | Wechsler Intelligence Scale for Children (IQ) | Retarded, familial and organic | N.R. | 46 | 23 | 23 |
 | | Verbal Scale | | | --- | --- | --- | 0.28
 | | Performance Scale | | | --- | --- | --- | 0.53
 | | Full Scale | | | --- | --- | --- | 0.46

(a) Designations of subjects are always white Americans unless otherwise specified.

NOTES: All correlation coefficients are Pearson Product-Moment unless otherwise specified.
T = total population; M = male; F = female; IQ = intelligence quotient; N.R. = not reported; MA = mental age.
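
The coefficients in tables 7 and 8 can be read as shared (common) variance by squaring them. The short sketch below does this for three IQ correlations taken from table 7 and also shows the correlation implied by a common variance of 50 percent; the function names are illustrative assumptions.

```python
import math


def shared_variance(r):
    """Proportion of variance two measures have in common, given their correlation r."""
    return r * r


def r_for_shared_variance(proportion):
    """Correlation (in absolute value) implied by a given proportion of common variance."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return math.sqrt(proportion)


# IQ correlations with the Stanford-Binet as listed in table 7.
for label, r in [
    ("Goodenough (504)", 0.74),
    ("Williams (505)", 0.65),
    ("McHugh (549)", 0.41),
]:
    print(f"{label}: r = {r:.2f}, common variance = {shared_variance(r):.0%}")

print(f"50 percent common variance corresponds to r of about {r_for_shared_variance(0.5):.2f}")
```

Read this way, a common variance of about 50 percent matches the upper range of the Stanford-Binet coefficients in table 7 more closely than the lower ones.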

It is of interest that a careful survey of the literature spanning a period of over 40 years fails to disclose any definitive pattern of the particular components of mental maturity measured by the Goodenough test. Harris believes that this may be attributed to the fact that such components are themselves not clearly differentiated in young children. The correlational results do, however, suggest strongly that the Draw-A-Man is more highly associated with factors measured by performance tests than with verbal abilities.

In the Health Examination Survey, correlations of the Draw-A-Man with WISC and, more particularly, with the short form composed of WISC Vocabulary and Block Design would be most relevant. Table 3 includes three reports (115, 130, and 224) which mention correlations between the Draw-A-Man Test and the Full Scale IQ of the WISC. Of these, none mentions correlations between the Draw-A-Man and the short form of the WISC. Harris' summary also cites the following unpublished data by Ellis.

                            Correlation with:
    Age         Number      FS      VS      PS
    8 years                 0.70    0.77    0.67
    9 years                 0.67    0.63    0.59
    10 years    20          0.24    0.17    0.26
    11 years    17          0.50    0.45    0.46
    12 years    19          0.62    0.50    0.68
    13 years                0.13    0.05    0.15

Disregarding the 13-year-old group, since it is outside the effective range of the test as well as outside the age range of the Survey, Ellis' results for the total sample of 106 have an average correlation with the WISC Full Scale IQ of 0.57. Again, this is higher than the correlations reported by others.

In summary, it appears that the WISC correlations with the Draw-A-Man Test are substantial but lower than those of the Stanford-Binet. They are, however, higher with the Performance Scale than with the Verbal Scale (except in Ellis' two lowest grades).

In comparing Draw-A-Man scores with WISC Full Scale estimates, there is no reason to assume any systematic differences in mean levels across the entire population. However, for statistical estimation as well as analytic purposes, it is most appropriate to compute the regression of Draw-A-Man on Voc., BD, and Total Score and then to work with differences between regressed and actual scores for discrepancy analysis, rather than with differences between scaled scores.

In view of the Draw-A-Man's sensitivity to cultural variations, cases in which there are large discrepancies between the Draw-A-Man and the WISC should be thoroughly evaluated in the light of the WRAT scores and other information from the Health Examination Survey. Although Harris' summary and the reports consulted in this review have suggested a number of promising diagnostic score patterns, none of them seem well enough established to be adopted.

THE HARRIS REVISION OF THE
GOODENOUGH TEST

Dale Harris' 1963 publication (522), which he has named the Goodenough-Harris Drawing Test, is a thorough revision and extension of Goodenough's test. As already mentioned, it bases the lengthier point-score scales on both drawings of the male figure and drawings of the female figure, for which it provides separate norms for boys and for girls. A third picture, in which the child draws a representation of himself, has not been empirically standardized.

Standardization of the Harris revision was completed on a total sample of 2,965 children, representative of four major geographic areas of the country. The sample was also representative of the 1960 census distribution of fathers' occupations. Total point scores are converted to standard scores with a mean of 100 and a standard deviation of 15. Conceptually, these are equivalent to the WISC deviation IQs. The new scales overlap extensively with the original point scales, and Harris found that children now earn substantially higher scores when the 1963 norms, rather than the 1926 ones, are utilized. The explanation for this phenomenon is not clear. The new norms do appear to take into account technical and social changes which have occurred between 1926 and 1963. They also offer the advantages of greater length (hence, higher reliability) and more adequate provision for sex differences.

Comparison of Goodenough and
Goodenough-Harris Scores

It seems desirable to inquire whether the Harris scales and norms could be used to score human figure drawing obtained in the Health Examination Survey. As noted above, in this Survey only one picture is drawn by each child, who is instructed, "Make a picture of a person. Make the very best person that you can." To use the Harris scales in the Survey it would be necessary for the scorer to decide whether each drawing was of a Man or of a Woman.

A sample of 200 drawings, 100 drawn by boys and the other 100 drawn by girls, was taken at random from the Survey files. These drawings were then carefully scored using Harris' norms, and the scores obtained were compared with the scores the drawings had already received on the 1926 Goodenough scale. (Scoring by the 1926 method is completed in the field by Survey staff psychologists.)

Of the 200 cases, 195 were usable. Three drawings were rejected because they contained a face only, and for two cases age had been inadvertently omitted, precluding the computation of standard scores. For the remaining drawings, neither scorer reported any difficulty in identifying the sex represented, and their agreement on this was perfect.
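
The dependence of standard scores on age can be illustrated with a simple linear conversion to a scale with a mean of 100 and a standard deviation of 15. This is only a sketch under stated assumptions: the norm values below are hypothetical placeholders, not Harris' published norms, and the published conversion also depends on the child's sex and on whether a Man or a Woman was drawn.

```python
# Hypothetical age-group norms: age in years -> (mean, SD) of point scores.
# These numbers are placeholders for illustration, NOT Harris' 1963 norms.
HYPOTHETICAL_NORMS = {
    8: (26.0, 7.0),
    9: (30.0, 8.0),
    10: (34.0, 8.5),
}


def standard_score(point_score, age_years, norms=HYPOTHETICAL_NORMS):
    """Convert a point score to a standard score (mean 100, SD 15).

    Returns None when age is missing, since no norm group can be selected.
    """
    if age_years is None:
        return None
    norm_mean, norm_sd = norms[age_years]
    return 100.0 + 15.0 * (point_score - norm_mean) / norm_sd


print(standard_score(30, 9))     # score at the hypothetical age-9 mean
print(standard_score(38, 9))     # one hypothetical SD above that mean
print(standard_score(30, None))  # age omitted: no standard score possible
```

The same point score maps to different standard scores at different ages, which is why the two cases lacking age could not be used.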

Table 9. Means of Goodenough-Harris and Goodenough variables and correlations between scorers and between methods for total sample and six subsamples

Variable | Total group (N=195) | Drawings of a woman (N=94) | Drawings of a man (N=101) | Woman, by boys (N=17) | Woman, by girls (N=77) | Man, by boys (N=83) | Man, by girls (N=18)
1. Goodenough-Harris point (A) | 30.75 | 31.41 | 30.13 | 28.12 | 32.14 | 30.20 | 29.78
2. Goodenough-Harris SS (A) | 96.59 | 95.89 | 97.24 | 93.06 | 96.52 | 97.29 | 97.00
3. Goodenough-Harris point (B) | 36.02 | 36.62 | 35.47 | 34.71 | 37.04 | 35.54 | 35.11
4. Goodenough-Harris SS (B) | 105.97 | 105.15 | 106.73 | 104.06 | 105.39 | 106.63 | 107.22
1+3. Average Goodenough-Harris point (A, B) | 33.39 | 34.02 | 32.80 | 31.42 | 34.59 | 32.87 | 32.45
2+4. Average Goodenough-Harris SS (A, B) | 101.28 | 100.52 | 101.99 | 98.56 | 100.96 | 101.96 | 102.11
5. Goodenough point | 26.38 | 25.57 | 27.14 | 24.29 | 25.86 | 27.20 | 26.83
6. Subject's CA | 115.01 | 111.89 | 117.92 | 118.35 | 110.47 | 118.10 | 117.11
7. Goodenough MA | 114.61 | 112.48 | 116.59 | 108.88 | 113.27 | 116.71 | 116.06
8. Goodenough IQ | 101.23 | 102.27 | 100.27 | 92.59 | 104.42 | 100.10 | 101.06
r13 | 0.90 | 0.89 | 0.91 | 0.82 | 0.91 | 0.90 | 0.95
r24 | 0.90 | 0.88 | 0.91 | 0.79 | 0.89 | 0.92 | 0.83
r28 | 0.78 | 0.76 | 0.81 | 0.60 | 0.78 | 0.87 | 0.47
r48 | 0.81 | 0.78 | 0.84 | 0.58 | 0.82 | 0.89 | 0.48

NOTE: N = number; A = scorer A; B = scorer B; SS = standard score; CA = chronological age; MA = mental age; r = correlation.
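
The averaged rows of table 9 (1+3 and 2+4) and the size of the constant difference between the two scorers can be checked directly from the printed total-group means. The sketch below does that, and then uses hypothetical point scores to show why a constant scorer offset leaves the interscorer correlation untouched. Function and variable names are illustrative assumptions.

```python
# Total-group means as printed in table 9 (N = 195).
TABLE9_TOTAL = {
    "point": {"A": 30.75, "B": 36.02},
    "standard": {"A": 96.59, "B": 105.97},
}


def offset_and_average(a, b):
    """Return (B minus A, mean of A and B) for one pair of scorer means."""
    return b - a, (a + b) / 2


def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


for kind, means in TABLE9_TOTAL.items():
    offset, average = offset_and_average(means["A"], means["B"])
    print(f"{kind}: B - A = {offset:.2f}, two-scorer average = {average:.2f}")

# Hypothetical point scores: scorer B is a constant 5 points above scorer A.
hypothetical_a = [18, 24, 29, 33, 41]
hypothetical_b = [s + 5 for s in hypothetical_a]
print(f"r with a constant offset = {pearson_r(hypothetical_a, hypothetical_b):.2f}")
```

Because the printed means are rounded to two decimals, values recomputed this way may differ from the report's figures in the last digit.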

The usable sample of 195 cases consisted of 100 boys and 95 girls. Of these, 17 boys drew a Woman figure and 18 girls drew a Man figure. The remaining 82 percent of the total group (83 percent of the boys and 81 percent of the girls) drew their own sex.

The following eight variables were recorded for all 195 cases:

1. Harris method, point score, scorer A
2. Harris method, standard score, scorer A
3. Harris method, point score, scorer B
4. Harris method, standard score, scorer B
5. Goodenough point score
6. Subject's chronological age in months
7. Goodenough mental age
8. Goodenough IQ

Means, standard deviations, and intercorrelations were computed for the total sample and for the following six subsamples: (1) Woman drawings (N=94), (2) Man drawings (N=101), (3) Woman drawings by boys (N=17), (4) Woman drawings by girls (N=77), (5) Man drawings by boys (N=83), and (6) Man drawings by girls (N=18). A summary of the most relevant results, for all seven sample combinations, appears in table 9.

The correlations between the two scorers (r13 and r24) are high despite a systematic tendency for scorer B's results to exceed those of scorer A (they average 5.25 above scorer A on point score and 9.38 higher on standard score). As a more stable estimate of the Harris scores for comparison with the Goodenough, average mean scores for the two scorers were computed. These appear in table 9 between variables 4 and 5.

Although agreement between the two scorers is generally high, the lowest correlations were found for the 17 boys who elected to draw a female figure (subsample 3). The standard score correlations for the 18 girls who elected to draw a male figure (subsample 6) are also comparatively low. These opposite-sex drawings also reflect the lowest correlations between Harris and Goodenough IQs for both scorers (r28 and r48). Thus scorer agreement is lowest in opposite-sex drawings, and the results for these show the poorest agreement, correlation-wise, between the Goodenough-Harris and Goodenough IQs. It is possible that these differences could be eliminated by further training of scorers. Certainly these results illustrate the importance of quality control of scoring. The averaging process is also highly recommended if systematic scorer differences cannot be eliminated.

The principal support, indicating an advantage of the Goodenough-Harris scale, appears in the comparison of mean scores for boys and girls on Woman and Man drawings as abstracted in table 10. In accordance with Harris' own findings, girls score higher than boys, but the differences are greater on the Goodenough scale than on the Goodenough-Harris scales and are greater on the Woman drawings than on the Man drawings. The greatest discrepancy and resulting scoring penalty by the Goodenough scale occurs in the case of the 17 percent of boys (subsample 3) who elected to draw a Woman. At the same time, the 81 percent of girls (subsample 4) who elected to draw their own sex received disproportionately high scores on the Goodenough, in comparison with the mean levels on the Goodenough-Harris. The Goodenough-Harris scores are higher than the Goodenough for both sexes on the Man drawing.

The problems with the Woman drawing clearly support the observation, first pointed out by Goodenough and strongly reiterated by Harris, that the female figure is more culture-bound than the male, is less stereotyped, and is more susceptible to individual interpretation. Although the data on which the present analysis is based are limited, they do suggest that the Harris revision does less violence to the female figure than does the Goodenough scoring and that, in general, the Harris revision is more adequate for opposite-sex drawings.

These data, which indicate a superiority of girls over boys in drawing scores, a tendency for the Goodenough-Harris scores to be higher than the Goodenough scores, and a tendency for girls who draw male figures to be older than girls who draw their own sex (while no such differentiation occurs among boys), are all consistent with trends reported elsewhere in the literature. However, the most important argument in favor of using the Goodenough-Harris scoring system is that the variation of mean scores among the four subsamples is thereby greatly reduced around a mean of 100. This range is from 92.59 to 104.42 (11.83) on the Goodenough and from 98.56 to 102.11 (3.55) on the Goodenough-Harris. Although the

Table 10. Comparison of Goodenough-Harris and Goodenough mean IQs for boys and girls
on same-sex and opposite-sex drawings

                      Drawing of a woman                      Drawing of a man
              ----------------------------------    ----------------------------------
Sex           Goodenough  Goodenough-  Difference   Goodenough  Goodenough-  Difference
                  IQ       Harris IQ                    IQ       Harris IQ

Boys --------    92.59       98.56       +5.97        100.10      101.96       +1.86
Girls -------   104.42      100.96       -3.46        101.06      102.11       +1.05
Difference --    11.83        2.40        ---           0.96        0.15        ---
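
The differences and ranges quoted from table 10 can be rechecked with a short sketch. The means are transcribed from the table. The dictionary layout and function names are illustrative assumptions, not part of the report.

```python
# Mean IQs transcribed from table 10 as (Goodenough, Goodenough-Harris) pairs
TABLE_10 = {
    ("boys", "woman"): (92.59, 98.56),
    ("girls", "woman"): (104.42, 100.96),
    ("boys", "man"): (100.10, 101.96),
    ("girls", "man"): (101.06, 102.11),
}

GOODENOUGH, GOODENOUGH_HARRIS = 0, 1  # positions within each pair


def sex_difference(drawing, scale):
    """Girls' mean minus boys' mean for one drawing type and one scale."""
    girls = TABLE_10[("girls", drawing)][scale]
    boys = TABLE_10[("boys", drawing)][scale]
    return round(girls - boys, 2)


def range_of_means(scale):
    """Spread between the highest and lowest of the four subsample means."""
    values = [pair[scale] for pair in TABLE_10.values()]
    return round(max(values) - min(values), 2)


for drawing in ("woman", "man"):
    print(drawing,
          sex_difference(drawing, GOODENOUGH),
          sex_difference(drawing, GOODENOUGH_HARRIS))
print("range of means:",
      range_of_means(GOODENOUGH),
      range_of_means(GOODENOUGH_HARRIS))
```

The computed values reproduce the "Difference" row of the table and the two ranges (11.83 and 3.55) cited in the text.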

Table 11. Coefficients of variation of Harris and Goodenough IQs for total sample and
six subsamples

                                      Drawings   Drawings   Drawings of a woman    Drawings of a man
Item                         Total      of a       of a     -------------------   -------------------
                             group     woman       man      By boys    By girls   By boys    By girls

Harris standard score ----    0.16      0.15       0.15       0.10       0.16       0.17       0.13
Goodenough IQ ------------    0.19      0.18       0.19       0.14       0.18       0.21       0.18
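
As a minimal sketch of the statistic reported in table 11, the coefficient of variation is the standard deviation divided by the mean. The report does not say whether a sample or population standard deviation was used, so the sample form below is an assumption, and the example scores are hypothetical. The two coefficient lists are transcribed from the table in column order.

```python
from statistics import mean, stdev


def coefficient_of_variation(scores):
    """Sample standard deviation divided by the mean (assumed form)."""
    return stdev(scores) / mean(scores)


# Coefficients transcribed from table 11, in column order
HARRIS_SS = [0.16, 0.15, 0.15, 0.10, 0.16, 0.17, 0.13]
GOODENOUGH_IQ = [0.19, 0.18, 0.19, 0.14, 0.18, 0.21, 0.18]

# True when the Harris coefficient is smaller in every column
harris_always_lower = all(h < g for h, g in zip(HARRIS_SS, GOODENOUGH_IQ))

# Hypothetical scores, used only to show the calculation
example_scores = [85, 92, 100, 104, 119]
print(round(coefficient_of_variation(example_scores), 3))
print("Harris lower in every column:", harris_always_lower)
```

Because the coefficient is scale-free, it allows the spread of standard scores and of IQs to be compared even though their means differ slightly.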

standard deviations of the Goodenough-Harris and Goodenough scores were not shown in table 9, the relative variability of scores based on the two systems is indicated in table 11, which reports coefficients of variation (standard deviation / mean) for Goodenough-Harris standard scores and for Goodenough IQs for each of the subsamples. It is apparent that in every case variance is lower for the Harris scores.

Recommendation

On the basis of this analysis it is recommended that the following steps be adopted in relation to the Draw-A-Man Test in the Survey: (1) the Goodenough-Harris system should be used; (2) the entire sample should be scored centrally by uniform standards, with adequate training of scorers and quality control procedures routinely followed; and (3) if scorer variations cannot be eliminated by training, the procedure of averaging the results of two or more scorers should be adopted.

SUMMARY AND CONCLUSIONS

The foregoing review of the Draw-A-Man Test supports the view that it is a reliable and valid nonlanguage measure of mental maturity, although highly sensitive to cultural influences on the child's conceptual representation of the human figure. Its use in a national survey in the 6 to 12 age range, in conjunction with the WISC and WRAT, is logical and desirable, particularly as a means of assessing intellectual development in cases in which there is impairment of verbal development or verbal performance.

Personality assessment by means of thematic and qualitative assessment of children's drawings would probably be unrewarding. Some indications justifying further research have been noted; however, such research is not sufficiently promising to warrant the expenditure of Survey funds. On the other hand, several lines of empirical work appear worthwhile. These are enumerated below.

As discussed in the final portion of the review of the Draw-A-Man, there is strong evidence for the adoption of the Harris revision of the Draw-A-Man with central scoring by trained scorers, and

averaging of scores of two or more scorers, if scorer variations cannot be eliminated in training. This procedure need not be regarded as expensive, since it could leave the field psychologists free to test more children while the scoring is done centrally by lower paid workers.

Although research on personality-assessment uses of the drawings within the Survey program is not recommended, the following lines of empirical study and analysis are regarded as useful and even important:

1. A systematic study of cultural variations related to the principal geographic areas in which Survey data were collected to evaluate the effects of factors such as customs, attitudes, dress, art, and social roles in relation to the items in the point scales by which the Draw-A-Man is scored. Even if the results of such an analytic study should be negative, they would be very reassuring in relation to the use of the Draw-A-Man scores in the Survey.

2. Regression studies of Draw-A-Man scores with other psychometric variables in the Survey so that comparisons can be made on the basis of differences between regressed and actual scores rather than directly between raw scores.

3. Further restandardization of the Goodenough-Harris norms on a national sample would be a valuable contribution to psychological measurement of children that could only reflect credit on the Survey and would be of major importance for future use of this well-established and useful intelligence test. This significant undertaking, if approved, should include a complete item analysis as well as recomputation of norms.

Some additional suggestions regarding cross-disciplinary studies with reference to the Draw-A-Man Test are presented in a later section of this report.


IV. THE THEMATIC APPERCEPTION TEST

The technology of personality measurement lags far behind that of ability and achievement measurement. This lag makes it difficult for organizations (such as the Division of Health Examination Statistics) which seek to estimate population parameters on the basis of definitive test scores. At present there is not a single personality test for children that could be recommended without qualification. In view of the extensive use of personality tests in clinical practices and in school situations, this sweeping statement may appear extreme. It is, nevertheless, regrettably true. Perhaps clinical psychologists can justify their use of various personality measures on the basis of intensive individual case study in which test responses and scores are interpreted, by the clinician, in relation to consistent patterns of performance in the context of a total life record. The clinician usually feels free to accept or disregard information in this frame of reference, and he often employs informal, unstandardized tests as well as published procedures without regard for formal considerations of reliability and validity. Furthermore, since clinical judgments are confined to individual cases, they are not subject to verification by the rules of evidence observed in scientific studies. Educators often justify their personality testing as contributing to research, which is important, and the only tenable position in the light of the facts.

In contrast with the clinical and research uses of personality measures, where legitimacy is not primarily a function of the proven adequacy of the measurement instruments employed, surveys such as this one (HES) operate under severe constraints.

The survey scientist must defend the validity and reliability of his instruments as well as the adequacy of his sampling design for the purposes of his survey; both considerations affect the validity of population estimates from sample data.

The choice of a personality measurement instrument for Cycle II must be considered in the context of the preceding discussion. Although the California Personality Test and Cattell's Junior Personality Quiz are, in the opinion of the writer, the most adequately documented of the currently published and objectively scored personality tests for children, neither meets the reliability and validity standards necessary for Survey use and neither is appropriate for the entire age range of 6 through 11 years. Apart from these, no available tests even approach the requirements of this Survey.

In the psychometric sense, the Thematic Apperception Test (TAT) is not a test. It is a projective device consisting of a series of ambiguous (unstructured) pictures individually presented to the subject (or patient), who is asked to imagine and relate a story. The rationale of the procedure is that people will seek to create structure when a stimulus situation is unstructured and that in doing so they will draw on their own experience, needs, attitudes, and values to provide the details. This process is viewed as a projection of inner processes on the unstructured stimulus.

The TAT was developed by Henry A. Murray of Harvard University in 1938 (788). At the same time he presented a report which outlined a motivational system of organismic needs and environmental presses. This report was highly influential and stimulated much research. Five years later (in 1943), the TAT pictures and a manual for their use were published (799).

From the objective scoring standpoint, it is necessary to recognize that all projective methods share a major problem, since in all of them the testing strategy depends on the process by which subjects add structure to ambiguous stimuli. Although this structuring process does involve projection, in the sense defined above, it also simultaneously involves other factors. Indeed, the structuring process may be as much a function of external, situational factors, to which the subject is responding, as of internal factors. How these various factors combine is only imperfectly understood in the scientific study of perception; their combination has not, to the writer's knowledge, been investigated in relation to the TAT pictures. In spite of these facts, for the past 60 or more years users of projective techniques have continued to assume that responses to various stimuli represent projection only.

Cattell (796) has suggested that projective tests (which he thinks should be called misperception tests) should employ stimuli of a much lower order of complexity than those of the TAT and the Rorschach inkblots in order to simplify interpretation. Technically this may be an improvement, as Cattell has shown in the misperception tests which he designed for his objective test batteries. In these tests the subject's latitude of response to a specific ambiguity (e.g., estimating the number of communist party members in the United States or the value of a college degree) is extremely limited. A similar conclusion is also implicit in the modifications of the TAT pictures made by McClelland (798) in his studies of motivation measurement in fantasy.

In a complex projective technique such as the TAT, the story produced by a subject may represent his response to the entire picture or only to certain parts of the stimulus picture. In addition, the story itself necessarily requires technical interpretation by the examiner to the extent that it employs idiosyncratic language, symbols, and ideation. Because of the freedom and informality of the method, which is deliberate (in order to avoid prompting or the addition of extraneous variance contributed by the examiner), it is virtually impossible to relate responses to specific internal and external cues or patterns of cues.

The very looseness of the interpretative procedure, in contrast to fixed scoring keys in the case of questionnaires (usually answered "yes," "no," or "?"), led George Kelly (797), in an Annual Review article, to observe that while in the case of questionnaires the subject tries to guess what the examiner is thinking, in projective techniques the examiner must guess what the subject is thinking. In either case, there is a good deal of guessing going on.

The TAT has some similarity to the Draw-A-Man Test in that the Draw-A-Man provides an

unstructured stimulus (the instruction to draw a person) and permits wide latitude of response structuring on the part of the subject. It is noteworthy that the Draw-A-Man has produced no acceptable schemes for personality interpretation. However, as pointed out in the discussion of the Draw-A-Man, the most promising results in personality, as well as in cognitive assessment, have been those employing detailed, objective techniques of scoring, such as the point scales.

The selection of five cards of the TAT for the Survey undoubtedly reflects (1) the appraisal of existing personality tests mentioned above, combined with (2) the recognition of apparent widespread acceptance of the TAT as a projective technique and (3) the belief that an appropriate method of objective scoring of responses to them can be developed for the specific use of the Survey as well as for later more general use by professional workers. The basis for this appraisal cannot be documented here, although the writer is prepared to defend it. Reference to the forthcoming Sixth Mental Measurements Yearbook (O. Buros, ed., New Brunswick, N.J., The Gryphon Press) might be sufficient for this purpose. The evidence for the recognition of acceptance of the TAT is discussed below, together with an evaluation of the prospects for successful development of an objective scoring procedure.

REVIEW OF THE LITERATURE ON THE TAT

The present review includes abstracts of published research articles, theses, and critical reviews of the TAT literature, as well as 5 general references on the thematic apperception method. These constitute only a small portion of the extensive psychological, anthropological, and sociological research on the TAT and its variants which have appeared in undiminished quantity over the years (e.g., Thompson's Negro edition of the TAT, Symonds' Picture Story Test, Bellak's Children's Apperception Test (CAT), Van Lennep's Four Picture Test, Phillipson's Object Relations Technique, and numerous other techniques which can be traced to the Murray version). Both the TAT procedure and the Murray need-press concepts have been used extensively in personality studies and studies of motivation. The items selected for inclusion in this report were judged relevant if they (1) used a measurement approach, (2) were validation or normative studies, (3) had an applicable sample in terms of age, or (4) used an adequate scoring procedure.

Overview

Treatment of the TAT by different writers ranges from uncritical acceptance on the basis of a priori assumptions, illustrated by Henry (749) and Piotrowski (702), through qualified acceptance with a soft attitude toward the contradictory evidence, as demonstrated by Mayman (701) and Lindzey (703), to objective evaluation, illustrated by Eron (706), Windle (704), and others. Windle's comment, that there is little agreement among results reported by different investigators, seems to describe accurately this field of research. One area in which some agreement may be found, however, is that of cognitive evaluation (714 and 737-739); this is highly reminiscent of the Draw-A-Man.

The TAT literature abounds in elaborate but largely untested (critically, that is) scoring systems. Most of these are too extensive for brief summarization and go beyond the purposes of this report. However, they have been reviewed in anticipation of a further empirical study of the Survey's Thematic Apperception Test data, and references to 21 additional selected reports are included in the bibliography of section IV. Most of these, as well as a number of other suggested analytic methods of scoring the TAT, are well summarized in a 1951 publication by Edwin S. Shneidman, Walther Joel, and Kenneth B. Little (800). Although the modes of analysis vary in detail and in terminology, the typical one involves interpretation and frequency counting or evaluation on a rating scale of all or part of the following types of information, usually across all of the stories obtained for a selection of cards. (The full series of cards is often abridged because of practical time limitations, as it is in the Survey.)

Formal (structural) aspects of the stories
  Compliance with instructions (including card rejection)

55
  Consistency of stories
  Length of stories; vocabulary level
  Grammatical forms (nouns, pronouns, verbs, incomplete sentences)
  Number and type of situations described
  Number and type of characters included
  Outcome of stories
  Level of response (from description to imaginative interpretation)

Interpretive categories
  Feelings, moods, worries, emotional tone
  Needs expressed (or implied)
  Conflict areas
  Presses: physical, emotional, mental, economic, social, religious
  Characters: strivings, attitudes, obstacles, barriers, traits, and roles of hero, major characters, and minor characters
  Outcomes reflecting success, failure
  Thematic content: family dynamics, inner adjustment, sexual adjustment, interpersonal relations, aggression (physical, nonphysical)
  Developmental level in Freudian (psychosexual) context
  Defense mechanisms utilized
  Manner in which environment is assimilated

The number of variables enumerated under these categories is extensive (Murray's need-press system alone exceeds 83), and in most cases the variables require detailed, careful definition and intensive training of scorers. High reliabilities have often been achieved among scorers within a particular laboratory for a given period of tenure of the staff members involved, but these have not generally been maintained with staff changes or when systems have been tried out at other institutions. Often, definitions change over time as new generations of protocols appear, requiring decisions in relation to categories developed on the basis of earlier samples.

In spite of the logical (from some theoretical positions) appeal of these analytic approaches, they do not fit the requirements of psychometric procedures. Such analytic approaches satisfy the needs of various clinicians or investigators in their individual practices and researches, but for survey purposes they are useful primarily because they suggest areas which may be suitable for objective study. With the exception of some formal characteristics (such as length of story and other items that can be counted fairly accurately) which have been related to developmental rather than personality-adjustment concepts, there is so little agreement in the literature on most scoring categories that an investigator seeking to develop an objective scoring procedure might as well start from scratch.

Research Demonstrating Developmental Factors

Edelstein (737) completed an interesting pilot study demonstrating a system for scoring TAT stories. From her system a total age-adjusted score, correlating well with Stanford-Binet IQs, could be derived. She used the following six scoring categories: number of words, qualifier/word ratio, number of conditions, number of responses, number of situations involved, and number of characters. Her sample included only 15 boys and 13 girls (ages 9-5 to 12-5), but from a methodological viewpoint her study is promising.

In a conceptually related study, Armstrong (714) administered the CAT (cards 1, 2, 4, 8, and 10) to a sample of 60 children in grades 1 to 3 in the University of Minnesota elementary school. The findings of her study relevant to the present review are as follows: (1) length of story increases with grade, (2) girls' protocols are longer than those of boys, (3) the use of first person pronouns shows a slight but consistent decline with grade progression, (4) girls tend to make more subjective and personalized statements than boys, and (5) girls have a consistently longer reaction time than boys.

Slack (761) gave the TAT to 15 exogenous feebleminded boys and 12 endogenous ones at the Vineland Training School. He correlated a score reflecting the number of causally and purposefully connected statements with the Stanford-Binet
and with Thurstone's test of Primary Mental Abilities (PMA). With chronological age held constant, causally or purposefully connected statements correlated with other variables as follows: S-B MA, 0.58; PMA MA, 0.70; PMA Verbal MA, 0.51; PMA Motor MA, 0.72. Length of stories (number of words) correlated as follows with the same variables (CA held constant): S-B MA, 0.31 (ns); PMA MA, 0.34 (ns); PMA Verbal MA, 0.53; PMA Motor MA, 0.48. The age-corrected correlation of number of purposeful relations with the PMA Verbal MA was 0.90, and the correlation of number of causal relations with the same measures was 0.42. Slack also reported a significant difference between the endogenous and exogenous groups on length of stories.

These studies lend some limited support to the possibility of developing an objective scoring system based on developmental criteria for the five TAT pictures used in the Survey.

Other Relevant Research

The following studies were selected for citation on the basis of their relevance to the Survey problems. Lesser (720) demonstrated how a Guttman-type scale could be developed for measurement of aggressive fantasy. Bijou and Kenny (732) and Murstein (734) investigated ambiguity values of TAT cards. The former found the following ambiguity ranks (out of 21) for the four picture cards used in the Survey (card 16, blank, was not rated):

Card number    Rank
1              2
2              3
5              17
8BM            11

The latter reported that cards with medium ambiguity (8BM) were most productive of thematic content among college students.

Milam (735) demonstrated the sensitivity of TAT responses to examiner influence. Apparently, the attitudes and behavior of the examiner, as perceived by the subject, account for variance in the TAT responses. This is true of all psychological tests. It is not possible to say whether this is a greater problem on the TAT than on the WISC, for example, but it must be kept in mind as a significant source of uncontrolled variation.

Gurevitz and Klapper (763) found that schizophrenic children characteristically respond to CAT cards with bizarre outcomes, evaluation of stimuli, use of titles, hostility, and verbosity. Holden (766) compared a small sample of cerebral palsied children with normal controls. His results clearly suggest that cerebral palsied respondents tend to describe the cards, while normal controls give more thematic content. The average number of descriptions (out of 10 cards) was 6.0 for the palsied children and 2.8 for the controls. Leitch and Schafer (770) reported a number of response criteria identifying psychotic responses.

From the standpoint of further research on the development of a scoring procedure for the TAT, the following list of specific items has been recorded and evaluated in one or more of the studies reviewed (reference numbers shown in parentheses). In most cases the results were not included in the main discussion either because of sample limitations, subjective methods of scoring, inconclusiveness of results, or unrelatedness to the present problem. Many of them, however, do appear definable and worthy of further study.

Frequency and duration
RT latency (705 and 747)
Total reaction time (705 and 747)
Number of words (707, 714, 737, 741, 746, 747, and 764)
Number of adjectives (737)
Number of adverbs (737)
Number of nouns (714)
Number of pronouns (714)
Number of verbs (714)
Number of questions (705)
Number of ego words (714)
Number of situations (737)
Number of characters (707 and 737)
Male, female
Nature of action
Crying (718)
Dancing (737)
Disaster (713)
Drunkenness (737)
Escape solutions (705 and 718)
Fear of punishment (742)
Fighting (720)
Hardship (713)
Illness (713)
Loss of ability, skill, money (737)
Suicide (705)
Frightening (737)
Killing (720)
Ridiculing (720)
Making fun of (737)
Punishment (705 and 743)
Stealing (737)
Receiving aid (705)
Giving aid (705)
Teaching (737)
Laughing (737)
Singing (737)
Book or movie cited as source (705)
Criticism of picture (705)
Liked, disliked (705)
Title (763)
Number of themes (707, 712, and 764)
Card description
Parts referred to (705)
Number of rare picture details (705)
Compliance with instructions (705, 707, and 721)
Examiner included in story (770)
Response
Bizarre (705 and 763)
Queer (770)
Contradictory (770)
Incoherent (705 and 770)
Transcendental (707 and 714)
Number of references
Future events (705 and 721)
Past events (705 and 721)
Present events (705 and 721)
Level (712, 721, 755, 766, and 776)
Enumerative
Descriptive
Interpretive
Language
Neologisms (770)
Stereotyped (705)
Vocabulary level (705)
Unusual wording (770)
Fluency (705)
Repetitions (770)
Foreign expressions
Relative age of characters (705)
Older
Peer
Younger
Sex role identification (705)
Own
Opposite
Ambiguous
Tone of story (712)
Emotional
Submission to fate
Rebellion
Fear
Worry
Lack of affect
Aspiration
Shift of tone
Theme of story
Unrelated (770)
Curiosity (738)
Scorning (720)
Social approval (713)
Positive
Negative
Evasive
Stressful (725)
Ordinary family activity (712)
Mental inadequacy (713)
Motivational inadequacy (713)
Physical inadequacy (713)
Perceptual distortions (705, 712, and 770)
Neatness or orderliness of story (705)
Overspecific statements (770)
Overgeneralizations (770)
Autistic logic (770)
Feelings
Anger toward parent(s) (743)
Aesthetic (705)
Ambivalent (705)
Benign (705)
Conflict (705)
Empathy (723)
Frustration (705 and 713)
Guilt (705 and 713)
Happiness (747)
Hate (720)
Independence (713)
Inferiority (705)
Paranoid (705)
Parental anger to child (743)
Pleasant (705)
Pleasure (713)
Sadistic (705)
Security (713)
Number of causal relations (761)
Number of purposeful relations (761)
Outcomes (713, 763, 772, and 775)
Failure
Success
Aggressive (772)
Clarity of statement (705)
Bizarre (763)
Self-reference (705)
Number of personalized statements (705 and 714)
Degree of response certainty (705)
Level of interpretation (Eron, 712)
Symbolic
Abstract
Descriptive
Unreal
Fairy tale
Central character not in picture
Autobiographical
Continuations
Alternate themes
Comments
Denial of theme
Rejection
Peculiar
Confused
Includes examiner in story
No connection between story and picture
Humorous

PROSPECTS FOR DEVELOPING AN OBJECTIVE SCORING KEY FOR THE SURVEY'S TAT

Although the TAT literature is scientifically sloppy in comparison with the material reviewed in relation to the WISC and the Draw-A-Man Test, the following assumptions seemed warranted: (1) a substantial number of items (both formal-structural and thematic-interpretive) can be reliably defined and accurately scored, (2) discriminating developmental criteria can be devised, and (3) an objectively defined scoring system can be developed which will contribute useful information regarding development between ages 6 and 12 years.

It seems unlikely, in light of the literature reviewed, that scoring scales can be constructed which will measure factors such as motivation, affective states, and personality traits. However, this is not serious since there is no indication that these factors have any developmental implications.

The anticipated developmental scales would greatly enrich the information obtained in the Survey by possibly providing developmental norms with regard to behavioral aspects not encompassed by the other tests, such as verbal expression, thematic content of imagination in standard test situations, associations to standard stimuli, role concepts and attitudes in relation to self, peers of same and opposite sex, parental and adult figures, and common cultural values.

While the picture samples are limited, they appear to be well chosen for the purpose. Card 1 has a boy as the central figure; card 2, a girl; card 5, an adult-parental (mother) figure; and card 8BM, a possible stressful situation, involving a father figure, within the experience background of most school-age children. Card 16, the blank card, is completely unstructured. As a set of cards having nearly universal applicability in a United States national sample, the selection appears excellent.

One of the advantages that an investigator working on this problem would have over most of those who have published reports in this area is the large sample obtained under standardized survey conditions. With adequate funds to work with a fairly large sample of perhaps 1,000 or more cases, a good test of these conclusions could be made. Of course, there is no guarantee that the results will be entirely satisfactory, although the prognosis appears good.

However, the Survey is committed to doing something with these data, and no suitable scoring procedure is presently available. In the writer's judgment, the options available were nearly all unsatisfactory, and the one taken may prove to be a wise decision.

BIBLIOGRAPHY

General References to TAT

701. Mayman, M.: Review of the literature on the Thematic Apperception Test, in David Rapaport, Diagnostic Psychological Testing. Vol. II, The Theory, Statistical Evaluation, and Diagnostic Application of a Battery of Tests. Chicago. Year Book Publishers, 1946. pp. 496-506.
702. Piotrowski, Z. A.: A new evaluation of the Thematic Apperception Test. Psychoanalyt. Rev. 37:101-127, 1950.
703. Lindzey, G.: Thematic Apperception Test, interpretive assumptions and related empirical evidence. Psychol. Bull. 49:1-25, 1952.
704. Windle, C.: Psychological tests in psychopathological prognosis. Psychol. Bull. 49:451-482, 1952.
705. Hartman, A. A.: An experimental examination of the Thematic Apperception Technique in clinical diagnosis. Psychological Monographs. Vol. 63, No. 8 (Whole No. 303). Washington, D.C. American Psychological Association, Inc., 1950.
706. Eron, L. D.: Some problems in the research application of the Thematic Apperception Test. J. Project. Tech. 19:125-129, 1955.
707. Lindzey, G., and Silverman, M.: Thematic Apperception Test, techniques of group administration, sex differences, and the role of verbal productivity. J. Personality 27:311-323, 1959.
708. Sanford, R. N., and others: Physique, personality and scholarship; a cooperative study of school children, in Society for Research in Child Development, Monograph, Vol. 8, No. 2. Washington, D.C. National Research Council, 1943.

TAT: Normative Data

709. Cox, B. F., and Sargent, H. D.: The common responses of normal children to ten pictures of the Thematic Apperception Test series; abstracted, Am. Psychologist 3:363, 1948.
710. Bell, J. E.: A comparison of children's fantasies in two equated projective techniques; abstracted, Am. Psychologist 3:263, 1948.
711. Whitehouse, E.: Norms for certain aspects of the Thematic Apperception Test on a group of nine and ten year old children; abstracted, Persona 1:12-15, 1949.
712. Eron, L. D.: A normative study of the Thematic Apperception Test. Psychological Monographs. Vol. 64, No. 9. Washington, D.C. American Psychological Association, Inc., 1950.
713. Cox, B., and Sargent, H. D.: TAT responses of emotionally disturbed and emotionally stable children, clinical judgment versus normative data. J. Project. Tech. 14:61-74, 1950.
714. Armstrong, M. A. S.: Children's responses to animal and human figures in thematic pictures. J. Consult. Psychol. 18:67-70, 1954.
715. Fisher, G. M., and Shotwell, A. M.: Preference rankings of the TAT cards by adolescent normals, delinquents, and mental retardates. J. Project. Tech. 25:41-43, 1961.
716. Brayer, R., Craig, G., and Teichner, W.: Scaling difficulty values of TAT cards. J. Project. Tech. 25:272-276, 1961.

TAT: Scoring Schemes

717. Eron, L. D., Terry, D., and Callahan, R.: The use of rating scales for emotional tone of TAT stories. J. Consult. Psychol. 14:473-478, 1950.
718. Fine, R.: A scoring scheme for the TAT and other verbal projective techniques. J. Project. Tech. 19:306-309, 1955.
719. Friedman, I.: Objectifying the subjective, a methodological approach to the TAT. J. Project. Tech. 21:243-247, 1957.
720. Lesser, G. S.: Application of Guttman's scaling method to aggressive fantasy in children. Educ. Psychol. Measur. 18:543-551, 1958.
721. Dana, R. H.: Proposal for objective scoring of the TAT. Percept. Mot. Skills 9:27-43, 1959.

TAT: Stability, Reliability

722. Porter, F. S.: A Study of Certain Aspects of the Reliability and Validity of the Thematic Apperception Test. Unpublished master's thesis, Iowa State University, 1944.
723. Harrison, R., and Rotter, J. B.: A note on the reliability of the Thematic Apperception Test. J. Abnorm. & Social Psychol. 40:97-99, 1945.
724. Jeffre, M. F. D.: A Critical Study of the Thematic Apperception Test Performance of Normal Children. Unpublished master's thesis, University of Iowa, 1945.
725. Mayman, M., and Kutner, B.: Reliability in analyzing Thematic Apperception Test stories. J. Abnorm. & Social Psychol. 42:365-368, 1947.
726. Kagan, J.: The stability of TAT fantasy and stimulus ambiguity. J. Consult. Psychol. 23:266-271, 1959.

TAT: Validity Studies

727. Calvin, J. S., and Ward, L. C.: An attempted experimental validation of the Thematic Apperception Test. J. Clin. Psychol. 6:377-381, 1950.
728. Saxe, C. H.: A quantitative comparison of psychodiagnostic formulations from the TAT and therapeutic contacts. J. Consult. Psychol. 14:116-127, 1950.
729. Davenport, B. F.: The semantic validity of TAT interpretations. J. Consult. Psychol. 16:171-175, 1952.
730. Bendig, A. W.: Predictive and postdictive validity of need achievement measures. J. Ed. Res. 52:119-120, 1958.
731. Henry, W. E., and Farley, J.: The validity of the Thematic Apperception Test in the study of adolescent personality. Psychological Monographs. Vol. 73, No. 17 (Whole No. 487). Washington, D.C. American Psychological Association, Inc., 1959.

TAT: Ambiguity Values of Cards

732. Bijou, S. W., and Kenny, D. T.: The ambiguity values of TAT cards. J. Consult. Psychol. 15:203-209, 1951.
733. Davenport, B. F.: The Ambiguity, Universality, and Reliable Discrimination of TAT Interpretations. Unpublished doctoral dissertation, University of Southern California, 1951.
734. Murstein, B. I.: The relationship of stimulus ambiguity on the TAT to the productivity of themes. J. Consult. Psychol. 22:348, 1958.

TAT: Examiner Influence, Interpreter Influence

735. Milam, J. R.: Examiner influences on Thematic Apperception Test stories. J. Project. Tech. 18:221-226, 1954.
736. Young, R. D., Jr.: The Effect of the Interpreter's Personality on the Interpretation of Thematic Apperception Tests Protocols. Unpublished doctoral dissertation, University of Texas, 1953.

TAT: Effects of Intelligence, Achievement

737. Edelstein, R. T.: The Evaluation of Intelligence From TAT Protocols. Unpublished master's thesis, College of the City of New York, 1956.
738. Kagan, J., Sontag, L. W., Baker, D. T., and Nelson, V. L.: Personality and IQ change. J. Abnorm. & Social Psychol. 56:261-266, 1958.
739. Murstein, B. I., and Collier, H. L.: The role of the TAT in the measurement of achievement as a function of expectancy. J. Project. Tech. 26:96-101, 1962.

TAT: Personality Variables

740. McDowell, J. V.: Developmental Aspects of Phantasy Production on the Thematic Apperception Test. Unpublished doctoral dissertation, Oklahoma State University, 1952.
741. Cook, R. A.: Identification and ego defensiveness in thematic apperception. J. Project. Tech. 17:312-319, 1953.
742. Mussen, P. H., and Naylor, H. K.: Relationships between overt and fantasy aggression. J. Abnorm. & Social Psychol. 49:235-240, 1954.
743. Kagan, J.: Socialization of aggression and the perception of parents in fantasy. Child Development 29:311-320, 1958.
744. Fitzgerald, B. J.: The Relationship of Two Projective Measures to a Sociometric Measure of Dependent Behavior. Unpublished doctoral dissertation, Ohio State University, 1959.
745. Breger, L.: Conformity and the Expression of Hostility. Unpublished doctoral dissertation, Ohio State University, 1961.

TAT: Effects of Set, Recent Experience, Stimulus Variables

746. Lubin, B.: Some effects of set and stimulus property on TAT stories. J. Project. Tech. 24:11-16, 1960.
747. Newbigging, P. L.: Influence of a stimulus variable on stories told to certain TAT pictures. Can. J. Psychol. 9:195-206, 1955.
748. Coleman, W.: The Thematic Apperception Test. I, Effect of recent experience. II, Some quantitative observations. J. Clin. Psychol. 3:257-264, 1947.

TAT: Environmental Variations; Culture, Social Class, Race, Ethnic Group, Home Conditions, Sex Role, Sociometric Status, Social Acceptance

749. Henry, W. E.: The Thematic Apperception Technique in the study of culture-personality relations. Genet. Psychol. Monogr. 35:3-135, 1947.
750. Mason, B., and Ammons, R. B.: Note on social class and the Thematic Apperception Test. Percept. Mot. Skills 6:88, 1956.
751. Fisher, S., and Fisher, R. L.: A projective test analysis of ethnic subculture themes in families. J. Project. Tech. 24:366-369, 1960.
752. Mitchell, H. E.: Social class and race as factors affecting the role of the family in Thematic Apperception Test stories; abstracted, Am. Psychologist 5:299-300, 1950.
753. Mussen, P. H.: Differences between the TAT responses of Negro and white boys. J. Consult. Psychol. 17:373-376, 1953.
754. Mussen, P. H.: Some personality and social factors related to changes in children's attitudes toward Negroes. J. Abnorm. & Social Psychol. 45:423-441, 1950.
755. Shields, D. L.: An Investigation of the Influences of Disparate Home Conditions Upon the Level at Which Children Responded to the Thematic Apperception Test. Unpublished master's thesis, University of Pittsburgh, 1950.
756. McArthur, C.: Personality differences between middle and upper classes. J. Abnorm. & Social Psychol. 50:247-254, 1955.
757. Cox, F. N.: Sociometric status and individual adjustment before and after play therapy. J. Abnorm. & Social Psychol. 48:354-356, 1953.
758. Herman, G. N.: A Comparison of the TAT Stories of Preadolescent School Children Differing in Social Acceptance. Unpublished master's thesis, University of Toronto, 1952.
759. Milner, E.: Effects of sex role and social status on the early adolescent personality. Genet. Psychol. Monogr. 40:231-325, 1949.
760. Butler, O. P.: Parent Figures in Thematic Apperception Test Records of Children in Disparate Family Situations. Unpublished doctoral dissertation, University of Pittsburgh, 1948.

TAT: With Feebleminded, Retarded, Handicapped, Brain Injured, Palsied, Disturbed, and Psychotic Children

761. Slack, C. W.: Some intellective functions in the Thematic Apperception Test and their use in differentiating endogenous feeblemindedness from exogenous feeblemindedness. Train. Sch. Bull. 47:156-169, 1950.
762. Tolman, N. G., and Johnson, A. P.: Need for achievement as related to brain injury in mentally retarded children. Am. J. Ment. Deficiency 62:692-697, 1958.
763. Gurevitz, S., and Klapper, Z. S.: Techniques for and evaluation of the responses of schizophrenic and cerebral palsied children to the Children's Apperception Test (C.A.T.). Quart. J. Child Behavior 3:38-65, 1951.
764. Abel, T. M.: Responses of Negro and white morons to the Thematic Apperception Test. Am. J. Ment. Deficiency 49:463-468, 1945.
765. Beier, E. G., Gorlow, L., and Stacey, C. L.: The fantasy life of the mental defective. Am. J. Ment. Deficiency 55:582-589, 1951.
766. Holden, R. H.: The Children's Apperception Test with cerebral palsied and normal children. Child Development 27:3-8, 1956.
767. Hood, P. N., Shank, K. H., and Williamson, D.: Environmental factors in relation to the speech of cerebral palsied children. J. Speech & Hearing Disorders 13:325-331, 1948.
768. Bergman, M., and Fisher, L. A.: The value of the Thematic Apperception Test in mental deficiency. Psychiat. Quart. Suppl. 27:22-42, 1953.
769. Ericson, M.: A study of the Thematic Apperception Test as applied to a group of disturbed children; abstracted, Am. Psychologist 2:271, 1947.
770. Leitch, M., and Schafer, S.: A study of the Thematic Apperception Tests of psychotic children. Am. J. Orthopsychiat. 17:337-342, 1947.
771. Shank, K. H.: An Analysis of the Degree of Relationship Between the Thematic Apperception Test and an Original Projective Test in Measuring Symptoms of Personality Dynamics of Speech Handicapped Children. Unpublished doctoral dissertation, University of Denver, 1954.
772. Christensen, A. H.: A Quantitative Study of Personality Dynamics in Stuttering and Non-Stuttering Siblings. Unpublished master's thesis, University of Southern California, 1951.
773. Young, F. M.: Responses of juvenile delinquents to the Thematic Apperception Test. J. Genet. Psychol. 88:251-259, 1956.

TAT: With CAT and Michigan Picture Test

774. Symonds, P. M.: Adolescent Fantasy, an Investigation of the Picture-Story Method of Personality Study. New York. Columbia University Press, 1949.
775. Light, B. H.: Comparative study of a series of TAT and CAT cards. J. Clin. Psychol. 10:179-181, 1954.
776. Andrew, G., Walton, R. E., Hartwell, S. W., and Hutt, M. L.: The Michigan Picture Test, the stimulus value of the cards. J. Consult. Psychol. 15:51-54, 1951.

Special Bibliography of TAT Scoring Systems¹

777. Andrew, G., Hartwell, S. W., Hutt, M. L., and Walton, R. E.: The Michigan Picture Test. Chicago. Science Research Associates, Inc., 1953.
778. Arnold, M. B.: A demonstration analysis of the Thematic Apperception Test in a clinical setting. J. Abnorm. & Social Psychol. 44:97-111, 1949.
779. Aron, B.: A Manual for Analysis of the Thematic Apperception Test. Berkeley, Calif. Willis E. Berg, 1949.
780. Bellak, L.: A Guide to the Interpretation of the Thematic Apperception Test. New York. The Psychological Corporation, 1947.
781. Cox, B., and Sargent, H.: TAT responses of emotionally disturbed and emotionally stable children. J. Project. Tech. 14:61-74, 1950.
782. Dana, R. H.: Norms for three aspects of TAT behavior. J. Genet. Psychol. 57:83-89, 1957.
783. Fine, R.: Manual for Scoring Scheme for Verbal Projective Techniques (TAT, MAPS, Stories, and the Like). Washington, D.C. Veterans Administration, 1948.
784. Fry, F. D.: Manual for scoring the TAT. J. Psychol. 35:181-195, 1953.
785. Hartman, A. A.: An experimental examination of the Thematic Apperception Technique in clinical diagnosis. Psychological Monographs. Vol. 63, No. 8 (Whole No. 303). Washington, D.C. American Psychological Association, Inc., 1950. pp. 1-48.
786. Henry, W. E.: The Analysis of Fantasy. New York. John Wiley and Sons, Inc., 1956.
787. Klebanoff, S.: Personality factors in symptomatic chronic alcoholism as indicated by the Thematic Apperception Test. J. Consult. Psychol. 11:111-119, 1947.
788. Murray, H. A.: Explorations in Personality. New York. Oxford University Press, 1938.
789. Rappaport, D.: The Thematic Apperception Test, Ch. IV, in Diagnostic Psychological Testing, Vol. II. Chicago. Yearbook Publishers, Inc., 1946.
790. Shorr, J. E.: A proposed system for scoring the TAT. J. Clin. Psychol. 4:189-195, 1948.
791. Stone, H.: The TAT Aggressive Content Scale. J. Project. Tech. 20:445-452, 1956.
792. Terry, D.: The use of a rating scale of level of response in TAT stories. J. Abnorm. & Social Psychol. 47:507-511, 1952.
793. Tomkins, S. S., and Tomkins, E. S.: The Thematic Apperception Test, the Theory and Technique of Interpretation. New York. Grune and Stratton, 1948.
794. White, R. K.: Value Analysis, the Nature and Use of the Method. New York. Society for the Psychological Study of Social Issues, 1951.
795. Wyatt, F.: The scoring and analysis of the Thematic Apperception Test. J. Psychol. 24:319-330, 1947.

Other References Cited in Text

796. Cattell, R. B.: Personality and Motivation Structure and Measurement. New York. Harcourt, Brace and World, 1959.
797. Kelly, G. A.: The theory and technique of assessment, in P. R. Farnsworth and Q. McNemar, eds., Annual Review of Psychology, Vol. 9. Palo Alto, Calif. Annual Reviews, Inc., 1958.
798. McClelland, D.: Studies in Motivation. New York. Appleton-Century-Crofts, Inc., 1955.
799. Murray, H. A.: Thematic Apperception Test, Pictures and Manual. Cambridge. Harvard University Press, 1943.
800. Shneidman, E. S., Joel, W., and Little, K. B.: Thematic Test Analysis. New York. Grune and Stratton, 1951.

¹See also 717 to 721.


V. TOTAL PSYCHOLOGICAL TEST BATTERY

The foregoing reviews of the several com- norms (which is recommended) without analysis
ponents of the Surveys psychological test battery of the raw score distributions on the national
have discussed the strengths and weaknesses of sample might lead to some errors. The adminis-
each test and the problems involved in estimating tration of the Draw-A-Man Test in the Survey
population parameters on a national scale from was different from that recommended by Harris,
the sample data. In each case a number of specific and it would be prudent to proceed empirically
problems were raised, and suggestions for treat- rather than to assume that the Survey drawings
ment of data or for further research have been are equivalent. In addition, Harris own norms do
made in the respective sections of the report. not reflect as good a national sample as even the
However, the most important common problem WISC, for which further standardization is un-
derives from the examination of the standardization basis of these tests.
The norms for the WISC are unquestionably the most satisfactory, with the
Draw-A-Man being second; the adequacy of the Wide Range Achievement Test
norms has been questioned (see section II). Finally, new norms, related to
the scoring system to be developed for the TAT, are yet to be constructed.

In order to achieve the soundest possible basis for population estimates
with this battery, it is recommended that new national norms, based on the
total Survey sample, be developed for all of the tests before any final
population estimates are published. While some preliminary estimates may be
warranted, using norms provided by the test publishers, the discussions in
the individual sections of the report point up the necessity of the
recommended restandardization.

In the event that this work cannot be fully supported, the order of priority
indicated by the review would place the reanalysis of the WRAT first, the
Draw-A-Man Test second, and the WISC third. It is assumed that this must be
done for the TAT when a new scoring procedure is completed and adopted.

The issues in relation to the WRAT are as follows: (1) No adequate sampling
plan was followed in standardizing the 1963 revision, and, in fact, the bias
of the sample is clearly mentioned in the manual. (2) The test scores used to
compile the sample by levels are not equivalent; therefore, only limited
confidence can be placed in the resulting norm levels, even though
substantial correlation of the WRAT scales with concurrent criteria appears
likely.

In the case of the Draw-A-Man Test, it is recognized that (1) the Goodenough
norms are outmoded, and that (2) the use of the Harris [...] questionably
justified.

One of the major problems with the WISC subtests is that of examining further
the optional basis for estimating Full Scale IQs from the Vocabulary and
Block Design scores. Even if restandardization should reveal no need for
rescaling the subtest items, the adoption of published conversion tables or
direct proration is considered unjustified without further research. This is
discussed in more detail in section I.

The information expected from the test battery may be summarized as follows:

1. WISC Vocabulary score. This test individually provides a good estimate of
   g, the common general intelligence factor in the WISC, and may be accepted
   as a good measure of the verbal component of the general measure of
   intelligence.
2. WISC Block Design score. This test is also well saturated in g and second
   only to Vocabulary in reliability. It should be accepted as a strong
   nonverbal intelligence test and as an estimate of the nonverbal component
   of the full test.
3. Draw-A-Man Test Goodenough-Harris standard score. The Goodenough-Harris
   standard score (preferably restandardized on the total Survey sample) can
   be interpreted as a deviation IQ, in a manner comparable to the WISC IQs.
   This score is a reliable and reasonably valid nonlanguage measure of
   mental maturity.
4. WRAT Oral Reading grade equivalent (Rq).
5. WRAT Oral Reading standard score (Rss).
6. WRAT Arithmetic grade equivalent (Aq).
7. WRAT Arithmetic standard score (Ass).

   Both the grade equivalents and the standard scores will be useful for the
   WRAT Reading and Arithmetic subtests (particularly if they are
   restandardized on the total Survey sample). The grade equivalents will
   permit assessment of school retardation, while the standard scores, which
   have the same characteristics as deviation IQs, will be more appropriate
   in pattern analytic combination with the WISC and Draw-A-Man scores.
8. TAT developmental score(s). This may actually be a series of scores. It is
   entered symbolically at this time.

It is possible to think of these data as providing individual profiles or
patterns which supplement information represented by the individual scores.
For example, some children may rank high or low on all scales, indicating
general excellence or retardation in comparison with the general population.
There may also be discriminable test patterns associated with such special
conditions as reading disability, mental deficiency, scholastic retardation,
verbal impairment due to physical or social reasons, behavior disorders, and
cultural deprivation. If such patterns exist, it should be possible to
identify them by a standard research design based on discrimination of
experimentally formed criterion groups. A hierarchical grouping analysis of
score profiles, seeking to identify characteristic profiles of groups, would
be an alternative approach. In this procedure, identification of criterion
characteristics of the groups would follow rather than precede the main
analysis. In either case, criterion data would be obtained from record
sources within the Health Examination Survey. In this type of analysis it
might also be profitable to explore patterns based on scores representing
discrete residuals, with common variance partialled out and represented by an
additional variable.

Computer programs for these types of analysis are available, and such studies
could be conducted economically on subsamples of the Survey sample.

The inclusion of these psychological tests in the National Health Survey was
a very important step which has tremendous practical value to the health,
education, and welfare fields and which also has immense scientific value in
the life sciences concerned with child development. Despite the technical
criticisms, which are inevitable in a problem of the magnitude of this
national survey, the tests have been judged to be either a good choice or at
least an eminently reasonable compromise with reality within the constraints
of the Survey.

The research recommended should be looked on as an unprecedented opportunity
to contribute toward adequate mental measurement of children. It is important
for those working in this Survey to bear in mind that this is the first
general survey of psychological functions of children ever conducted on a
sophisticated national sample. The standardization programs for the tests
reviewed and for others referred to fail to qualify for this distinction.
National psychological surveys of adults have been made in both World Wars,
and recently a national survey of adolescents was conducted by Project
TALENT. However, Cycle II is, to the writer's knowledge, the first one of its
kind in the age range of 6 to 12 years.
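
The interpretation of standard scores as deviation IQs, and the
recommendation that norms be rebuilt on the total Survey sample, can be
illustrated with a short sketch. The Python example below is not part of the
original report. It assumes the conventional deviation-IQ scale (mean 100,
standard deviation 15), treats each whole year of age as a separate norm
group, and ignores sampling weights, which actual Survey estimates would
require. The function name deviation_scores and the sample records are
hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def deviation_scores(records, target_mean=100.0, target_sd=15.0):
    """Convert raw scores to deviation-style standard scores within age groups.

    records: list of (age_in_years, raw_score) pairs (hypothetical layout).
    Returns the standard scores in the same order as the input.
    """
    # Collect raw scores by age so each age group gets its own norms.
    by_age = defaultdict(list)
    for age, raw in records:
        by_age[age].append(raw)

    # Norms are the mean and (population) standard deviation of each group.
    norms = {age: (mean(raws), pstdev(raws)) for age, raws in by_age.items()}

    scores = []
    for age, raw in records:
        group_mean, group_sd = norms[age]
        if group_sd == 0:
            # No spread in the group: every child sits at the group mean.
            scores.append(target_mean)
        else:
            z = (raw - group_mean) / group_sd
            scores.append(target_mean + target_sd * z)
    return scores


# Invented raw scores for two age groups, for illustration only.
sample = [(6, 10), (6, 20), (6, 30), (7, 40), (7, 50), (7, 60)]
print([round(s, 1) for s in deviation_scores(sample)])
```

Because each child is compared only with children of the same age, a score
at the mean of its age group maps to 100 whatever the raw score, which is
the property that makes standard scores from different tests comparable in
profile analyses.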

VI. CROSS-DISCIPLINARY ANALYSES

The complete data of Cycle II may be regarded as composing a matrix of
several thousand variables (specific measures or components of measurement
procedures) over a sample of nearly 8,000 children. In the processes of data
reduction and analysis, many of these variables will remain in the matrix
without further manipulation (e.g., height, weight, body temperature, family
income level, twin status, number of siblings, and ages of parents). Some
will require prescheduled analysis and computation of indexes according to
established procedures in the respective fields (e.g., visual acuity,
exercise tolerance, and electrocardiogram), while others will require
extensive processing on the basis of empirically constructed or revised
scoring keys and norms,

as in the case of the psychological tests discussed in this review.

Upon completion of segmental analysis of each testing and examining procedure
and reduction of all data to indexes and primary variables, it would be
desirable to consider multivariate analysis of the resulting matrix. This
type of approach will undoubtedly reveal many significant interrelationships
not previously investigated because of lack of appropriate data. It is
premature to consider it now, however, before the reduced data schedule is
more definitely known.

The primary purpose of the present discussion is to explore possible linkages
between the psychological tests in the Survey battery and other variables.
This, too, is a formidable task, but some important areas of investigation
are opened up by this Survey, and these opportunities for significant
research deserve special mention.

DATA AVAILABLE

From various sources within the Survey, data on items such as the following,
which have important behavioral implications, will be available:

Parents - age, nativity, education, income level, language spoken,
   psychiatric history, marital status, handedness, and use of medical care.
   (The distributions of these variables are of interest. In addition, an SES
   index of socioeconomic level can be derived.)
Siblings - number, twins, ages, education, marital status, work status. (From
   these data an additional variable, birth ordinal position, can be
   derived.)
Family - size, living status, ethnic classification, race, SES.
Child - school information: grade placement; progress rate; absences;
   characterization as requiring special provision for hard of hearing,
   visually handicapped, speech therapy, orthopedically handicapped, gifted,
   slow learning, mentally retarded, emotionally disturbed; description in
   relation to adjustment, attention, interpersonal relations, discipline,
   popularity, intellectual ability, academic performance. (These data are
   worthy of some detailed analysis in order to formulate external rating
   criteria for independent test validation and to derive further indexes,
   such as peer rejection (based on interpersonal relations and popularity),
   general adjustment, and general adequacy (based on a frequency count of
   negative citation).
Child - medical history: prenatal and birth circumstances, food habits,
   enuresis, thumbsucking, age of walking, early talking, learning rate,
   attendance at kindergarten, experience of unconsciousness, bad burns (with
   resulting scars), serious illness, weakness, nightmares, sleeping
   arrangements, age at puberty (girls). (Frequency distributions of these
   items, particularly of food habits, which would also provide a basis for
   judging food idiosyncrasies, and sleeping arrangements, which should
   correlate with SES but may also relate to other variables, should be of
   great interest. Correlations of many of these items with other data may be
   extremely important, as, for example, the investigation of sequelae of
   early unconsciousness and the development of a growth retardation
   classification, a disturbance index, and a weakness index.)
Child - sensory and motor indexes: visual acuity, color vision, hearing
   indexes, handedness, grip strength, vital capacity, exercise tolerance.
Child - body measurements: height, weight, anthropometry, X-ray, dentition.
Child - psychophysiological indexes: blood pressure, temperature,
   electrocardiogram, phonocardiogram.
Child - medical findings: health status, pathology.
Child - psychological tests: IQ estimates; verbal ability level; performance
   ability level; reading, arithmetic, maturity level; adjustment index.

ANALYSES INDICATED

The organization and ordering of the lines of analysis suggested in this
section are tentative and are not intended to suggest priorities. In most
cases, further study of the literature in the particular areas and
consultation with qualified professional persons would be appropriate before
committing time and funds to particular studies.

Nevertheless, the richness of this databank is recognized as a source of new
scientific knowledge, and it is hoped that it can be adequately exploited.

Growth Indexes

It is expected that mean growth indexes for boys and girls will be computed
for as many functions as possible over the six age periods. Analysis of
relations among growth trends - separately for boys and for girls - and of
growth rate patterns would be of direct interest and would also permit
comparison of pattern indexes with psychological test scores. Sex differences
in growth patterns and relations of sex-related patterns to test scores are
also of great interest.

Other Factors Related to Test Scores

Discriminant pattern analyses might be undertaken systematically in a
multivariate design to investigate parental, sibling (including birth order
and twin resemblance for the twin sample), family, school, medical, sensory
and motor, anthropometric, psychophysiological, and medical correlates of
psychological test scores. While this recommendation may appear forbidding in
magnitude, the multivariate approach is actually more efficient and
economical in total perspective than piecemeal analyses. Among the studies
implied in this broad prescription are the following types of investigations:

1. Reading disability. Effects of visual and auditory impairment; handedness;
   SES; growth trends; developmental history; early, recent, and continuing
   emotional disturbance; illness; birth order, etc.
2. Mental retardation. Every item in the above enumeration is potentially
   related to mental retardation.
3. School retardation. Same as above.
4. Analyses of discrepancies between actual and predicted status in relation
   to concomitant or associated factors. These data offer an excellent
   opportunity to look for significant variance associated with
   overachievement and underachievement in school grade placement, reading
   achievement (WRAT and school report), scholastic achievement (school
   report, WRAT Arithmetic), and peer relations (deviation from central
   tendency).

While more detailed and specific investigations could be enumerated, it is
more constructive to emphasize the advisability of using the multivariate
approach, since computer equipment and programs are available for such
analyses and since results of greater value can be obtained at a far lower
unit cost.
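
The analysis of discrepancies between actual and predicted status (item 4
above) can be sketched as a residual computation. The Python example below is
not part of the original report. It assumes a single predictor and ordinary
least squares, whereas the report recommends a multivariate design; the
function name discrepancy_scores and all data values are hypothetical.

```python
from statistics import mean


def discrepancy_scores(predictor, actual):
    """Return actual minus predicted status for each child.

    predictor: e.g. an ability estimate for each child (assumed input).
    actual:    e.g. an achievement score for the same children.
    Positive values suggest overachievement relative to prediction,
    negative values underachievement.
    """
    if len(predictor) != len(actual):
        raise ValueError("predictor and actual must have the same length")

    mean_x, mean_y = mean(predictor), mean(actual)
    sum_sq_x = sum((x - mean_x) ** 2 for x in predictor)
    if sum_sq_x == 0:
        raise ValueError("predictor has no variance")

    # Ordinary least-squares slope and intercept for one predictor.
    slope = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(predictor, actual)
    ) / sum_sq_x
    intercept = mean_y - slope * mean_x

    # Residual: observed status minus the status predicted from the line.
    return [y - (intercept + slope * x) for x, y in zip(predictor, actual)]


# Invented scores for four children, for illustration only.
ability = [90, 100, 110, 120]
reading = [95, 100, 115, 118]
for child, gap in enumerate(discrepancy_scores(ability, reading), start=1):
    print(child, round(gap, 2))
```

The residuals sum to zero by construction, so they describe each child's
standing relative to others of similar predicted status rather than any
absolute level of achievement.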

Acknowledgments

The literature review and preparation of abstracts was under the immediate
direction of Samuel H. Cox, Research Associate at the Institute of Behavioral
Research. Principal persons assisting Mr. Cox were Robert M. Marx, John
McCrady, Henry Orloff, and Max S. Taggart II.

The project also was greatly expedited through the efforts of Miss Johnoween
Gill, Reference Librarian, Texas Christian University.

Without the loyal and competent help of these individuals this report could
not have been completed in only 3 months.

GLOSSARY OF ABBREVIATIONS

BD: Block Design subtest of the Wechsler Intelligence Scale for Children
CA: Chronological age
CAT: Children's Apperception Test
CMAS: Children's Manifest Anxiety Scale
CRT: California Reading Test
CTMM: Chicago Tests of Primary Mental Abilities
E-G-Y: Kent E-G-Y Test (Scale D, Kent Series of Emergency Scales)
FRPV: Full-Range Picture Vocabulary Test (by Ammons)
FS: Full Scale (or Full Score) of the Wechsler Intelligence Scales
g: General, or global, intelligence factor
HES: Health Examination Survey
IQ: Intelligence quotient
M: Mean
MA: Mental age
N: Number
ns: Not significant
PPVT: Peabody Picture Vocabulary Test
PS: Performance Scale (or Performance Score) of the Wechsler Intelligence tests
R: Range
r: Correlation
RT: Response time
SAT: Stanford Achievement Test
S-B: Stanford-Binet Intelligence Scale
SES: Socioeconomic status
SRA: Science Research Associates, Inc.
SRA-PMA: SRA Primary Mental Abilities
SS: Standard score
TAT: Thematic Apperception Test
Voc.: Vocabulary subtest of the Wechsler Intelligence Scales
VS: Verbal Scale (or Verbal Score) of the Wechsler Intelligence tests
WAIS: Wechsler Adult Intelligence Scale
WISC: Wechsler Intelligence Scale for Children
WRAT: Wide Range Achievement Test
U.S. GOVERNMENT PRINTING OFFICE: 1975 5E4-528/3E
OUTLINE OF REPORT SERIES FOR VITAL AND HEALTH STATISTICS
Originally Public Health Service Publication No. 1000

Series 1. Programs and collection procedures. - Reports which describe the general programs of the National
Center for Health Statistics and its offices and divisions, data collection methods used, definitions,
and other material necessary for understanding the data.

Series 2. Data evaluation and methods research. - Studies of new statistical methodology including: experimental
tests of new survey methods, studies of vital statistics collection methods, new analytical
techniques, objective evaluations of reliability of collected data, contributions to statistical theory.

Series 3. Analytical studies. - Reports presenting analytical or interpretive studies based on vital and health
statistics, carrying the analysis further than the expository types of reports in the other series.

Series 4. Documents and committee reports. - Final reports of major committees concerned with vital and
health statistics, and documents such as recommended model vital registration laws and revised birth
and death certificates.

Series 10. Data from the Health Interview Survey. - Statistics on illness, accidental injuries, disability, use of
hospital, medical, dental, and other services, and other health-related topics, based on data collected
in a continuing national household interview survey.

Series 11. Data from the Health Examination Survey. - Data from direct examination, testing, and measurement
of national samples of the population provide the basis for two types of reports: (1) estimates
of the medically defined prevalence of specific diseases in the United States and the distributions of
the population with respect to physical, physiological, and psychological characteristics; and (2)
analysis of relationships among the various measurements without reference to an explicit finite
universe of persons.

Series 12. Data from the Institutional Population Surveys. - Statistics relating to the health characteristics of
persons in institutions, and on medical, nursing, and personal care received, based on national
samples of establishments providing these services and samples of the residents or patients.

Series 13. Data from the Hospital Discharge Survey. - Statistics relating to discharged patients in short-stay
hospitals, based on a sample of patient records in a national sample of hospitals.

Series 14. Data on health resources: manpower and facilities. - Statistics on the numbers, geographic distribution,
and characteristics of health resources including physicians, dentists, nurses, other health
manpower occupations, hospitals, nursing homes, and outpatient and other inpatient facilities.

Series 20. Data on mortality. - Various statistics on mortality other than as included in annual or monthly
reports - special analyses by cause of death, age, and other demographic variables, also geographic
and time series analyses.

Series 21. Data on natality, marriage, and divorce. - Various statistics on natality, marriage, and divorce other
than as included in annual or monthly reports - special analyses by demographic variables, also
geographic and time series analyses, studies of fertility.

Series 22. Data from the National Natality and Mortality Surveys. - Statistics on characteristics of births and
deaths not available from the vital records, based on sample surveys stemming from these records,
including such topics as mortality by socioeconomic class, medical experience in the last year of
life, characteristics of pregnancy, etc.

For a list of titles of reports published in these series, write to: Office of Information
National Center for Health Statistics
Public Health Service, HRA
Rockville, Md. 20852
