

Visual Versions
Robert Schwartz
A Bradford Book
The MIT Press

Cambridge, Massachusetts
London, England
© 2006 Massachusetts Institute of Technology
All rights reserved. No part of this book may be reproduced in any form by any elec-
tronic or mechanical means (including photocopying, recording, or information stor-
age and retrieval) without permission in writing from the publisher.
MIT Press books may be purchased at special quantity discounts for business or sales
promotional use. For information, please email special_sales@mitpress.mit.edu or
write to Special Sales Department, The MIT Press, 55 Hayward Street, Cambridge, MA
02142.
This book was set in Stone Sans and Stone Serif by Graphic Composition, Inc. and was
printed and bound in the United States of America.
Library of Congress Cataloging-in-Publication Data
Schwartz, Robert, 1940–

Visual Versions / Robert Schwartz.
p. cm.
“A Bradford book.”
Includes bibliographical references and index.
ISBN-13: 978-0-262-19544-7 (hc. : alk. paper) — 978-0-262-69334-9 (pbk. : alk. paper)
ISBN-10: 0-262-19544-5 (hc. : alk. paper) — 0-262-69334-8 (pbk. : alk. paper)
1. Vision. 2. Visual perception. I. Title.
B846.S39 2006
121'.35—dc22
2006041931
To my brother Jerold, for his constant love and support.
An unrivaled sibling.
Contents
Sources
Preface
Introduction
I Berkeleian View of Vision
1 Seeing Distance from a Berkeleian Perspective
2 Size
3 Making Maximum Sense of “Minimum Sensibile”
4 Heterogeneity and the Senses
5 What Berkeley Sees in the Man Born Blind
II Inference
6 The Role of Inference in Vision
7 Making Occlusion More Transparent
8 Directed Perception
III Picture Perception
9 Representation and Resemblance
10 Pictures, Puzzles, and Paradigms
11 Vision and Cognition in Picture Perception
IV Missing the Real Point
12 The Concept of an “Object” in Perception and Cognition
13 Avoiding Errors about Error
14 Pluralist Perspectives on Perceptual Error
15 An Austinian Look at the “Objects of Perception”
Index
Sources
1. “Seeing Distance from a Berkeleian Perspective,” in Berkeley’s Metaphysics:
Analytical and Historical Essays, ed. R. Muehlmann, Pennsylvania State
University Press, 1995.
2. “Size,” in R. Schwartz, Vision: Variations on Some Berkeleian Themes, Black-
well Publishing, 1994.
3. “Making Maximum Sense of ‘Minimum Visible’,” presented at the Con-
ference on Berkeley’s Theory of Vision, International Eighteenth Century
Studies Society, Dublin, and at a Midwest Seminar in the History of Early
Modern Philosophy, Chicago, unpublished.
4. “Heterogeneity and the Senses,” presented at the “Berkeley for Today Con-
ference,” Rennes, France, unpublished.
5. “What Berkeley Saw In the Man Born Blind,” presented at International
Berkeley Conference, Tartu, Estonia, unpublished.
6. “The Role of Inference in Vision,” in Representation: Relationship Between
Language and Image, eds. S. Levialdi and C. Bernardelli, World Scientific, 1994.
“Addendum” in R. Schwartz, Vision: Variations on Some Berkeleian Themes,
Blackwell Publishing, 1994.
7. “Making Occlusion More Transparent,” unpublished.
8. “Directed Perception,” Philosophical Psychology, 9 Mar. 1996.
9. “Representation and Resemblance,” Philosophical Forum, Vol. 4, Summer
1974.
10. “Pictures, Puzzles, and Paradigms,” Philosophia Scientia, 2(1), 1997.
11. “Vision and Cognition in Picture Perception,” Philosophy and
Phenomenological Research, LXII, 3, 2001.
12. “The Concept of an ‘Object’ in Perception and Cognition,” in From Frag-
ments to Objects: Segmentation and Grouping in Vision, eds. P. Kellman and
T. Shipley, Elsevier Publishing, 2002.
13. “Avoiding Error About Error,” in Colour Perception: From Light to Object,
eds. R. Mausfeld and D. Heyer, Oxford University Press, 2004.
14. “Pluralist Perspectives on Perceptual Error,” Pluralism: Theory of Knowl-
edge, Ethics, and Politics, eds. G. Abel and H.J. Sandkühler, Meiner Pub., 1996.
15. “An Austinian Look at the ‘Objects of Perception,’” unpublished.
Preface
The writings contained in this volume are all on topics in the theory of vision.
Five are new. The remaining selections have appeared in print, albeit a num-
ber in conference proceedings or volumes not readily accessible. In addition,
many are in books or journals tending to have a readership of either phi-
losophers or psychologists, but not both. I hope this collection can bridge
these gaps. Brief excerpts from my book, Vision: Variations on some Berkeleian
Themes, and two articles sketching ideas further explored there are reprinted.
These excerpts and papers provide both continuity and background to other
readings.
All the published essays appear without significant changes. Citation in-
formation, now available, is given, and typographic errors are corrected,
when caught. Any additions to the published works are indicated with an as-
terisk, and the new material appears in brackets, *[ ]. Each selection begins
with a prescript intended to set the context for the selection. The prescripts
were not part of the original works. Most of the non-historical papers written
since 1996 stem from a project on perception sponsored by the Center for
Interdisciplinary Research (ZiF) at the University of Bielefeld. I gained much
from the year of continuous discussions and debates with members of the
project. It was also a lot of fun.
Acknowledgments of help are found in the individual selections. They do
not reflect, however, my almost weekly conversations with Sidney Morgenbesser
and the insights and enlightenment he offered. The results of his probing
questions and challenges show up in many of the essays. I also profited
from Sidney’s incredible storehouse of knowledge. I never had to rely on
search engines to find references germane to my interests. Sidney was much
more effective. I will miss his philosophical acumen and so much more his
friendship.
Introduction*
The selections in this volume are grouped into four sections. By and large the
pieces can be read independently. There are, however, issues and arguments
that cut across these boundaries. There are, too, commonalities of concern
and approach that run throughout the collection. An overarching commit-
ment to pluralism and irrealism along the lines of William James, John
Dewey, and especially Nelson Goodman is presumed but not explicitly de-
fended.1 I think the advantages of adopting these stances in the study of
vision are considerable. Ill-imagined problems can be ignored, and fruitless
controversies avoided. The assumptions and intuitions they depend on are
fragile. Confronted with conflicting empirical evidence or theoretical needs,
conceptual certainties either crumble or are rendered irrelevant to substan-
tive issues.
The title of this collection, Visual Versions, is meant to indicate the
cross-currents among the selections and the philosophical presuppositions
that underlie them. The following short summaries of the sections attempt to
highlight these matters.
Section I: The Berkeleian Perspective
As argued in my book, Vision: Variations on some Berkeleian Themes (VVBT), I
believe the study of the history of vision theory can illuminate current issues.
These earlier works set the contours of the problematic, shaping the future of
both the questions asked and the research undertaken to answer them. More-
over, these historical works often contain useful distinctions and valuable in-
sights not found in contemporary studies. I am also convinced that an
appreciation of modern developments in the theory of vision can shed light
on its history. Recent experimental and theoretical works provide perspective,
knowledge, and tools enabling a better understanding of both the problems
faced in the past and the theories then developed in response.
The papers in section I all revolve around issues discussed in the writings
of Bishop Berkeley, particularly in his New Theory of Vision (NTV). Berkeley’s
theory had a pervasive influence on the science of vision far into the twenti-
eth century. Many of his basic assumptions remained firmly in place until the
1950s, and others remain today. Chapter 1 and chapter 2 explicate Berkeley’s
views on spatial perception and draw lessons that have import for ongoing
studies of distance and size perception. Chapter 3 and chapter 4 deal with
Berkeley’s treatment of qualitative aspects of perception. Exploring how
Berkeley’s theory fits in with later studies on sensory orders and measure-
ment serves to clarify his account. It serves, as well, to abate much of the mys-
tery and implausibility commentators find in Berkeley’s claims about the
senses. Chapter 5 explores the implications of the preceding selections for
Berkeley’s response to Molyneux’s question and for his extensive use of “man
born blind” thought experiments throughout NTV.
In NTV, Berkeley develops what came to be referred to as a motor theory of
perception. Visual ideas serve as signs of tangible experience. Ideas of touch
and movement provide visual experience with spatial meaning. In NTV,
however, Berkeley does not distinguish the tangible from the physical. In
subsequent works Berkeley expounds a more full-fledged Idealism. There is
no mind-independent physical world; there are only phenomenal versions.
Perceptual success lies in mastering the correlations among and between the
ideas both modalities provide. On this account, visual experience is ontologically
on a par with the deliverances of touch, although the latter remains pragmatically
more important. The phenomenal properties, qualities, and patterns
of visual experience are constituents of reality. The way the world visually
appears is one way the world is. Tangible experience provides the ingredients
of another.2
Section II: Perceptual Inference
The view that perception is inferential, and thus indirect, has a long history,
and debate about it has not died down. Berkeley’s position on perceptual in-
ference is obscured by a terminological ambiguity in his writings. Berkeley
appears first to accept and then to reject the claim that vision is inferential.
But the notion of inference he initially countenances is inductive association,
not calculative reasoning. In later work, Berkeley recommends using
the term “suggestion” for the former and “inference” for the latter. Nowadays,
appeal to either kind of process tends to be associated with inference
theories of vision.3
In contemporary psychological work on vision, J. J. Gibson’s groundbreak-
ing attempts (beginning with Gibson 1950) to reconceive the whole problem
of space perception have been pivotal. Gibson maintains that perception is
direct; no processes of inference are involved. Gibson goes on to argue that
his theory supports a doctrine of perceptual Realism. He believes it stands in
stark contrast to the paradigm of indirect perception whose postulated visual
intermediaries preclude direct or immediate access to reality.
The chapters in this section explore the development and significance of
the inference controversy. Chapter 6 surveys competing characterizations
of the idea of inferential processes. Lack of agreement here is a root cause of
much of the dispute. Opponents are largely talking past one another. The ad-
dendum, an excerpt from VVBT, expands on one prominent characterization
of inference discussed in the selection. Chapter 7 examines the widely ac-
cepted claim that occlusion is a potent cue for depth perception. Cue theories
of vision have traditionally been said to be inferential and indirect. Chapter
8 critically examines James Cutting’s Gibsonian-inspired model of directed
perception. It discusses Cutting’s understanding of the inference dispute and
examines the contribution his own model of vision brings to the issue.
In one way or another, the claim that perception is inferential implies that
there is a clear, plausible distinction between visual premises and visual con-
clusions. What is given to the visual system is one thing; what is taken from it
is another. Justifying any fixed line between visions experienced and versions
inferred, though, is problematic. For some purposes a state may be deemed
“given,” the starting point. For other purposes it can with equal legitimacy be
thought of as “taken,” a stage that goes beyond the given. The distinction is
not stable. It depends on the contexts and aims of inquiry. Boundaries drawn
on the basis of epistemic and ontological considerations alone are not suffi-
ciently constrained to justify setting a unique border.
Section III: Picture Perception
The ability to understand and appreciate pictorial representations raises issues
that engage philosophers, psychologists, and art theorists, along with
scholars in a number of related fields. Interpreting pictures, it is said, is quite
unlike interpreting the written word, since pictures resemble what they depict
and words do not.4 Although a resemblance account of pictorial representation
has long been prominent, it does have its critics. Nelson Goodman’s
challenge to such theories is one of the most thorough and most contested.
In Languages of Art, Goodman offers a highly original alternative model of
pictorial representation. He locates pictures as one among a myriad of sym-
bolic systems and distinguishes pictures from language and other systems in
terms of syntactic and semantic properties.
My debt to Goodman’s ideas in section III should be obvious. At the same
time, I appreciate why so many find his account difficult to accept. The intu-
ition that pictures look like what they represent has a powerful, undeniable
appeal, while the proposal that pictures function as a language runs against
the grain. The papers in this section maintain that it is necessary to over-
come the tug of resemblance theories. Each selection, however, attempts to
alleviate qualms raised by Goodman’s model. Proponents of the symbolic
analysis of pictorial representation do have the wherewithal to address these
misgivings. The model may also put the study of picture perception on a
firmer foundation.
Chapter 9 both challenges resemblance theories of pictorial representa-
tion and tries to explain what is right and wrong with various intuitions un-
derpinning this approach. Chapter 10 contrasts the resemblance paradigm of
pictorial representation with Goodman’s symbolic paradigm. A case is made
that the symbolic model can help resolve a set of puzzles actively under study
in picture perception research. In spite of these benefits, it is hard to dislodge
the conviction that the symbolic paradigm cannot account for the essential
visual nature of pictorial representations—a feature thought crucial to the
distinction between pictorial and linguistic representations. Chapter 11 at-
tempts to accommodate this visuality intuition within the framework of the
symbolic model.
Failure to appreciate how sight provides both visions and versions lies at
the heart of several misunderstandings plaguing research in picture percep-
tion. Those rejecting the symbolic model argue that understanding language
is a two-step process. We see words and then have to cognitively interpret
them. In this context, picture perception is considered a one-step process.
Vision by itself can access a picture’s depictional content. Unlike linguistic
representations, pictures are not read; they are simply seen.5
Section IV: Missing the Real Point
The appearance of the word “real” or its cognates is a sure sign of trouble. Qual-
ifying a property or formulating a problem in these terms tends to turn reason-
able issues into metaphysical quagmires. True, confronting such conundrums
does make it appear that deeper matters are being engaged. However, more
often than not, the more metaphysically a topic develops the less focused it
becomes. The issue at stake grows unclear and distanced from empirical and
theoretical considerations that can rein in philosophical intuitions. Questions
that start out with real substance are replaced with pursuits resisting
closure. Unfortunately, enticed by the seeming foundational significance of
the real, perceptual psychologists often join the metaphysical quests. In dif-
ferent ways, the selections in this section urge both philosophers and psy-
chologists to resist the temptation.
A common response to the views expressed in section IV is that they miss
the real points of the debates. I do not deny the charge. The central aim of
these papers is to question the statement, empirical content, and, at times,
coherence of the supposed issues, as well as raise concerns about the ground
rules for arguing and settling them.6
Chapter 12 looks at recent psychological research on object perception. It
would seem that a prerequisite for such studies is having a reasonably precise
notion of an object to structure the research. But the notion of an object em-
ployed in much of this work does not seem up to the tasks assigned it. Chapter
13 and chapter 14 examine empirical research and theoretical positions that
depend on assumptions about the essential nature of color or what colors really
are. Reservations are expressed with both the content and goal of these projects.
The first essay in this section, chapter 12, explores issues germane to psycholog-
ical studies of object perception. It is fitting that the final selection returns to the
old chestnut of the philosophy of perception: “What are the real objects of perception?”
Austin’s attempt to dissolve the problematic is clarified and endorsed.
A number of assumptions are primarily responsible for problems examined
in this section:
1. Visual experience is subjective; thus it can only present the world as it ap-
pears, not as it is.
2. There are objective versions of the world that do capture reality in its
mind-independent, ready-made form. One version of these, physics, is basic
and privileged.
3. Any account of the qualitative aspects of visual phenomena must in the
end be explained naturalistically, in terms beholden only to the physical.
4. There is a genuine question about the real nature of what the visual system
passes on to cognition. Either what is transmitted lacks content and is epis-
temically inert, or visual experience is propositionally or conceptually pack-
aged and encounters difficulty providing the neutral evidence needed to
ensure objectivity.
5. Intuitions of conceptual necessities—essential or constitutive—must be
honored.
Each of these assumptions is at odds with this volume’s pluralist, irrealist
commitments noted earlier. Such commitments appear mainly as an undercurrent
in the previous sections of this volume. They are on the surface and
play a more critical role in the readings of section IV.
Notes
* Historical and contemporary materials on the topics discussed in this collection can
be found in my anthology, Perception. Unless otherwise indicated in the selection, all
references to Berkeley’s writings are to be found in The Works of George Berkeley, Bishop
of Cloyne (9 vols), eds. A. A. Luce & T. E. Jessop, Edinburgh: Thomas Nelson, 1948–57.
1. Goodman’s paper “Words, Works, Worlds” is a concise, trenchant statement of this
perspective. I explore and try to justify similar pluralist and irrealist ideas in Schwartz
1985, 1986, and 2000. See also Goodman’s 1977 programmatic remarks about the sig-
nificance of phenomenalist systems of analysis and the unimportance of privileging
either physicalist or phenomenalist systems.
2. I have not included in this volume efforts to integrate Berkeley’s theory of vision
with his Idealism and related philosophical concerns.
3. Berkeley’s account is quite similar to the one found in H. Helmholtz. Helmholtz is
generally cited by vision scientists as the modern founder of an inference model of
perception. (See Schwartz 1994.)
4. For some recent papers, see Hecht et al. 2003.
5. For an account of how pictorial representations can affect conceptions of the world,
see Schwartz 1985. In battles over the role of imagery in cognition, images are usu-
ally identified with pictures and said to function like pictorial representations. I dis-
cuss the implications of the symbolic paradigm for issues concerning this analysis in
Schwartz 1982.
6. In a review of David Marr’s book, Vision, (Schwartz 1985) I mention related reserva-
tions with certain philosophical uses made of his ideas.
References
Goodman, N. (1977) The Structure of Appearance. Indianapolis: Bobbs-Merrill.
———. (1978) “Words, Works, Worlds” in Ways of Worldmaking. 1–22. Indianapolis:
Hackett Publishing.
Gibson, J. J. (1950) The Perception of the Visual World. Boston: Houghton-Mifflin.
Hecht, H., R. Schwartz, and M. Atherton (eds.) (2003) Looking into Pictures. Cambridge:
MIT Press.
Schwartz, R. (1982) “Imagery: There’s More to it than Meets the Eye,” in Imagery,
N. Block (ed.), 109–29. Cambridge: MIT Press.
———. (1985) “The Power of Pictures.” Journal of Philosophy 82: 711–720.
———. (1985) “Review of D. Marr, Vision.” Philosophical Review 94: 411–414.
———. (1986) “I’m Going to Make You a Star.” Midwest Studies in Philosophy 11: 427–439.
———. (1994) Vision: Variations on some Berkeleian Themes. Oxford: Blackwell Publishing.
———. (2000) “Starting From Scratch: Making Worlds.” Erkenntnis 52: 151–159.
———. (2004) Perception. Oxford: Blackwell Publishing.

I Berkeleian View of Vision
Prescript 1
This paper surveys ideas developed further in chapter 1 of VVBT. It explains
Berkeley’s view that the spatial meaning of visual experience lies in its links
to the tangible. An explication of Berkeley’s much misunderstood and criti-
cized account of distance perception is offered, and a defense of his claim that
“distance is not immediately perceived” is proposed.
These days the philosophical literature is awash with competing views on
the content of visual experience. The answer to this question is thought to
have major implications for epistemology and the philosophy of mind. Much
of the controversy, though, turns on conflicting notions of “experience” and
“the conceptual” that are quite divorced from the study of spatial perception.
Berkeley’s theory of vision is a forceful reminder that solutions to problems
about perceptual content that do not take account of vision’s major role in
guiding behavior (of the cognitively endowed or deficient) are likely to come
up short.
1 Seeing Distance from a Berkeleian Perspective
Although Berkeley’s An Essay Towards a New Theory of Vision contains a probing
examination of a range of topics in vision theory, the aspect of this work
most discussed and criticized has been his account of distance perception.
Now, while many of these criticisms have some point, I believe that readings
of Berkeley often misconceive the significance of crucial aspects of his psy-
chology of perception and fail to appreciate the full force of his problems and
proposals. Perhaps the extent to which Berkeley’s ideas have been differently
understood and received can be highlighted by comparing a few quotations
from representative philosophical and psychological works. Consider first
the contrasting remarks of Alan Donagan,
Although Berkeley’s theory of vision was generally received as true for over a century,
so much of it depends on the false proposition that distance cannot be immediately
seen that it has long been discredited,1
and Julian Hochberg,
The most influential theory of space perception in Western thought has been that dis-
tance is not a direct visual sensation at all. Instead . . . memories of the grasping or
walking motions that have been made in the past . . . provide the idea of distance.2
Donagan, along with numerous other commentators, is convinced that the
idea that distance perception is not immediate “has long been discredited.”
Yet if one turns to a standard psychological text, such as Hochberg’s, one finds
a much different assessment of this claim.
The following selections from George Pitcher and Hermann von Helmholtz
are likewise in sharp contrast. Pitcher writes that
[W]hatever a person immediately (or directly) sees he has incorrigible knowledge
of. . . . Berkeley is firm in his espousal of [this]. . . . Many philosophers through the ages
have certainly accepted something like it as axiomatic.3
And here is Helmholtz:
We are not in the habit of observing our sensations accurately. . . . Thus in most cases
some special assistance and training are needed in order to observe these subjective
sensations.4
Pitcher is right when he says that many philosophers have taken it as ax-
iomatic that we have incorrigible knowledge of our sensory states. But Helm-
holtz’s account of our ability to report on our sense experience better reflects
the position of most visual theorists working in the Berkeleian tradition, in-
cluding, I would argue, Berkeley himself. The next quotations provide an-
other striking case of conflicting viewpoints. Bertrand Russell insists that
Berkeley’s theory of vision, according to which everything looks flat, is disproved by
the stereoscope.5
But James Sully demurs:
Some years ago it was commonly thought that, thanks to the argument of the Berke-
leyans, aided by experiments of Wheatstone and others, the derivative nature of visual
space was amply demonstrated.6
Russell has been joined by other critics in citing Wheatstone’s invention of
the stereoscope as damaging to Berkeley’s line of thought. As Sully points
out, however, developments in vision theory support no such conclusion. In-
deed, many of the early stereoscope experiments were taken to strengthen
Berkeley’s position.
The passage in the New Theory that has been the subject of severest criticism
appears right at the beginning. In section 2 Berkeley says, “It is, I think, agreed
by all that distance, of itself and immediately, can not be seen.” In section 11
he goes on, “[I]t is plain that distance is in its own nature imperceptible.” In
considering these passages I think it important to separate several issues that
can be easily run together: (i) Berkeley’s account of our ideas of distance,
(ii) the claim that ideas of distance gained by sight are not immediate, and
(iii) the claim that, in and of itself, distance is imperceptible by sight. While
critical discussions tend to focus on (ii), Berkeley himself is mainly concerned
with (i) and (iii). As Berkeley says, (ii) was generally accepted by all.
For Berkeley and for other vision theorists, the claim that some idea is not
immediate is an empirical claim about the process that leads to our having
that idea. Ideas are “not immediate” when they are the result of operations
that involve the processing of mental items. In contrast, immediate ideas are
ideas brought to mind by purely nonmental goings on. The processes that
underlie immediate ideas are, on this score, like those that underlie the out-
put of our kidney or liver; they are entirely organic or physiological in nature.
In much of the literature on vision, what Berkeley calls “immediate ideas” are
also referred to as “sensations.”
Berkeley’s own version of what makes a process mental is closely tied to the
then long prevalent identification of mental states with conscious states.
Mental processes were understood to involve manipulating ideas, which were
themselves assumed to be states of consciousness. In particular, then, the claim
that we do not see distance immediately amounts to the claim that the ideas
of distance, derived from sight, depend on mental operations; that is, they are
brought to mind via intermediate ideas.
As Berkeley notes, the claim that distance evaluation depends in this way
on the registering of pictorial and other cues was widely accepted. It was
thought to be a trivial consequence of the one-point argument. “For distance
being a line directed endwise to the eye, it projects only one point in the fund
of the eye, which point remains invariably the same, whether the distance is
larger or smaller” (NTV 2). But if distance perception is not immediate, which
aspects of vision might fall under the label “immediate”? Here matters have
been hotly contested throughout the modern history of visual studies. It
might seem, for example, that color or neutral color (the black-to-white scale)
are obvious candidates. What color or neutral color we perceive is simply de-
termined by the interplay between the properties of light and the physiolog-
ical nature of our visual receptors. No mental work is needed. Yet this sort of
explanation has its problems.
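The geometry behind the one-point argument quoted above can be made concrete
with a small sketch. It assumes an idealized pinhole eye; the function names,
the focal length, and the coordinates are illustrative assumptions and come
from nothing in Berkeley's text. Points lying at different distances along a
single line through the pinhole all project to the same retinal point.

```python
def project_to_retina(x, y, z, focal_length=1.0):
    """Project a 3-D point through an idealized pinhole at the origin.

    z is the distance along the line of sight; the result is the 2-D
    position on an image plane `focal_length` behind the pinhole.
    (Hypothetical model, for illustration only.)
    """
    if z <= 0:
        raise ValueError("point must lie in front of the eye")
    return (-focal_length * x / z, -focal_length * y / z)


def points_along_ray(direction, distances):
    """Return points at the given distances along a ray from the pinhole."""
    dx, dy, dz = direction
    return [(dx * d, dy * d, dz * d) for d in distances]


# Three points on one line "directed endwise to the eye", near to far.
ray = (0.25, 0.5, 1.0)
images = [project_to_retina(*p) for p in points_along_ray(ray, [1.0, 2.0, 8.0])]

# The retinal image carries no trace of which distance produced it.
same_image = all(img == images[0] for img in images)
```

On this toy model the projection discards exactly the quantity at issue, which
is why distance was held to require something beyond the retinal point itself.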
A piece of coal in sunlight looks black, while a lump of sugar indoors looks
white. The sunlit coal, however, reflects more white light than the sugar.
Treating such phenomena as sensations may thus seem problematic, since
there is no direct correlation between the stimulus intensity and the experi-
enced quality. Roughly, two types of theories have been offered to explain the
phenomena. On the psychic, or cognitive, theory, it is claimed that we im-
mediately experience a sensation that corresponds to the absolute value
of the light. The coal immediately appears white. But then our visual system
takes into account the high level of illumination. This combination of infor-
mation triggers a memory trace of a black quale, which we then experience.
The alternative approach claims that no such mental operations are neces-
sary. According to this view, the stimulus is not the absolute intensity of the
light but the ratio of the light intensities coming from the object and those in
its environment. The constant black color of the coal under different illumination
is determined by the constant ratio of the stimulus intensities. It is
immediate, a matter of sense.
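The contrast between the two theories can be put numerically. The sketch
below uses made-up reflectance and illumination figures; every value and
function name is an assumption chosen for illustration. It shows how the
absolute intensity reaching the eye from sunlit coal can exceed that from
indoor sugar, while the ratio of each object's intensity to that of its
surroundings stays fixed across illuminations.

```python
def reflected_intensity(reflectance, illumination):
    """Light reaching the eye: fraction reflected times incident light."""
    return reflectance * illumination


def ratio_to_surround(obj_reflectance, surround_reflectance, illumination):
    """Object intensity relative to its surround under shared illumination."""
    obj = reflected_intensity(obj_reflectance, illumination)
    surround = reflected_intensity(surround_reflectance, illumination)
    return obj / surround


# Hypothetical values in arbitrary units.
COAL, SUGAR, SURROUND = 0.05, 0.8, 0.4
SUNLIGHT, INDOORS = 10000.0, 100.0

# Absolute intensities: the stimulus according to the psychic theory.
coal_in_sun = reflected_intensity(COAL, SUNLIGHT)
sugar_indoors = reflected_intensity(SUGAR, INDOORS)

# Ratios to the surround: the stimulus according to the alternative theory.
coal_ratio_sun = ratio_to_surround(COAL, SURROUND, SUNLIGHT)
coal_ratio_indoors = ratio_to_surround(COAL, SURROUND, INDOORS)
sugar_ratio_indoors = ratio_to_surround(SUGAR, SURROUND, INDOORS)
```

With these figures the sunlit coal sends more light to the eye than the indoor
sugar, yet its ratio to the surround is the same in both settings and lower
than the sugar's. That is the pattern the ratio account needs in order to
treat the coal's blackness as a matter of sense.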
Similar conflicting approaches turn up in discussions of size and other spa-
tial properties. Consider the moon illusion. Although the size of the retinal
image of the moon is the same at its zenith and on the horizon, the moon
seems bigger on the horizon. For Berkeley the number of minimum visibilia
is the same, but we read through our immediate ideas and see the moon
differently in the two situations. In recent years, critics of this psychic ap-
proach, most prominently Gestaltists and Gibsonians, have argued that the
visual appreciation of size is simply triggered by higher-order properties of
the stimulus and is not dependent on intermediate sensations of the sort
Berkeley and others propose. Examples of these contrasting approaches, psy-
chic versus organic, could be multiplied, but this is no place to consider the
merits of each.7
If Berkeley’s use of the distinction between immediate and nonimmediate
ideas is continuous with that characteristic of work on vision both before
and after the New Theory, it might best be understood to incorporate the fol-
lowing features:
(1) Immediacy depends on the type of processing involved, not on the kind
of idea. Even to sight, certain cases of color perception, for example, need not
be immediate.
(2) The “immediate” notion does not match up with our ordinary-language
“looks,” “appears,” and “seems” locutions. The sunlit coal looks black and the
moon appears bigger on the horizon, but neither is immediate according to
psychic theories.
(3) What is immediately seen does not correspond to judgments that are
noncommittal regarding how things actually are in the world. We can protect
against factual error by claiming that the cat seems to be three feet away and
not asserting that it is three feet away, just as we can avoid commitment to the
real color of the fire engine by saying only that it looks red. Nevertheless, the
red look for Berkeley is immediate, but the three-feet-awayness is not. And
on the classic accounts of neutral color and size we are not reporting what we
immediately see when we speak guardedly and only say that the sunlit coal
seems to me to be black or the moon seems to me bigger on the horizon.
(4) Immediate ideas of sense did not typically have the epistemological status
they took on in twentieth-century philosophical discussions of the founda-
tions of knowledge and the mind/body problem. For Berkeley, as well as later
theorists, although our immediate experiences are mental states, we are not
necessarily able to report accurately on them, and they are not incorrigible.
As for the status of distance perception, the one-point argument convinced
Berkeley, along with most everyone else, that seeing distance was a two-stage
process. In vision, ideas of distance come to us by way of the prior registering
of distance cues. In this assumption, Berkeley was in accord with the optics
writers of his day as well as with most vision theorists who followed. Berke-
ley’s disagreement with the optic writers was over the nature of our ideas of
spatial distance and over the particular kind of mental processing involved in
vision. It was not over whether nonorganic or psychic operations were re-
quired for distance perception: As the psychologist James J. Gibson critically
remarked not long ago, the one-point argument “states the problem of per-
ception of the third dimension, or depth perception, as it has been studied . . .
for over 250 years.”8 Although Berkeley may be most remembered for saying
that distance is not immediate, his more original and controversial ideas
in the study of vision are found elsewhere. Recall, Berkeley also claimed that,
in and of itself, distance is imperceptible to sight. Our visual experience lacks
any inherent qualities of spatiality from which we could derive our ideas of
space. This latter claim, although related, is different from the claim that spa-
tial perception is nonimmediate. To see this, consider again our perception of
the black coal in sunlight. According to the psychic theory this is a two-stage
process; the black color is not immediately perceived. Yet this black color idea
is an idea of sight, and under more standard lighting conditions a black color
could be an immediate sensation.
Berkeley maintains that the situation is different in the case of distance.
Our idea of distance is not a visual idea at all, nor is it a construct of visual
ideas, nor is it in any way derivable from visual experience by reason, simi-
larity, or analogy. Our concept of distance in general is derived from move-
ment experience, not sight, and the content of any specific distance idea is
entirely tangible. For Berkeley, distance is not a property of our visual experi-
ence, just as color and distance are not properties of our olfactory field. We
may be able to tell by the lemony smell that the object is yellow, but yellow is
not a quality of the odor. Similarly, as the lemony smell gets stronger, we may
be able to tell that the object is approaching, but distance is not a property of
smell. We could not, moreover, acquire our ideas of color or distance if all we
had to go on was smell.
Intuitively, however, vision seems different from smell; there appears to be
something inherently spatial to our visual sensations. But according to Berkeley,
this everyday, “vulgar” intuition is incorrect. A major reason for Berkeley’s
claim that distance, in particular, is not an attribute of our visual field
comes from his understanding of the implications of the one-point argument,
and in this he was again joined by most theorists. What was more controver-
sial and more original was his further claim that vision lacks the wherewithal
to provide us with any of our ordinary (physical) spatial ideas, and this
includes ideas of size, shape, orientation, and direction. Berkeley does not,
however, subscribe to the doctrine some others were to adopt, that our visual
field has no intrinsic order. For Berkeley, it does not follow from an allowance
that our visual field has inherent structure that it makes sense to treat that
field as a spatial realm to which our ordinary geometric ideas can be mean-
ingfully applied or from which they can be derived. *[See chapters 3 and 4 for
elaboration and clarification.]
The dilemma of the inverted image is an important case in point. We say
that the man looks erect, but then are puzzled by the fact that the retinal im-
age is inverted. The puzzle dissolves when we realize that it makes no sense
to describe our phenomenal field as itself erect or upside down, as if it were lo-
cated in the same space as the retinal image and could be compared to it with
respect to some common idea of spatial orientation. We can, of course, come
to use visual information to determine whether an object is up or down, but
this depends on correlations with the tangible. We could not develop our
ideas of spatial orientation from visual experience alone. Such experience
lacks any intrinsic qualities of spatial upness or downness to serve as a basis for
acquiring these ideas. The same holds for our ideas of right and left. Our use
of spatial terms to describe our phenomenal field is not to be taken literally.
It is derived from our habits of interpreting the tangible significance of our
visual experience.
Berkeley’s approach to the supposed distance properties of vision is of
a piece. Berkeley does not claim that our visual experiences are flat (spatially
two-dimensional) rather than voluminous (spatially three-dimensional), a
claim that many did take to be a consequence of the one-point argument. In-
stead, he says the claim that the immediate objects of perception are planes
and not solids makes no sense. His reasons for holding this position, I think,
are not quite D. M. Armstrong’s: “[F]latness presupposes the existence of
three dimensions, for it is only surfaces which can be said to be flat or not flat,
and surfaces must be surfaces of volumes, and volumes are three dimensional.
Now Berkeley denies that objects are immediately seen as three dimensional,
and so he must deny they are seen flat.”9 Nor, I believe, would Berkeley dis-
tinguish the case of location from that of distance in the way Armstrong sug-
gests: “I can see immediately that the man is to the left of the tree, and that
the leaves of the tree are above its trunk (more strictly, all I immediately see
are certain man-like, leaf-like, and trunk-like colored shapes arranged in this
way), but I can not immediately see that the tree-like shape is more, or less,
distant than the man-like shape.”10 Berkeley claims, instead, that our visual
field, like our olfactory field, lacks anything comparable to our ideas of both
spatial distance and spatial direction. With regard to distance, however, “all
agreed.” A point anywhere along a line of sight projects the same point on our
retina whether near or far. There is no presentation of the third dimension per
se in the stimulus and, in turn, in our visual field. There is nothing in our
visual field, for example, that increases in size as the distance of the point in-
creases. *[Note, with respect to Berkeley’s Idealist position, a two-dimensional,
mind-independent world is no more welcome than a three-dimensional, mind-
independent world.]
This version of the one-point argument does not depend, as has often been
claimed, on the assumption that distance cues are necessarily ambiguous.
Cues could be unambiguous (e.g., brightness could vary directly with dis-
tance) without affecting Berkeley’s main point here. No matter how unam-
biguously such brightness ideas corresponded to distances, they would not
themselves be ideas of distance. We cannot, therefore, acquire distance ideas,
as we acquire color ideas, on the basis of visual experience alone. A spirit with
sight but no tangible sense could not have our ordinary ideas of space (see
NTV 153–59). Talk of the voluminousness or distance properties of our visual
experience is strictly derivative, reflecting the spatial or tangible significance
we have come to assign to visual phenomena.11
But then, did not the invention of the stereoscope and experiments on reti-
nal disparity show that Berkeley and those who agreed with him were mis-
taken? Many critics have assumed that these findings overturn or severely
challenge Berkeley’s theories. Such claims, however, are particularly puzzling
when one looks at the actual developments in the scientific study of vision. As
Sully reminds us, many prominent theorists (including, to an extent, Wheat-
stone himself) took the stereoscope experiments to support Berkeley’s views.
Why the discrepancy? In order to answer this question, I think it necessary
to separate again Berkeley’s different claims about the nature of distance per-
ception [(i), (ii), (iii) above].
Perhaps the easiest misunderstanding to clear up is the idea that Wheat-
stone’s invention proved that distance perception is immediate. For a long
while it had been known that, within a limited range, objects at different dis-
tances from the viewer project noncongruent images on the retina. Only ob-
jects on the plane of focus strike corresponding points on both retinas; the
retinal projections from all other objects strike disparate points (see Figure 1.1).
What the stereoscope showed was that the disparity of the images did
indeed affect or play a role in distance perception. It did not undermine the
one-point argument; rather, it indicated that there was another cue, retinal
disparity, that vision could and did tap in trying to work out distance rela-
tions. According to most models of binocular vision, this was taken to mean
that the visual system first registers disparity information and then uses it to
derive distance. The model was a two-stage operation, and in this way not dif-
ferent from the nonimmediate processing models found in dealing with pic-
torial and kinesthetic cues to distance.
Figure 1.1
Retinal disparity: the distance y–x is less than the distance y'–x'.
In fact, experiments with the stereoscope were used to argue in favor of a
two-stage solution to another problem that was most prominent in Berkeley’s
time and thereafter. This is the problem of accounting for the fact that we do
not see double even though each eye is capable of producing its own visual
experience.12 According to one account, the organic model, we are wired so
that nerve impulses from corresponding retinal points come together and
merge into a single impulse that then travels to higher brain centers, trigger-
ing but a single experience. The fact that objects not on the focal plane do
not project to corresponding retinal points, therefore, poses a challenge to
organic models of single vision. Moreover, workers like Helmholtz thought
they could demonstrate by means of stereoscope experiments that fusion
does not occur at a neural level and that we do have the distinct experiences
associated with each eye. “These experiments show . . . the content of each
separate field comes to consciousness without being fused with the other
field by means of organic mechanisms; and that, therefore, the fusion of the
two fields in one common image, when it does occur, is a psychic act.”13
If the invention of the stereoscope did not demonstrate that distance per-
ception is immediate, did it not at least deal a blow to Berkeley’s further claim
that distance is not a quality of visual experience? Anyone who has looked
through a stereoscope has experienced the difference between the volumi-
nous quality of these pictures in contrast to the flatness, or two-dimensional
quality, of ordinary pictures. So how, in light of this, could Berkeley maintain
that distance is not an attribute of our visual experience?
Berkeley, I think, would not have denied that the stereoscope scenes look
different or are experienced differently from single pictures. He was obvi-
ously aware that in ordinary vision we see distance better, and our experience
seems more voluminous, when we use two eyes. The reason is that in binocular
vision we have powerful, additional cues, for example, convergence, to aid
in assessing distance. The stereoscope showed that there is one more cue,
binocular disparity, that could help. We have noted, too, that Berkeley did not
claim that our visual field was or looked planar. He says, in fact, that we will
derivatively describe as solid, not planar, those visual experiences that we in-
terpret three-dimensionally. Thus, since disparity enhances our appreciation
of distance, it is not surprising that visual experiences that include disparity
among their cues are described, derivatively, as being more voluminous.
Still, though the stereoscope experiments did not refute Berkeley’s posi-
tion, why were they taken by many to support his ideas, in particular, his
claim that vision lacks spatial properties? Here issues are more complex, and
I can only begin to sketch out the considerations that were operative. By the
time Wheatstone invented the stereoscope, perhaps the major schism in vision
research was over the issue of innateness. On one side there were those who,
like Berkeley, claimed that our spatial ideas were derived from sense experi-
ence. On the other side were those who saw themselves as heirs to the “Kant-
ian” tradition and were convinced that we could not acquire our ideas of space
by means of sense. Our ideas of space were an innate imposition of mind. Not
only vision, as Berkeley claimed, but our senses in general were thought to be
inadequate to supply us with our spatial framework. “[T]here is a quality pro-
duced out of the inward resources of the mind, to envelop sensations which,
as given originally, are not spatial. . . . This last is the Kantian view.”14 In turn,
distance perception was not, as Berkeley and others proposed, learned.
On just about every aspect of space perception debates raged over whether
the phenomenon was innate or acquired. The stereoscope experiments, how-
ever, were taken by many prominent researchers to support the “empiricist”
approach on several counts. Two are reasonably nontechnical and worth
mentioning here. First, various experiments were thought to demonstrate the
importance of learning in distance perception, hence challenging innateness
claims. Second, locating in retinal disparity an external physical base for the
fullness, or three-dimensionality, of our visual phenomena meant that it was
that much more reasonable to explain depth perception as dependent on sen-
sory apprehension. It was that much less plausible to assume that spatiality
was a nonsensory imposition of mind. The discovery of the stereoscope
“made the dogma of an innate intuition of space—of space as an inner con-
dition of all experience—less likely than ever before.”15
This is not to say that everyone in the “non-Kantian” camp agreed with
Berkeley that visual experience itself provided no basis for our spatial frame-
work. For example, Ewald Hering, Carl Stumpf, and William James agreed
with Berkeley that our idea of space is not an a priori imposition of mind, but
they rejected the claim that visual experience could play no role in the con-
struction of our spatial ideas. Most radically, James argued that all of our sen-
sations, including odor, taste, and sound, have a voluminous quality that can
serve as a basis for building our conception of space. Still, for James, as well as
most other theorists, distance is not a simple or immediate quality of visual
sensations. James’s claim is only that we can use this sensed voluminousness,
in conjunction with the variations in experience of objects as we move about,
to construct a visual idea of metric space. Moreover, for many researchers the
stereoscope experiments were seen to support Berkeley’s thesis about the
relevance of movement to our idea of space. For the experiments counted
against the view that binocular vision was special or peculiarly different from
monocular vision, where the importance of motion and touch was widely
taken for granted. “There can be no doubt that the fusion of the two visual im-
ages is the result of an act of mental association . . . [and that as is the case with
monocular vision] . . . in the binocular idea of depth it is sensations of movement
which furnish our primary measure of spatial distance.”16 Or as Hermann
von Helmholtz saw matters, “The invention of the stereoscope . . . made the
difficulty of the Innate Theory more obvious than before and led to another
solution which approached much nearer to the older view. . . . This assumes
that none of our sensations give us anything more than ‘signs’ for the exter-
nal objects and movements, and that we can only learn how to interpret
these signs by means of experience and practice.”17 If historically the inven-
tion of the stereoscope is not taken to refute Berkeley’s claims about the non-
immediacy and imperceptibility of distance by vision, consideration of a
related issue can enhance our appreciation of Berkeley’s views concerning the
importance of our ideas of movement. The point here is that retinal disparity,
by itself, cannot provide information about the absolute distance of an object
from a viewer, nor can it, independent of such information, provide a mea-
sure of the absolute depth between two objects. The reason is that the amount
of disparity is a function of both the depth relations and the absolute distance.
Two objects close to each other in depth but near the viewer may project the
same disparity as two objects widely separated but further away. Disparity
measures may serve to recover absolute spatial depth only when conjoined
with a means of measuring absolute distance to scale the significance of
the disparity.
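The geometry behind this point can be sketched numerically. What follows is only an illustration under stated assumptions, not a model of visual processing: both objects are taken to lie straight ahead of the viewer, the eye separation of 0.065 m is an assumed typical value, and the function names are hypothetical. Relative disparity is computed as the difference between the angles formed by the two eyes' lines of sight to each object.

```python
import math

EYE_SEPARATION_M = 0.065  # assumed typical interocular distance, in meters


def vergence_angle(distance_m, eye_separation_m=EYE_SEPARATION_M):
    """Angle (radians) between the two lines of sight to a point straight ahead."""
    return 2.0 * math.atan(eye_separation_m / (2.0 * distance_m))


def relative_disparity(near_m, far_m, eye_separation_m=EYE_SEPARATION_M):
    """Disparity (radians) between two points at different distances."""
    return (vergence_angle(near_m, eye_separation_m)
            - vergence_angle(far_m, eye_separation_m))


def far_distance_matching(near_m, disparity, eye_separation_m=EYE_SEPARATION_M):
    """Distance of a second point that yields `disparity` relative to `near_m`."""
    remaining = vergence_angle(near_m, eye_separation_m) - disparity
    if remaining <= 0.0:
        raise ValueError("no finite distance produces this disparity")
    return eye_separation_m / (2.0 * math.tan(remaining / 2.0))


# Two objects 2 cm apart in depth, half a meter from the viewer.
close_pair = relative_disparity(0.50, 0.52)
# The same disparity for a pair whose nearer member is 2 m away.
matching_far = far_distance_matching(2.0, close_pair)
print(f"disparity: {close_pair:.5f} rad")
print(f"depth gap at 0.5 m: 0.02 m; depth gap at 2 m: {matching_far - 2.0:.2f} m")
```

On these assumptions one and the same disparity value corresponds to very different depth separations, depending on how far the pair is from the viewer. The disparity must therefore be scaled by an independent estimate of absolute distance before it can yield a measure of depth.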
The geometrical features of the projection of light that prevent disparity
from providing independent information of absolute distance are not unique
to this cue. It has long been recognized that the pictorial cues cannot indicate
absolute spatial measures. This result is just the other side of the geometrical
considerations that underwrite the one-point argument. Of the traditional
cues only the nonvisual motor cues of convergence and accommodation
might seem to vary directly and unambiguously with distance. Given the
goal of accounting for how we locate objects in space, it is not surprising that
Berkeley attached special prominence to these cues.
Still, in order to evaluate absolute distance it is not enough to have a cue K
that varies directly and unambiguously with distance. In addition, we need a
scheme for assigning absolute-distance meaning to the values of K. We must
know how much distance goes with so much K. I think that an appreciation
of this problem plays an important role in Berkeley’s insistence on the need
for a scheme of visual-motor correlation. And although the issue has not re-
ceived all that much attention, the problem is a genuine one. As T.G.R. Bower,
albeit a recent critic of Berkeley, remarks, in real-life situations “to know
how far away an object is from us . . . the expression of how far must serve to
control behavior. . . . The term absolute distance serves as shorthand for ‘spa-
tial variables translated into a form appropriate for the control of spatial mo-
tor movements’.”18
Just how vision might come to provide such information, Bower argues, is
a difficult problem. Convergence, for example, varies with distance, but since
the distance between our eyes changes as we grow, the same convergence
angle will reflect different distances as we get older. In what way, then, might
convergence be calibrated so as to provide accurate distance information?
One theory of calibration that has gained some currency proposes that such
scaling results from correlating visual cues with movement.
Suppose that you are at some distance D from an object and then take a step toward it
so that the distance is reduced by the length ∆ of one step. . . . If the visual angle [a measure
of the size of the retinal image] prior to the step is α₁, [and] after the step . . . α₂ . . .
[i]t can be shown that α₂/α₁ = D/(D – ∆). Now, suppose that you register your own locomotion
in terms of an internal unit corresponding to the size of your pace [and] ∆
represents one unit of locomotion. . . . It follows that D = 1/(1 – α₁/α₂) [paces]. [By
applying this calibration scheme,] distance to the object, expressed in terms of units of
locomotion, can be derived from the ratios of angular sizes of an object seen at two dif-
ferent distances. . . . merely by taking a step toward an unfamiliar object, it is possible
to compute the approximate number of paces that you would need to take in order to
reach the object.19
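Kaufman's relation can be checked with a short numerical sketch. It is offered as arithmetic only, under assumptions that are mine and not Kaufman's: the object's size is expressed in paces, the step is exactly one pace, and the function names are hypothetical. The relation is exact when the visual angle is inversely proportional to distance, and approximate when the angle is computed exactly.

```python
import math


def visual_angle(size, distance):
    """Visual angle (radians) subtended by an object of given size and distance."""
    return 2.0 * math.atan(size / (2.0 * distance))


def distance_in_paces(angle_before, angle_after):
    """Kaufman's relation D = 1 / (1 - a1/a2), with one pace as the unit step."""
    ratio = angle_before / angle_after
    if not 0.0 < ratio < 1.0:
        raise ValueError("the angle must grow after a step toward the object")
    return 1.0 / (1.0 - ratio)


# Hypothetical case: an object half a pace wide, initially ten paces away.
size, start = 0.5, 10.0
a1 = visual_angle(size, start)        # angle before the step
a2 = visual_angle(size, start - 1.0)  # angle after one pace toward the object
print(f"estimated distance: {distance_in_paces(a1, a2):.2f} paces")
```

The estimate uses only the two registered angles and the unit of locomotion. Neither the object's size nor its distance in any external unit enters the computation, which is why the scheme yields distance scaled in paces.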
Now, although Berkeley might have qualms about taking Kaufman’s equations to
describe actual mental computations, the importance of this sort of motor
scaling seems to me to lie at the heart of Berkeley’s stress on the tangible nature
of our distance perceptions. It is not just that behavior provides the ultimate
test of distance perception, as the Behaviorists might claim. For Berkeley, and
on Kaufman’s model, visual experience gains its distance significance via a
scheme of motor calibration. And as Kaufman says, echoing Berkeley, “[I]f
perceptual space . . . is scaled in terms of locomotion . . . [t]his has profound
implications for any theory of perception . . . [especially] how the senses work
together.”20 Berkeley’s views about the interrelations between the senses, how-
ever, are a story for another occasion. *[See chapter 5.]
Notes
This essay is excerpted from a much longer one on Berkeley’s views on distance per-
ception, which, in turn, constitutes the first chapter of my book Vision: Variations on
Some Berkeleian Themes (Oxford: Basil Blackwell, 1994). Phillip Cummins commented
on this essay at the University of Western Ontario’s conference on Berkeley’s Meta-
physics. I hope I have answered some of his questions in my book.
1. “Berkeley’s Theory of the Immediate Objects of Vision,” in Studies in Perception,
ed. Peter Machamer and Robert Turnbull (Columbus: Ohio State University Press,
1978), p. 332.
2. Perception (Englewood Cliffs, N.J.: Prentice-Hall, 1965), p. 43.
3. Berkeley (London: Routledge & Kegan Paul, 1977), p. 97.
4. Treatise on Physiological Optics, vol. 3, ed. James Southall (New York: Dover,
1950), p. 6.
5. Human Knowledge: Its Scope and Limits (New York: Simon & Schuster, 1964), p. 51.
6. “The Question of Visual Perception in Germany, I,” Mind 9 (1878), p. 1.
7. For an account of many of these, see Julian Hochberg, “Perception, I and II,” in
Woodworth and Schlosberg’s Experimental Psychology, ed. J. Kling and L. Riggs (New York:
Holt, Rinehart & Winston, 1971), pp. 395–550.
8. “Three Kinds of Distance That Can Be Seen or How Bishop Berkeley Went Wrong,”
in Studies in Perception: Festschrift for Fabio Metelli, ed. G. Flores D’Arcais (Milan:
Martello-Giunti, 1976), p. 83. It was Gibson’s own work that did much to challenge the
paradigm and assumptions underlying the traditional claim that distance perception
is not immediate.
9. Berkeley’s Theory of Vision (Melbourne: Melbourne University Press, 1960), p. 6.
10. Ibid., p. 5.
11. In other sections of the New Theory, Berkeley argues that the same holds for size,
shape, direction, and orientation. His claims in these cases, however, do not depend on
the one-point argument in the way his distance thesis does.
12. Berkeley himself does not deal with this problem in NTV.
13. Physiological Optics, vol. 3, p. 499. Again, it was not assumed that the average per-
son was aware of or could report on the intermediate sensations. *[More recent studies
demonstrating stereoscopic effects with pairs of “random dot” displays are a challenge
to these sorts of theories, because the forms seen with the stereoscope are not perceived
when either member of a pair is viewed by itself.]
14. William James, The Principles of Psychology, vol. 2 (New York: Dover, 1950), p. 252.
Whether James and other perceptual psychologists who cite or appeal to Kant correctly
understood the implications of Kant’s position for empirical theories of vision is a real
question. See Gary Hatfield, The Natural and the Normative (Cambridge: MIT Press,
1990), esp. chap. 3, for the claim that many theorists misunderstood the empirical
implications of Kant’s ideas. Hatfield further argues that Kant’s empirical claims about
vision and touch are much like Berkeley’s: “[Kant] makes vision depend upon touch
for its ability to perceive objects in depth, thereby implying the standard Berkeleian
account” (p. 105).
15. James J. Gibson, The Perception of the Visual World (Boston: Houghton Mifflin,
1950), p. 21.
16. Wilhelm Wundt, Lectures on Human and Animal Psychology, trans. J. E. Creighton
and E. B. Titchener (New York: Macmillan, 1896), p. 189.
17. “The Recent Progress of the Theory of Vision,” in Helmholtz on Perception, ed. R. Warren
and R. Warren (New York: Wiley, 1968), p. 110.
18. Development in Infancy (San Francisco: W. H. Freeman, 1974), pp. 75–76.
19. Lloyd Kaufman, Perception: The World Transformed (Oxford: Oxford University
Press, 1979), p. 224 ff.
20. Ibid., p. 226.

Prescript 2
Chapter 2 is excerpted from the first few pages of chapter 2 of VVBT. That
chapter discusses Berkeley’s account of size perception and his criticism of
the “taking account of distance” (TAD) model. According to this model, the
visual system computes physical size by means of a geometrical formula that
relates a measure of the magnitude of the retinal image to a measure of the per-
ceived distance to the object. L. Kaufman and I. Rock are important modern
proponents of the TAD model. In their influential paper on the moon illusion
they claim to refute Berkeley’s account. This selection contains a brief re-
sponse and defense of Berkeley.
Later in VVBT’s chapter 2, little-recognized problems with the geometric
assumptions underlying the TAD account put in doubt current versions of
the model. (For issues related to this critique, see Ross, H. & Plug, C., 2002, The
Mystery of the Moon Illusion: Exploring Size Perception. Oxford: Oxford Uni-
versity Press.) In chapter 7 of this volume, comparable concerns provoke re-
thinking the proper understanding of “occlusion” as a cue to depth.
2 Size
In sections 52–87 of New Theory Berkeley considers the question of size per-
ception. “[H]ow is it,” he asks, “that we perceive by sight the magnitude of
objects?”1 Although these sections raise important issues for the theory of
vision, they have received comparatively little examination.2 In part, this is
due to the fact that many commentators assume that the significant philo-
sophical points have already been raised in Berkeley’s discussion of distance
and that nothing new is to be found in these sections. In part, it is also due to a
lack of appreciation of major aspects of Berkeley’s theoretical and empirical
claims and how they fit in with early and current work on size perception.
Some of the more recent neglect of Berkeley’s position, I think, may be traced
to a very popular paper by Lloyd Kaufman and Irvin Rock which appeared in
Scientific American.3 In this paper, Kaufman and Rock claimed to have refuted
Berkeley’s own account of the moon (size) illusion, while showing that the
taking-account-of-distance model (hereafter the TAD model) of size percep-
tion, which Berkeley opposed, is really the correct theory.4 The Kaufman and
Rock paper, however, can prove misleading on a few points. It does not take
into consideration Berkeley’s main criticism of the TAD model; nor does it
deal with one of the problems which Berkeley thought his own account could
solve better than the competing TAD theory.
What is “the” problem of size perception? The basic issue confronting the-
ories of size perception has continued to be conceptualized along much the
same lines as it was in Berkeley’s day.5 While the real, or physical, size of an ob-
ject is independent of its distance from an observer, the size of the image that
the object casts on the retina varies with the distance. Figure 2.1 sets out the
problem as it is typically presented in psychological works on size perception.
When an object of constant size h is moved further from the eye, its retinal
image decreases in size. The angle α which the object subtends, the visual
angle, is directly correlated with the image size. It is usual practice to talk
about the extent of the retinal image in terms of the size of the corresponding
visual angle.
Figure 2.1
The size of the visual angle, α, of an object of size h varies with the distance d of the
object from the observer. *[With some simplifying assumptions; h = α × d.]
The problem of size perception, then, is that of explaining our ability to
evaluate magnitude in light of the variability in the size of the visual angles
an object can subtend. Since it was widely assumed that the amount of our
sensed visual field (or, in Berkeley’s terminology, the number of minima visi-
bilia sensed) depends on the extent of the retina stimulated, our immediate
experiences of an object will vary when it is at different distances from us. A
nearby tower will occupy a large portion of our visual field, while the same
tower, viewed from half a mile away, will appear as a speck. Our everyday idea
of an object’s (constant) physical size cannot be identified with each of the
distinct visual ideas we immediately experience when viewing the object from
a variety of distances. Size perception involves a two-step mental process:
our immediate sensation, a function of the amount of the retina stimulated,
and our idea of a constant physical size that this sensation helps to trigger.
According to Berkeley, there is, moreover, no one visual experience that can
be singled out as the correct or veridical visual idea that goes with a given
spatial size.6
By what means, then, are the magnitudes of objects perceived by sight? For
Berkeley, visual extent and familiarity play a role, along with most of the
visual and oculomotor cues cited earlier in his account of the perception of
distance. We have learned to correlate these cues with “real” or tangible mag-
nitude. What is especially important about Berkeley’s model, however, is the
way(s) in which it differs from that of the optic writers. The optic writers, too,
held that size perception was not immediate; but they championed a version
of the TAD model of size evaluation. According to this theory, we perceive size
on the basis of an initial or prior evaluation of distance. Given an apprecia-
tion of the visual angle and knowledge of the object’s distance, we can geo-
metrically compute its magnitude. *[According to the TAD model, the visual
system determines/registers the values for α and d, and on the basis of those
measures computes the size, h.]
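The computation the TAD model attributes to the visual system can be written out in a few lines. The sketch below is illustrative only: it adopts the simplifying relation h = α × d from the caption to Figure 2.1 (with α in radians), the tower and its dimensions are hypothetical, and nothing here is meant to settle whether vision actually proceeds this way, which is exactly what Berkeley disputes.

```python
def visual_angle(size, distance):
    """Small-angle approximation of the angle (radians) an object subtends."""
    return size / distance


def tad_size(visual_angle_rad, distance):
    """Size h computed from a registered angle and distance: h = alpha * d."""
    return visual_angle_rad * distance


tower_height = 30.0  # hypothetical tower, in meters
for d in (50.0, 800.0):
    alpha = visual_angle(tower_height, d)
    h = tad_size(alpha, d)
    print(f"d = {d:6.1f} m   alpha = {alpha:.4f} rad   computed h = {h:.1f} m")
```

The visual angle shrinks as the tower recedes, yet the computed size stays constant so long as the distance is registered correctly. On the TAD account, then, any error in the prior evaluation of distance carries over directly into the evaluation of size.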
Berkeley agrees with the optic writers that visual size perception is not im-
mediate, but he denies that it involves an initial determination of distance
and subsequent computation of magnitude based on this idea of distance.
Berkeley offers several reasons for rejecting the TAD model. First, he thinks
that introspection does not reveal the existence of processes of calculation
involving angles and distances. Allowing, however, for the vagaries of intro-
spection, this does not clinch the argument for Berkeley. Second, Berkeley
claims that the TAD model cannot account for certain empirical data as well
as his theory can. He spends a large part of sections 52–87 elaborating this
criticism. In particular, he believes that his own explanation of the moon
illusion, one of the most discussed puzzles in vision theory, is better than any-
thing the optic writers have to offer.
I mentioned earlier that Kaufman and Rock claim to have refuted both
Berkeley’s account of the moon illusion and his critique of the TAD model.
Berkeley had maintained that a primary reason for the moon illusion is the
presence of atmospheric vapor, or mist, between the observer and the moon
when the moon is on the horizon. It is the presence of these vapors, not
simply the presence of the terrain, that causes us to see the moon as larger
on the horizon.7 Kaufman and Rock claim that their experiments show that
Berkeley was wrong about the significance of mist and wrong in denying
the importance of the information that the terrain provides when looking
at the horizon moon. Two points missing from Kaufman and Rock’s article
render their remarks about Berkeley somewhat misleading. A major reason for
Berkeley emphasizing the role of mist was his concern to explain the differ-
ences in perceived size when viewing the horizon moon on separate occa-
sions. This is an issue that Kaufman and Rock do not really address. Clearly,
citing the presence of terrain cannot serve to distinguish these cases. Berke-
ley’s deeper complaint against the TAD model, though, was not over which
cues are the most prominent; rather, it was over the model’s account of the
processing that underlies size perception. Berkeley rejected the claim that
size perception depends on the prior evaluation of distance. He did not claim
that the standard “distance” cues do not play a role in the perception of mag-
nitude. On his own theory they do. What he challenged was the appropriate-
ness of labeling these cues “distance” cues, as opposed to calling them “size”
cues. According to Berkeley, the cues serve both functions, and they suggest
magnitude and distance evaluations in the same way. This is not merely a ter-
minological quibble. It marks Berkeley’s rejection of the TAD model’s pro-
posal regarding the processing steps that the visual system actually goes
through in determining size. It is to deny the “psychological reality” of a pro-
cessing stage that incorporates an explicit representation of distance and the
use of this measure to then compute magnitude.
Curiously, Kaufman and Rock point out a difficulty with their own theory
that may be seen to favor Berkeley’s approach. On their TAD account of the
moon illusion, the reason that the moon is said to look bigger on the horizon
is that it is mistakenly perceived to be further away than when it is up above.
Plugging this larger distance value into the formula we use to compute mag-
nitude yields a larger size evaluation for the horizon moon. A major problem
with this explanation, however, is that, if asked to judge the distance of the
moon, people tend to maintain that the moon is further away at its zenith
than it is on the horizon. Quite understandably, many theorists have taken
such distance evaluations to refute the TAD model of the moon illusion. Kauf-
man and Rock attempt to deal with this seeming contradiction to their the-
ory by arguing that although people do make these distance judgments,
these are not the judgments that the visual system relies on in making size
determinations. Such conscious distance judgments depend on an added bit
of “intellectual” reasoning, over and above the initial verdict that the visual
system itself supplies. Kaufman and Rock claim that our visual system really
does see the moon as further away on the horizon than when it is up above,
and that these distance evaluations are fed into the mechanisms of size per-
ception. The difference between these initial distance measures is what ac-
counts for the size illusion. Kaufman and Rock argue, however, that people
then go on to “reason” that since the moon looks bigger on the horizon, it
must be closer. It is such rationalizations that subjects report.8
In later works, Rock elaborates his own version of this position.9 He main-
tains that what gets used in size perception calculation is not the intellectu-
ally influenced distance value, but what he calls the “registered distance.”
Rock waffles somewhat when it comes to spelling out what registered dis-
tance amounts to. On one reading, it is an unconscious representation of a
specific distance value. Often, though, he talks as if what are registered are
only the (distance) cues themselves, and that they directly influence size. But
if it is registered cues about distance, not a distance value itself, that play a
role, it would seem that Rock has gone a long way towards accepting one of
Berkeley’s central criticisms of the TAD model.
Notes
1. Berkeley, New Theory, sect. 52.
2. Margaret Atherton’s Berkeley’s Revolution in Vision (Cornell University Press, Ithaca,
N.Y., 1990) does discuss Berkeley’s views about size in some detail. My goal in this chapter
is less historical exegesis than the exploration of some problems concerning size
perception that are raised by a consideration of Berkeley’s ideas.
3. Lloyd Kaufman and Irvin Rock, “The Moon Illusion,” Scientific American, 207 (1962),
pp. 120–31.
4. Ptolemy is often cited as the TAD model’s first proponent, and Helmholtz as its major
modern champion. Both these historical claims have been questioned.
5. The more recent, alternative, Gibsonian perspective is discussed in chapter 4 of VVTB.
6. See Irvin Rock, An Introduction to Perception (Macmillan, New York, 1975), pp. 71–3,
for some interesting remarks on this matter.
7. Berkeley also points out that posture and angle of regard play a role. Angle-of-regard
theories have been and continue to be among the more popular explanations of the il-
lusion. Berkeley also allows that we ordinarily spend most of our time looking at ob-
jects situated on the ground and in the presence of other things. This too, he says, can
explain why the moon appears differently on the horizon than on the meridian.
8. For an update on where things stand concerning the moon illusion in general, as
well as discussion of the Kaufman and Rock solution, see Maurice Hershenson (ed.),
The Moon Illusion (Lawrence Erlbaum, Hillsdale, N.J., 1989).
9. See, e.g., Rock, Introduction to Perception, pp. 34ff.

Prescript 3
In contrast to the preceding selection, chapter 3 explores Berkeley’s account
of phenomenal magnitude and its measurement. At the heart of Berkeley’s
analysis of size perception is his notion of “minimum sensibile.” Most commentators
find Berkeley’s writings on “minima” baffling, and where they do
claim to eke out sense from his proposals, they conclude he is hopelessly con-
fused or mistaken. I think the failure lies more with the critics than with
Berkeley, and this paper attempts to show why. Berkeley’s position is far less
puzzling when viewed from the vantage point of some modern approaches to
the study of phenomenal sense orders. Many of the supposed paradoxes and
inconsistencies said to be found in Berkeley’s writings can be explained, if not
explained away.
3 Making Maximum Sense of “Minimum Sensibile”*
A proper understanding of Berkeley’s notion of “minimum sensibile” is
much in dispute. This is not surprising, since Berkeley offers little in the
way of explanation of his conception. As Luce remarks regarding Berkeley’s
notes on minima sensibilia in Philosophical Commentaries, “Berkeley raises
several curious questions about them, showing himself convinced of their
existence, but not clear about their nature” (p. 141). Although minima sensi-
bilia play a significant role in Berkeley’s theory of vision, scholars vary in the
importance they attribute to them in his overall philosophy. Some pay min-
ima sensibilia minimal attention. Others see them as basic to Berkeley’s treat-
ment of geometry and physical space, and as crucial underpinnings of his
“esse is percipi” and Idealist doctrines. (See Moked 1988; Jesseph 1993.)
My goal in this paper is limited. I wish to elucidate and clarify the nature of
Berkeley’s minima sensibilia; in particular, the minima visibilia Berkeley ap-
peals to in his works on vision. I will not here explore the problems with or
implications my interpretation may have for Berkeley’s more metaphysical
and epistemological theses. Nor do I wish to claim that everything I say about
minima is explicit in Berkeley’s writings. For I agree with Luce that Berkeley was
not “clear about their nature,” and Berkeley’s own attempts to answer his “cu-
rious questions” are not particularly informative and may be inconsistent.
Berkeley, I believe, was struggling to forge a concept of “minimum sensi-
bile” out of contemporary psychological and mathematical ideas. Convinced
of their existence, he appreciated as well the problems to be faced. Yet he
did not have available the theoretical and formal tools needed to solve them.
Matters began to take firmer shape with the development of the field of sen-
sory psychology and with the embedding of problems about sensibilia in
the context of these studies. Morever, adequate treatment of some of Berke-
ley’s questions require the technical apparatus found only much later in
constructivist phenomenalist systems of people like Carnap (1928) and

Goodman (1977).
In his notes on the Philosophical Commentaries (especially 58 and 59), Luce
cites Berkeley’s main entries on minima sensibilia and raises a number of
puzzles with Berkeley’s views. I will use Luce’s statement of the issues as a
scaffold for my own explication.1 I think it helpful at first, though, to set
Berkeley’s ideas about minima sensibilia in the context of some of the later
developments and refinements mentioned above. Then, on the basis of this
account, I will try to come to terms with, if not answer, various of Berkeley’s
curious questions.
Although Berkeley pays almost exclusive attention to visibilia and tangi-
bilia, our olfactory, taste, and auditory systems also have their own peculiar
experiential or sensible qualities. The nature and order of these sensory qual-
ities differ from one sense domain to another. The empirical study of these
aspects of sensory experience, stripped of many of the philosophical worries
about qualia and sense data, finds a home in work on sensory ordering. In the
case of hearing, for example, the sensibilia are experienced sounds. These
sounds can be ordered according to their phenomenal likeness. On various
accounts, the order of audibilia is two-dimensional, the dimensions being
experienced pitch and loudness.
In vision, the sensibilia are usually taken to be color experiences, and these
qualities too can be ordered according to their phenomenal likeness. The color
order is standardly said to be three-dimensional, every color characterized
by a unique triple of hue, saturation, and brightness. Visibilia may also be
described in terms of phenomenal location. If we move our eyes with respect
to a physically fixed object, we experience the object at different places in our
visual field. Alternatively, by making compensating shifts of gaze, we can
experience a physically moving object at the same phenomenal place. These
phenomenal places can be ordered with respect to their visual field locations.
For each sensory system, determining the properties and orders of its sensibilia
is an empirical matter, to be studied in the context of the relevant psychology
and measurement theory. Comparative judgements of subjective
experience usually provide the data for determining these phenomenal or-
ders. With color, subjects are typically presented with color pairs and asked to
compare them. In some paradigms, subjects are simply required to indicate
if the colors match. In others, subjects are required to rate how similar the col-
ors appear in hue, saturation, or brightness. In many systems, sensibilia that
match are considered identical in color. In other systems, matching does not
entail phenomenal identity. For example, it often happens that the pair A, B
match and the pair B, C match, but when presented together A is phenome-
nally distinguishable from C in color. Given this intransitivity of matching
judgments, it is possible to treat A and B as different phenomenal colors, even
though when compared directly they can not be told apart. Matching and
similarity judgements of various kinds also provide data for determining the
phenomenal place order of visibilia. The construction of both color and place
orders depends not only on subjective judgements, but on assumptions about
quality identity and the mathematical mapping conventions employed.
There are advantages and disadvantages associated with adopting these al-
ternative approaches, and the orders derived from them may differ in signifi-
cant ways.
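The intransitivity of matching can be illustrated with a toy model. The sketch below assumes, purely for illustration, that each color sample has a position on a single underlying scale and that two samples match whenever their positions differ by less than a fixed threshold. Neither the scale nor the threshold value comes from the experimental literature.

```python
def matches(x: float, y: float, threshold: float = 1.0) -> bool:
    """Two samples match when they differ by less than the threshold."""
    return abs(x - y) < threshold

# Hypothetical positions of three color samples on the assumed scale.
A, B, C = 0.0, 0.6, 1.2

ab = matches(A, B)  # A compared directly with B
bc = matches(B, C)  # B compared directly with C
ac = matches(A, C)  # A compared directly with C
print(ab, bc, ac)
```

A system that equates matching with identity must treat A and B as one color. A system that separates the two can count them as distinct, since B matches C while A does not.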
On the basis of such orders, other concepts can be defined. For example,
two visibilia will be just noticeably different in color if no other color comes
between them in the order. Two visual field places will be minimally different
if there is no other place between them in the order. These orders also provide
a means for measuring likeness of colors or places. The degree of similarity
may be calculated in terms of the minimal path separating them in the order.
In systems where matching is distinguished from identity, colors or places
that match can have other colors or places lying between them in the order.
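These definitions can be made concrete with a small sketch. It assumes, hypothetically, that a sensory order is given as a table listing each item's immediate neighbors; the five-item linear order and the labels c0 to c4 are invented for illustration. Degree of similarity is then read off as the number of steps on the shortest path between two items.

```python
from collections import deque

def path_length(order: dict, start: str, goal: str):
    """Fewest steps from start to goal in a neighbor table (breadth-first search)."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        item, steps = queue.popleft()
        if item == goal:
            return steps
        for nxt in order[item]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, steps + 1))
    return None  # the two items are not connected in the order

# Hypothetical linear order: c0 - c1 - c2 - c3 - c4
order = {
    f"c{i}": [f"c{j}" for j in (i - 1, i + 1) if 0 <= j <= 4]
    for i in range(5)
}

# Items one step apart have nothing between them in the order.
print(path_length(order, "c0", "c1"), path_length(order, "c0", "c4"))
```

A neighbor table rather than a simple list is used so that the same function would also serve for orders of more than one dimension.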
Relative to decisions about identity and individuation, questions can be
raised about the number of items in a sensory order. Consider, again, the case
of color. The properties of surfaces or lights that go to determine their physi-
cal color vary continuously, and so the number of physical colors is often said
to be infinite. It is, however, typically assumed that only a finite number of
these physical differences will be discernable in experience. Others will fall
below the threshold detectable by means of the matching paradigm. Such
limitations on human color discrimination provide a basis for the claim that
the number of phenomenally distinct colors is finite. The situation is similar
with respect to visual field places. Although the places in the physical world
may form a continuum, the visual field places need not. On the assumption
that human sensory discrimination is limited, there may be only a finite
number of distinguishable phenomenal places.
Relative to a system of analysis, it is possible to measure visual field magni-
tudes as well. Assorted metrics can be used. One option is to take the entire vi-
sual field as the standard unit and measure phenomenal size as a percentage of
the whole. Another option is to take as the unit of measurement visual places
containing no other phenomenal place as part. These “atomic” places may be
considered the “minima visibilia” of the order. And phenomenal size can then
be specified in terms of the number of minimal places a visibile contains.2
All this is admittedly sketchy, and serious conceptual and technical prob-
lems remain. My aim, so far, has been to offer a framework for locating and
better understanding the issues Berkeley faced. In the following sections, I
will fill in more of the details.
Berkeley’s Minima Visibilia
In Philosophical Commentaries Berkeley characterizes minima visibilia (MV) as the “simplest,
constituent parts or elements” [70] of visual extension, “wherein there are
not contain’d distinguishable sensible parts” [439]. For Berkeley, MV play an
important role in describing visual phenomena. In particular, they serve as a
unit for measuring visual field magnitudes and visual field distances [256,
258, 469, 475]. “[D]istance signifies the number of intermediate ideas” in an
order [447]. Berkeley does recognize other metrics can be used, and in various
places mentions measuring phenomenal magnitudes as proportions of the
entire visual field, or as a proportion of the MV in the field [204, 213, 219].
Luce believes entries [175] and [296] indicate a commonality between
Berkeley’s MV and Locke’s sensible point (II, xv, 9) “which is ordinarily about
a minute, and to the sharpest eyes seldom less than thirty seconds, of a circle
whereof the eye is the centre.” Locke, however, says in this sentence that he is
characterizing, “a sensible Point, meaning thereby the least Particle of Matter
or Space we can discern.” Berkeley’s MV—whatever Locke had in mind—can-
not be defined as the smallest amount of physical matter or space we can see.
Strictly speaking, this makes no sense. Physical or tangible objects can not be
smaller, larger, or equal in size to MV. No number of MV is assignable to an
inch of space [87] or to any other physical object [256, 325]. The amount of
visual field an inch-long object occupies is a function of its orientation and
distance from the observer. Up close, it may fill the entire field; tilted from the
perpendicular it will occupy fewer MV than when straight up. Viewed from
a distance, it may have no experiential visual presence. Likewise, a speck of
sand, invisible to the naked eye, may occupy most of the phenomenal field
when looked at through a microscope.
The 30 seconds mentioned in [175] and [296] are a visual angle measure of
the image size an object projects from a specific distance and orientation.
Berkeley is best understood as agreeing that in humans “with the sharpest
eyes” a 30-second image may typically be the minimum needed to give rise to
visual experience. The image threshold will be larger for those with less acute
vision. As is clear from [218, 296], technically the threshold for experience is
not to be specified in terms of projected image size, but in terms of the mini-
mal extent of the retina that must be stimulated for perception to occur. The
same sized image will project to more or less of the retina depending on fo-
cusing features and the conformation of the eye. And Berkeley raises the issue
whether these may change with the distance the object is from the eye [296].
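The dependence of visual angle on distance can be sketched numerically. The code assumes an object viewed head-on (perpendicular to the line of sight and centered on it), so that the angle subtended is 2 · atan(size / (2 · distance)). The function names and the one-inch example are illustrative assumptions, not calculations found in Berkeley or Locke.

```python
import math

ARCSEC_PER_RADIAN = 180 * 3600 / math.pi

def visual_angle_arcsec(size: float, distance: float) -> float:
    """Angle subtended by an object of the given size, in seconds of arc."""
    return 2 * math.atan(size / (2 * distance)) * ARCSEC_PER_RADIAN

def distance_for_angle(size: float, angle_arcsec: float) -> float:
    """Distance at which the object subtends exactly the given angle."""
    half_angle = (angle_arcsec / ARCSEC_PER_RADIAN) / 2
    return size / (2 * math.tan(half_angle))

inch = 0.0254  # one inch expressed in metres
limit = distance_for_angle(inch, 30.0)
print(limit, visual_angle_arcsec(inch, 1.0))
```

Under these assumptions, limit is the distance beyond which a one-inch object subtends less than 30 seconds of arc. Nothing here fixes how many MV the object occupies; the sketch concerns only the projected image.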
In [321] Berkeley asks why a minimum is difficult to imagine, and he an-
swers “because we are not us’d to take notice of ’em singly.” Nothing in a vi-
sual experience itself serves to delineate one MV from another within the
field. MV do not come marked with visible borders, nor are visible places ex-
perienced as having gaps between them. And in general there is no need to at-
tend to them individually, since “they not being able singly to pleasure or
hurt us thereby to deserve our regard” [321]. Berkeley’s definition of MV as ba-
sic perceptual or phenomenal elements is compatible with it being difficult,
if not empirically impossible, to have a visual experience of a single, isolated
visible place.3
MV, then, are best thought of as units of measure, developed for the pur-
poses of describing and ordering sensory phenomena in the visual domain.
Berkeley claims throughout the Philosophical Commentaries [343, 346, 438–439,
462–464, 510] that they are indivisible. This is not an empirical discovery;
rather it is built into the way the notion of a MV is specified in his system. By
definition, MV are the simplest place elements; they have no constituent
parts. A visual field extent composed of more than one phenomenal place is
not a MV. For Berkeley, too, our sensory systems are finite. Since there are lim-
its to the number of phenomenal places it is possible to distinguish in experi-
ence, there must be only a finite number of visible places.
MV are to be contrasted with the mathematical points found in geometry
[253, 344–345]. A phenomenal line is not infinitely divisible. As opposed to
the points on a mathematical line, there are only a finite number of MV on a
phenomenal line. Luce remarks (p. 140) that this aspect of Berkeley’s doctrine
“conflicts seriously . . . with the traditional geometry.” The claim that the
finite discrete geometry Berkeley proposes clashes with Euclidean geometry is
obviously correct.4 But if the geometrical properties of visual extension are different
from those of tangible extension or physical space, the geometries best
suited to describe them may be different. These days, at least, there is no rea-
son to assume there can be only one acceptable geometry, and there is no prob-
lem in assuming that distinct geometries might apply to different spaces.5
In [66], Berkeley asks “whether MV be fix’d?” His answer in Philosophical
Commentaries is yes and no. In general, the extent of the visual field or the to-
tal number of MV it is composed of does not change “whether I look onely
in my hand, or on the open firmament” [169]. At the same time, it seems in
keeping with Berkeley’s position to allow that the visual field may contain
fewer MV when one eye is closed or there is damage to the visual system.
Berkeley does seem to hold the view that there is a correlation between the
amount of the retina stimulated and the amount of the visual field occupied
([213], [218], and [219]). Here as elsewhere in his writings on vision, Berkeley
joins many vision theorists, before and after, in accepting the so-called con-
stancy hypothesis, according to which there are a variety of correlations
between features of the retinal points stimulated and properties of the phe-
nomenal points experienced. (See Hatfield and Epstein 1979; Falkenstein
1994; Schwartz 1994; and Reading 5.)
Berkeley’s Curious Questions about MV
Q: Are they extended [273]?
A: MV are extended in the sense that each occupies a visual place.6 This does
not mean that they occupy a place or have extension in tangible or physical
space. Visibilia and tangibilia, however, are different from audibilia and other
sensibilia in their having phenomenal extension or place properties [137,
241]. “Extension seems to be a Mode of some tangible or sensible quality
according as it is seen or felt” [711]. “Several distinct Ideas can be perceiv’d
by Sight & Touch at once, not So by the other senses” [647].7 Only visibilia and
tangibilia have phenomenal place locations. Of course, the source or direc-
tion of sounds or smells may be localized in physical space, but there are no
audibile or olfactory sensory places at which they occur. Simultaneous sounds,
for example, are heard as a complex audibile, not as distinct audibilia at dif-
ferent places in a phenomenal place order [240]. Audibilia, in and of them-
selves, cannot be ordered with respect to their own phenomenal locations.
I think some of the controversy concerning the extension of MV results
from a failure to distinguish two different claims. When I maintain that a MV
has extension, I mean that it is assigned a phenomenal size of one. In turn, an
array of two MV has a size two, an array of three MV has a size three, and so on
for other arrays. Were MV to have a size zero extension, it could not be used in
this way to measure and compare phenomenal magnitudes (see Goodman
1977, p. 253).
There is, however, another way of talking about extension that differs from
this measurement analysis. On this account, nothing counts as extended un-
less it has extensional parts. The part/whole relation is constitutive of this
notion of extension, and MV do not have parts. As Berkeley says [167], “Ex-
tension, motion, time Number no simple ideas, but include succession in
them which seems to be a simple idea.” It is not possible to have an apprecia-
tion of extensional succession, however, without appropriate experience of
arrays consisting of at least two MV. Only arrays that occupy more than a
single place can have parts, and only arrays that have parts exemplify exten-
sion. The term “extension” applies to arrays that exhibit a succession of MV; it
does not apply to individual MV. So understood, a single MV is not extended.
With this distinction in mind, it is possible to eliminate a central sticking
point in the debate over whether or not MV have extension. In one sense they
do and in another they do not. MV have a unit magnitude; they all are of size
one extension. This is what allows them to serve as a measure of phenomenal
size. At the same time, a single MV is not an instance of the general idea of ex-
tension. It is not an array that has parts.8
If MV do have magnitude, there is a strong temptation to think that they
must have a shape. This has given rise to a series of seemingly unresolvable
puzzles about what that shape could possibly be.9 Inability to come up with a
satisfactory answer to this question has led some commentators to deny that
MV have extension. Others are perplexed to explain how, on the assumption
that MV have certain shapes, they can fill the entire visual field without leav-
ing gaps. This shape/gap problem is often illustrated by drawing MV as abut-
ting circles on paper and then expressing concern that parts of the paper
surface would remain uncovered, thus resulting in noticeable gaps in the
visual field.
Now Berkeley does not specifically discuss the issue of the shape of MV, and
there is reason to believe he may be on firm ground in not doing so. For the
solution to these puzzles, I think, is to deny that individual MV can have a
phenomenal shape. As critics of Berkeley have noted, talk of shape seems to
indicate the presence of distinguishable parts, but MV have no parts. In a typical
constructivist system, however, shapes are defined as patterns of MV in
the sensory order. A single MV can not be assigned any shape.10 MV do oc-
cupy places, arrays of MV have shapes, and the visual field is constituted of
MV. The visual field has no gaps, not because the shapes of MV fit together
seamlessly, but because there are no voids to be experienced between a MV
and its nearest phenomenal neighbors. The analogy of MV to circles drawn
on paper does not make sense. It is mistaken to think of MV in this way as ly-
ing atop another distinct phenomenal background surface. MV make up the
entire visual field. There are no other places found in visual extension.
Q: Are they colored? [442]
A: Clarity here first requires paying attention to the distinction between sen-
sory quality types and experienced tokens. Although esse is percipi, there
need be nothing problematic in talking of phenomenal properties that are
not presently being experienced. The visual field may contain patches of
green at one time and no green at another. Similarly, it is not necessary to al-
ways have an experience at a phenomenal place in order to postulate the
place as a location of color qualia. It may, in fact, be possible to experience a
gapless visual field at a given time, while some visual field places are not ex-
perienced at that time. For, as indicated earlier, in some systems (such as
Goodman 1977), there will be phenomenal places in the order lying between
places that match.11
Berkeley acknowledges that it may be difficult to imagine a single MV, and
imagining (as opposed to conceiving or talking of) one without color is not
possible. Furthermore, in various works Berkeley says clearly that there is no
visible extension without color and no color without extension. Experienced
visibilia, and all parts thereof, must have a color. Color may be separated
from visual extension in thought [494], but if MV are extended they cannot
possibly be colorless. Puzzlingly, though, in [489] Berkeley suggests that the
issue is empirical: “Mem. to make experiments concerning Minimums &
their colours, whether they have any or no. . . .” The rest of the sentence may
provide a clue to resolving the seeming inconsistency, or it may at least enable
us to better appreciate the problems that concerned him. For Berkeley goes
on to say that the experiments may help determine “whether they [MV] can
be of that green which seems to be compounded of yellow & blue.” Berkeley
allows that some, perhaps all [151, 721], colors and color experiences are
compounds of more basic color elements—a view that had some currency
before and after he wrote. In the case he cites, in order to experience green
it might be necessary for there to be a mix of yellow and blue MV. Together
the MV would appear green, but no single MV could be green or be experi-
enced as green on its own [502].12 Thus Berkeley’s color compounding model
leaves room to ponder whether a MV may not have the color it appears to
have. (See also [242].)
Moreover, related considerations may have given Berkeley reason to think
there could be a need to accommodate the idea that single MV would be phe-
nomenally colorless. Berkeley notes [664] “Colours are not devoid of all sort
of Composition. tho it must be granted they are not made up of distinguish-
able Ideas. . . . Men are wont to call those things compounded in which we do
not actually discover the compound ingredients. Bodies are said to be com-
pounded of Chymical Principles whch. nevertheless come not into view till
after the dissolution of the Bodies. & whc. were not could not be discerned in
the bodies whilst remaining entire.” Experiments might establish that expe-
rienced compound colors require the contribution of more than a single phe-
nomenal place. Although all MV are experienced as colored, it might be best
to think of the individual MV that constitute an experienced compound as
not actually having the elementary composing colors and thus having no
color at all. Indeed, if as Berkeley suggests, all colors are actually compounds,
it might be necessary to assume that a single MV could not be experienced to
have a color independent of the contributions of neighboring MV. Since MV
may not be singularly experienced, however, this claim is consistent with the
idea that no MV can be perceived uncolored. Likewise, it would not prevent es-
tablishing a place order, since construction of a sensory order does not rest on
comparisons and judgements of MV isolated in experience.13
Q: Could sight be enlarged by diminishing the point [175]?
A: Earlier it was mentioned that Berkeley agrees that a retinal image of 30 seconds
may be the minimal size needed to trigger a visibile. The 30 seconds are pre-
sumably the threshold for those with the sharpest eyes. His treatment seems
to allow, though, that if it took less retinal area to trigger a MV, the visual field
could contain more MV. This is what may distinguish acute and dull sight,
not a difference in the size of the MV itself as “others are apt to think” [250].
Note that this does not mean that use of a microscope diminishes the size
of MV or enlarges the visual field. A microscope alters the size of the image
projected and permits seeing smaller things. It does not change the retinal
threshold for triggering a MV. Nor does a microscope make one and the same
item appear physically bigger, since such size estimates depend on more than
visual field magnitude. (See chapter 2.) As we approach a tower, for example,
the visual image grows, yet the tower is perceived as being of a constant phys-
ical size. In a way a microscope exposes us to a different world. We may see
things we did not see before, tiny mites or gaps in a line. Could the visual field
be larger, though, if the retinal threshold for MV were less? The answer here is
yes [219]. In terms of total MV magnitude, the visual field could contain more
minimally discernable points. At the same time, the visual field will not take
in a wider span of physical space. It will only reveal the space in finer detail.
It is important to keep track of these distinctions when considering Berke-
ley’s discussions of comparative size differences of MV. By definition, MV are
least discernable places in a phenomenal order. As the basic units of measure,
all MV have measure one. So every creature’s MV are of the same phenomenal
magnitude [272, 277]. “The visible point of he who has microscopical eyes
will not be greater or less than mine” [116]. Visual systems may differ, nonethe-
less, in the extent of the physical world they can take in at a glance, in the min-
imal area of retinal stimulus capable of triggering a MV, and in the amount
of the retinal surface a visual image of a given size will occupy (with different
conformations of eye [296]).
All claims about phenomenal magnitudes and visual field sizes have to be
understood relative to the conventions of the system of measurement em-
ployed, and as previously noted Berkeley seems sensitive to the issue. Mea-
suring phenomenal magnitude, not by MV, but as a proportion of the entire
visual field, yields different answers to the same questions. If the whole visual
field serves as the metric, then, by definition, visual fields do not differ in mag-
nitude. All visual fields will have the same unit size. In such a system, too, MV
need not be assigned identical magnitudes. The MV of fields composed of dif-
ferent total numbers of MV will occupy different proportions of the entire
field. Also using this metric, loss of retinal function will not diminish visual
field size. It will instead increase the proportional phenomenal size of the
least discernable places.
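The contrast between the two conventions can be set out as a small worked comparison. This is only an illustrative sketch: the symbols N and N' are introduced here and appear nowhere in Berkeley or in the discussion above. N stands for the number of MV in a visual field, and N' < N for the number remaining after some loss of retinal function.

```latex
\begin{align*}
\text{MV as unit:} \quad & \text{size of one MV} = 1, & \text{size of the field} &= N \\
\text{whole field as unit:} \quad & \text{size of one MV} = \tfrac{1}{N}, & \text{size of the field} &= 1 \\
\text{whole field as unit, after loss:} \quad & \text{size of one MV} = \tfrac{1}{N'} > \tfrac{1}{N}, & \text{size of the field} &= 1
\end{align*}
```

On the first convention a field with fewer MV is a smaller field; on the second the field stays at unit size and each least discernable place takes up a larger share of it.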
Paradox arises when these and related distinctions are not kept in focus. We
are lured, for example, into thinking there is a real fact of the matter as to
whether the MV of a person and that of a mite have the same phenomenal
magnitude in some more absolute sense. We picture superimposing a MV
from each and then seeing if one appears to extend beyond the other. But
Berkeley argues [272], strictly speaking, this situation is not really imaginable.
If we imagine one phenomenal place as extending beyond another, the
larger cannot be a MV. Nor can any real life version of the thought experiment
actually be conducted. The difficulty is not because it requires intersubjective
comparisons of experience. The picture breaks down in attempting to com-
pare the MV of a single person. MV do not occupy physical places and MV can
not be moved from one physical place to another. Nor is there any way they
can be superimposed on one another in phenomenal space, for MV are indi-
viduated by their phenomenal place locations.
True, Berkeley says it should be possible to test for retinal sensitivity. And
the results may show the retinal thresholds of two people differ, although
in a single glance they both take in the same span of physical space. But this
would not show their MV differ in magnitude, that the MV of the person with
the more sensitive retina must be smaller. As long as phenomenal magnitude
is measured in MV, the more MV there are in the visual field, the larger the
visual field is in size. So it does not follow that the MV of the person with the
lower retinal threshold are of less phenomenal magnitude.
Similarly, it should be possible to test if a person’s retina is uniformly sensi-
tive. It could turn out that it takes a smaller area of the retina to trigger a vis-
ible in one region than in another. Still, it does not follow that the resulting
MV differ in phenomenal magnitude. They all occupy one minimal place in
the visual field and have a phenomenal size of one. Berkeley seems aware that
visual acuity is better in the fovea. And some of his discussions of the lack of
clearness and distinctness of MV, which he talks about as faults of our visual
system, could be related to issues of retinal sensitivity.
These sorts of considerations might also help explain some otherwise
puzzling remarks in the PC. In [400] Berkeley asks, “if there be not two kinds
of visible extension, one perceiv’d by a confus’d view, the other by a distinct
successive direction of the optic axis to each point . . .” Here he does not seem
to be talking about alternative units for measuring magnitude—MV are the
unit. So how could there be two different measures of a scene depending on
how it is scanned? This makes good sense, if the retina is more sensitive at
the fovea. Scanning a surface bit by bit with the most visually acute part of the
retina will produce a larger total number of MV than are contained in the vis-
ibilia the surface produces when it is taken in as a whole in one glance.14 (See
also [284].)
Q: Can superior spirits see more points [than us]?
A: The issue is raised by Berkeley in [749] and [835], and he leans toward a pos-
itive answer. (See also NTV 84.) On the account given above, at least two claims
can be distinguished. First, the spirit’s visual field could be larger or more in-
clusive in the sense that it might take in a greater span of the environment in
one view. Second, the spirit’s visive faculty could be more sensitive, respond-
ing to images of smaller size. Both of these “perfections” might result in the
spirit seeing more MV than we do. Of course, the whole idea of making
such measures with spirits is a murky business, and understandably Berkeley
does little more than speculate. And as he remarks in [410], “God knows how
far our knowledge of Intellectual beings may be enlarg’d from the principle.”
In any case, the possibility of these perfections of the visive faculty should
not be confused with the expansion of vision a microscope provides. The lat-
ter does not increase the number of MV experienced.
Two-dimensional versus Three-dimensional Order
One complaint numerous critics have with Berkeley’s account of MV is his
treating the phenomenal place order of vision as two-dimensional. It is
claimed that this position is untenable. We do not visually experience the
world as flat. In vision, we are immediately aware of depth. Hence, the phe-
nomenal field is three-dimensional, not a plane of MV.
Berkeley’s treatment of phenomenal place as forming a two-dimensional
order, however, can not be dismissed in this way. (See chapter 1 of this book
and Schwartz 1994.) The order of places in the visual field differs from the
order of places in physical space. There is no conflict in the former being two-
dimensional and the latter three. The phenomenal visual field and the phys-
ical world that vision can (mediately) bring to mind constitute two distinct
spaces. That we can tell distance by vision does not mean or imply that the
visual place order has three dimensions. After all, the fact that by hearing we
can tell the distance of sounds or by smell the direction of odors does not
show that audible and odor experiences have any phenomenal extension or
place dimensions.
In viewing a mountain through a window it is possible to distinguish the
phenomenal places from the physical places visually experienced. We see the
tip of the mountain as miles away from the top of the window frame, yet there
may be no or few discernable phenomenal places separating them in the
visual field. In the perceived physical order, the top of the window frame is
closer to the windowsill than to the tip of the mountain. In the phenomenal
order, the corresponding MV of the top of the frame are nearer those of the tip
of the mountain than those of the sill.15 And it is the phenomenal visual place order
that is claimed to be two-dimensional. This does not mean, however, that
the visual field is physically flat. The visual field has no physical spatial di-
mensions; it is not “an orb, any more than a plain” [204]. (See also NTV 158.)
That we talk of length and breadth in both visible and tangible domains
does not mean they are commensurable in these properties. As Berkeley indicates,
we talk of length in the auditory domain when measuring temporal
spread. And talk of distance, whether it be between two points in a line or,
as he says in [447], “between a slave & an Emperour, between a Peasant &
Philosopher, between a drachm & a pound, a farthing & a Crown . . .” always
“signifies the number of intermediate ideas” in an order. There is, of course,
an important difference that Berkeley recognizes. Among the sensibilia, the
visible and the tangible alone have extension and can be ordered with respect to
phenomenal locations. So it is possible to measure place distances and magnitudes
in these orders. Nevertheless, the visible and tangible units of measure
are qualitatively distinct and cannot be combined [70, 295]. Hence
Berkeley maintains there is no inconsistency with his heterogeneity thesis.
(See chapter 4.)
What’s more, Berkeley’s assigning of location to visual places with respect
to height and horizontal direction in the field was common in his day and has
not been abandoned by many of those who seek to describe phenomenal
place orders. The hypothesis, though, is not a priori. It depends, as all such
sensory measures do, on the nature of the stimuli, the workings of the sensory
system, and the individuation and mapping conventions employed. And for
many now, as for Berkeley then, a two-dimensional ordering of phenomenal
place has seemed most plausible, given the foundational empirical claim that
“distance being a line directed endwise to the eye, it projects only one point
in the fund of the eye, which point remains invariably the same, whether the
distance be longer or shorter” (NTV 2).
Notes
* I wish to thank Margaret Atherton and Peter Ross for discussion and comments.
1. All references in brackets are to Luce’s edition of Berkeley’s Philosophical Commentaries.
I recognize that Luce’s transcription has been challenged, but since I structure
this paper around problems raised in his editor’s notes, it seemed best to stick with
his edition.
2. Goodman 1977 contains a detailed exploration of this network of ideas. I think it
illuminating to note that Berkeley’s minima visibilia correspond quite closely to the
sensory place atoms deployed in Goodman’s phenomenalist constructions.
3. Similarly for Berkeley, although phenomenal time is not infinitely divisible, we do
not experience gaps between temporal instances [4–8, 167, 460]. Nor are we likely to
experience an isolated time instant.
4. See Jesseph 1993 for Berkeley’s problems dealing with these matters.
5. It is important to keep in mind throughout that questions about the structure and
organization of the phenomenal visual field are to be distinguished from questions
about the geometrical properties of the visual world (in other words, the physical envi-
ronment as revealed by vision). See especially the discussion of two-dimensional ver-
sus three-dimensional place orders in the last section of this paper.
6. For alternative readings, see Bracken 1974, Raynor 1980, and Jesseph 1993.
7. Hume, too, uses simultaneity to argue that extension is a property of sight and
touch and only them.
8. I intend to elaborate on this analysis of the two senses of extension in a paper on
Hume’s account of minima.
9. See Armstrong 1960, Gray 1978, and Jesseph 1993 for such concerns.
10. Also see [365]. Similarly, note that a mathematical point does not have a specifiable
geometric shape within the system, although two or more points have/determine a
shape. (See Goodman 1977, p. 252.) I am not claiming that Berkeley offered this an-
swer to the shape problem or even considered it. See chapter 4 of this book for a discus-
sion of phenomenal visual shape.
11. This is not something Berkeley is likely to have contemplated.
12. One might think here of an analogy with the color dots that constitute a television
display. Although we see a gamut of colors, the actual screen pixels are of just three
hues. None of the compound hues is to be found or seen in any single pixel.
13. Much needs to be explored concerning Berkeley’s numerous remarks on color in
PC and how, if at all, they jibe with his positions elsewhere. For example, in discussing
color compounding, Berkeley seems, uncharacteristically, to accept the idea that a
compound cause entails a compound effect [562].
14. Luce (1989) and Falkenstein (1994) offer readings of this quote that differ from
mine. I find theirs less satisfactory because I do not think they can explain either why
the first type of perception is confused or why the second type is a kind of visual exten-
sion, measured in terms of MV rather than MT.
15. Such discrepancies occur whenever we perceive a physical edge. At the visual edge
there are no places between the edge and that which is on the other side of the edge. In
physical space there are physical spaces between them.
References
Armstrong, D. M. 1960. Berkeley’s Theory of Vision. Melbourne: Melbourne University
Press.
Berkeley, G. 1989. Philosophical Commentaries. G. Thomas (ed.). New York: Garland Press.
Berkeley, G. 1948. An Essay Towards a New Theory of Vision in The Works of George Berkeley.
Volume 1. A. A. Luce and T. E. Jessop (eds.). Edinburgh: Thomas Nelson.
Bracken, H. M. 1974. Berkeley. New York: St. Martin’s.
Carnap, R. 1928. Der logische Aufbau der Welt. Berlin: Weltkreis Verlag.
Goodman, N. 1977. The Structure of Appearance. Indianapolis: Bobbs-Merrill.
Gray, R. 1978. “Berkeley’s Theory of Space.” Journal of the History of Ideas 16, 415–434.
Falkenstein, L. 1994. “Intuition in Berkeley’s Account of Visual Space.” Journal of the
History of Philosophy 32, 63–84.
Hatfield, G. & Epstein, W. 1979. “The Sensory Core and the Medieval Foundations of
Early Modern Perceptual Theory.” Isis 70, 363–84.
Hume, D. 2000. A Treatise of Human Nature. D. Norton and M. Norton (eds.). Oxford:
Oxford University Press.
Jesseph, D. 1993. Berkeley’s Philosophy of Mathematics. Chicago: University of Chicago
Press.
Locke, J. 1979. An Essay Concerning Human Understanding. P. Nidditch (ed.). Oxford:
Oxford University Press.
Luce, A. A. 1989. “Explanatory Notes” in G. Berkeley, Philosophical Commentaries.
G. Thomas (ed.). New York: Garland Press.
Moked, G. 1988. Particles and Ideas: Bishop Berkeley’s Corpuscularian Philosophy. Oxford:
Clarendon Press.
O’Shaughnessy, B. 1980. The Will: A Dual Aspect Theory, Vol. 1. Cambridge: Cambridge
University Press.
Raynor, D. 1980. “Minima Sensibilia in Berkeley and Hume.” Dialogue 19, 196–200.
Schwartz, R. 1994. Vision: Variations on Some Berkeleian Themes. Oxford: Blackwell
Publishers.
Prescript 4
This selection offers a new twist on Berkeley’s views concerning common sen-
sibles and the heterogeneity of the senses. An understandable complaint
about this interpretation is that it is not one Berkeley would find palatable. I
am not convinced this is so. I think the analysis makes better sense of his over-
all commitments and theories than more standard readings. I do not doubt,
however, that one can find passages in Berkeley at odds with points in my ac-
count. On the other hand, I believe his heterogeneity arguments are more
consistent and compelling in the interpretation proposed. The aim of this se-
lection, though, is to explain Berkeley’s position, not defend it.
4 Heterogeneity and the Senses*
By all accounts Berkeley’s heterogeneity doctrine plays a major role in his
philosophical thinking. Evaluations of Berkeley’s claims and arguments run
the gamut from being declared largely on target to being deemed incomplete
and inconsistent, if not incoherent.1 There is not even agreement on whether
his heterogeneity doctrine is an empirical hypothesis or a conceptual thesis.
In turn, Berkeley’s negative answer to the Molyneux problem has been ana-
lyzed and attacked from both of these perspectives.2
It would be impossible to examine the voluminous Berkeley scholarship on
these matters, and given the wide range of interpretations, any response to
one critic is likely to be unresponsive to the concerns of others. I propose, in-
stead, to offer my own account of what Berkeley was up to when he claimed
that there are no ideas common to sight and touch. I say “my own account”
not because the individual points made are necessarily new with me. The
interpretation I offer, however, is dependent on the treatment of sensory
orders and measurement developed in Reading 3. A brief synopsis of that
paper follows.
Sensory Minima
Berkeley’s appeal to the idea of sensory minima, I argue, is best understood
in the context of his attempt to provide a psychologically plausible basis for
describing and analyzing sensory experience [80]. As is to be expected, his
ideas were both influenced by and in response to then-prevalent assumptions
and accepted theories about sense experience and the workings of
sensory systems. Obviously, too, Berkeley did not have available the mathe-
matical and logical tools necessary to provide a sound, rigorous treatment of
sensory ordering.
Although Berkeley pays almost exclusive attention to visibilia and tangibilia,
our olfactory, gustatory, and auditory systems also have their own
peculiar experiential or sensible qualities. The kind and orderings of these
sensory qualities differ from one sense domain to another. The empirical
study of these orders finds a home in work on sensory measurement. In the
case of hearing, for example, the sensibilia are experienced sounds. These
sounds can be ordered according to their phenomenal likeness. On various
accounts, the order of audibilia is two-dimensional, the dimensions being
experienced pitch and loudness.
In vision, the sensibilia are usually taken to be color experiences, and these
qualia too can be ordered according to their phenomenal likeness. The stan-
dard color order is said to be three-dimensional, every color quality charac-
terized by a unique triple of hue, saturation, and brightness. Visibilia may
also be described in terms of phenomenal location. If we move our eyes with
respect to a physically fixed object, we experience the object at different
places in our visual field. Alternatively, by making compensating shifts of
gaze, we can experience a physically moving object at the same phenomenal
place. These phenomenal places can be ordered with respect to their visual
field locations.
For each sense modality, determining the properties and orders of its sensibilia
is an empirical matter, to be studied in the context of the relevant psychology
and measurement theory. Comparative judgments of subjective
experience usually provide the data for determining these phenomenal or-
ders. With color, subjects are typically presented with color pairs and asked to
tell if they match. In other experiments subjects are required to rate how sim-
ilar the colors appear in hue, saturation, or brightness. Matching and simi-
larity judgments may also be used to determine the phenomenal place order
of visibilia. The construction of both color and place orders depends not only
on subjective judgments, but on assumptions about quality identity and the
mathematical mapping conventions employed. These orders provide a means
for measuring likeness of colors and places. The degree of similarity is calcu-
lated in terms of the minimal path separating items in the order. Treatment of
the experienced qualities of the other sensory modalities follows along the
same lines.
Berkeley claims, however, that vision and touch are different from the
other senses. Only visibilia and tangibilia have phenomenal place locations.
Thus only visibilia and tangibilia can be ordered with respect to their own extensions.
Of course, the source or direction of sounds or smells may be localized
in physical space, but there are no audible or olfactory sensory places at
which they occur. Simultaneous sounds, for example, can be heard as coming
from different places in the environment, but they are not experienced at dif-
ferent locations in a phenomenal place order of sound. Audibilia, in and of
themselves, have no phenomenal extension. They cannot be ordered with re-
spect to their own phenomenal locations, for they have none. Visual and tan-
gible experience, on the other hand, can be so ordered, and the nature of the
order fixes the dimensions of their inherent extensions. Berkeley’s claim that
visible extension is two-dimensional is meant as a description of the visual
place order. It is not a claim that the visual field is physically or tangibly flat
[157]. The visual field has no properties or qualities in physical space. It makes
no sense to attribute such spatial dimensions to it.
The Heterogeneity Doctrine
In [127], Berkeley claims “The extension, figures and motions perceived by
sight are specifically distinct from the ideas of touch, called by the same
name; nor is there any such thing as one idea, or kind of idea, common to
both.”3 The ideas that constitute the sensory experiences of each sense realm
are distinct. The same idea cannot be found in different modalities. On one
reading this claim seems straightforward and not especially controversial. A
visual experience of color, an audible experience of sound, and a gustatory
experience of taste are qualitatively distinct sensations. They neither resemble
nor match one another phenomenally. Nor do the phenomenal qualities of
one necessarily imply what experience in the other modalities will be like. Al-
though our experiences of a yellow color, citrus smell, and tart taste may be
due to the same lemon, the experiences per se are not similar or alike in char-
acter. The same holds for the tactile experiences the lemon may afford. Touch
sensations of resistance are incommensurable with the color, sound, and
taste experiences the other senses supply. Of course, if these lemon experi-
ences become associated with one another, then seeing the lemon may bring
to mind these ideas of smell, taste, and touch.
In discussions of Berkeley’s heterogeneity doctrine most everyone agrees
that colors, tastes, sounds, and tactile sensations do differ qualitatively.4 The
sticking point is Berkeley’s further claim that the experiences of the different
senses have no idea or kind of idea in common. It is here that doubts about
the doctrine arise, and they arise primarily with respect to spatial properties.
The experienced color of a lemon may not be comparable to the experienced
resistance of its surface, but it is maintained that the situation is quite different
when it comes to properties like shape. The visual and tactual experiences
of the lemon, for instance, resemble each other or share the property of having
an ovoid shape. Shape, then, seems to be a clear case of a common sensible.
So, it is argued, neither Berkeley’s claim that the senses have no ideas in common
nor his negative answer to Molyneux is justified.
Number
A more careful examination of NTV, I think, shows that this refutation of the
heterogeneity doctrine moves too quickly. Consideration of Berkeley’s treat-
ment of “number” can help explain why. Berkeley insists that enumeration or
assigning cardinality to things always presupposes a sortal. “We call a win-
dow one, a chimney one, and yet a house, in which there are many windows
and many chimneys, hath an equal right to be called one, and many houses
go to the making of one city” [109]. It makes no sense simply to ask “How
many?” or to compare cardinality without specifying how many of what is
being counted. Moreover, arithmetic operations on numbers are only well
defined where a common unit is assumed. One house plus ten windows does
not sum to eleven.
Now Berkeley is perfectly willing to assign cardinalities to sets of sensory
items. He has no problem with reports that someone experienced two visible
color patches of yellow, two audible sounds of C sharp, and two distinct pres-
sure sensations, at the same or different times. At first blush, then, it might
seem that Berkeley’s heterogeneity thesis faces a difficulty. Experiences from
different senses can share a property, and an abstract property at that. In the
case above, sensory arrays from distinct modalities share cardinality or have
in common the property “two-ness.”
But no amount of blushing will make it plausible that Berkeley would find
such an observation a serious challenge. The fact that experienced ideas in
separate sensory realms share number does not mean that they are qualitatively
alike or resemble each other. As noted, everyone agrees that experiences
of a single yellow patch, a single citrus odor, a single tart taste, and a
single area of felt pressure are qualitatively distinct sorts of sensory experi-
ence, even though each is an instance of “one-ness.” Such similarity of cardi-
nality is not taken to imply that the idea of one-ness is a common sensible.
There seems little reason to think, however, that the situation changes sig-
nificantly when the sensations in each modality come in pairs or share higher
cardinalities. Number is not the kind of property that characterizes a consti-
tutive attribute of sensations. It is not a dimension along which sensations are
compared and ordered when characterizing and mapping their experiential
qualities within a sense realm.
As Berkeley says, number depends on the mind making a “perfectly arbi-
trary” choice of the units of enumeration, and this choice is constrained only
by considerations of what is “most convenient” for the task at hand [109]. Al-
though experiences of sight and touch can match in number, the sensations
constitute different sensory domains. In an ordering of color (such as the color
sphere) there are no pressure sensations any more than there are sounds. It
does not make sense to add two color patches to two pain sensations or to two
C sharps, since the sortals are different. Arithmetic operations can only be
employed when a common unit is set. Still, certain comparisons of number
within or across sense realms may have uses. Faced with the need to count the
number of items in one sense domain by means of markers in another, it
would be (psychologically) natural to correlate two color patches with two
sounds, three color patches with three sounds, and so forth. Thus it may be
said that two C sharps are fitter to represent two yellow patches, than one, three,
or some other number of C sharps. But there is no necessary connection be-
tween seeing yellow patches and hearing C sharps, and there is no way reason
alone can deduce cardinality assignments in one realm from those in another.
Perhaps number is special, since it is not strictly speaking a sensory prop-
erty. The fact that the argument against the heterogeneity doctrine does not
go through with number does not preclude there being other common prop-
erties that are relevant. Indeed, challenges to heterogeneity usually focus on
spatial properties, such as distance, size, and shape.
Distance and Size
With slight alterations, the case of distance is amenable to an analysis along
the lines given for number. Berkeley is quite clear that it is possible to measure
phenomenal distance in both visual and tangible extension. “For by distance
between any two points, nothing more is meant than the number of inter-
mediate points: If the given points are visible the distance between them is
marked out by the number of interjacent visible points: If they are tangible,
the distance between them is a line consisting of tangible points; but if they
are one tangible and the other visible, the distance between them doth neither
consist of points perceivable by sight nor by touch, i.e. it is utterly inconceivable”
[112]. Berkeley’s idea of distance is unambiguous and generic.
Distance is the number of points or places between two points in a phenom-
enal order. So defined, the predicate ‘distance’ can be applied to both sight
and touch. Ideas of sight and ideas of touch, nonetheless, are heterogeneous
and incommensurable. Adding minima visibilia (MV) to minima tangibilia
(MT) is inconceivable. There is no way to sum or apply mathematical opera-
tions to different unit measures.
Berkeley, in fact, has a quite sophisticated conception of order and measure
in sensory realms. His idea of phenomenal distance is not specifically limited
to sight and touch. Were a new sense modality to turn up and have its own
phenomenal extension, his definition of distance would be applicable. In ad-
dition, for Berkeley the concept of distance applies not only between places
in extension but to other phenomenal orders. Two colors, for example, can be
measured for the distance between them in an ordering of colors (such as the
standard color sphere mapping of color experience). Berkeley also notes that
his abstract idea of distance can be applied to non-sensory orders. He says in
Philosophical Commentaries [447], “A line in abstract or distance is the number
of points between two points. There is also distance between a Slave & an Em-
perour, between a Peasant & Philosopher, between a drachm & a pound, a
farthing & a Crown etc in all which distance signifies the number of interme-
diate points.”
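Berkeley’s generic definition of distance as the number of intermediate points can be illustrated with a short sketch. It assumes, for simplicity, a one-dimensional order represented as a list; the function name `distance` and the two sample orders are hypothetical, invented here for illustration and not drawn from Berkeley.

```python
def distance(order, a, b):
    """Return the number of items lying strictly between a and b in a linear order."""
    i, j = order.index(a), order.index(b)
    # Adjacent or identical items have no intermediate items between them.
    return max(abs(i - j) - 1, 0)


# Hypothetical orders, made up for illustration only.
visual_row = ["v0", "v1", "v2", "v3", "v4"]   # a row of visible points
ranks = ["slave", "peasant", "philosopher", "emperor"]  # a non-sensory order

print(distance(visual_row, "v0", "v4"))     # 3
print(distance(ranks, "slave", "emperor"))  # 2
```

The same function applies to both orders because the definition mentions only position in an order, not the kind of item ordered. Nothing in it licenses comparing a count taken in one order with a count taken in another.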
It should be apparent, though, that distance in phenomenal extension is
just a particular case of magnitude measurement. It is the size between two
points in an array. (See Schwartz 1994 for implications.) Since magnitude is
the more general concept it will simplify discussion to focus on it. Suppose, as
Berkeley proposes, visual size and tangible size can in principle be measured,
employing as units the minima sensibilia (MS) characteristic of each sense.
Parts and wholes of visual arrays are measured in terms of the number of minima
visibilia. This array may contain 200 MV and that one 400 MV, and the total
combined visual size is 600 MV. Similarly, the size of tactile arrays can be
tallied in terms of the number of minima tangibilia that compose them. It is
most important to keep in mind throughout this discussion that we are talk-
ing about phenomenal size, a magnitude measure of sensory experiences.
These size measures are not properties of the physical objects that may be
their source.5 Berkeley takes pains, for example, to remind us that a physical
inch has no single visual size [61]. Up close it may occupy the entire visual
field. As it moves away, its visual size diminishes. Eventually the inch-long ob-
ject can no longer be seen. It has no presence in visual experience. No visual
field places are occupied.6 By contrast, viewed under a microscope a small seg-
ment of an inch may occupy the entire visual field.
Although the magnitude of visual arrays and tactual arrays can both be
measured in terms of their respective minima sensibilia (that is, MV and MT),
so enumerated their sizes are incommensurable. They are not amenable to
arithmetic operations: 200 MV plus 200 MT do not sum. Most significantly,
it is incorrect to assume that a visible size of 200 MV is equal or equivalent to
a tangible size of 200 MT. It is meaningless to assign a phenomenal visual size
measured in MV to a tangible array measured in MT, and vice versa. There is
no common experiential field or area of phenomenal place that both 200 MV
and 200 MT can coherently be said to occupy to the same extent.
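The point about units can be pictured with a minimal sketch in which every magnitude carries its unit and addition is defined only within a unit. The class name `Magnitude` and its fields are assumptions made for this illustration; they are one modern way of modeling the claim, not anything found in Berkeley.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Magnitude:
    """A phenomenal size: a count of minima in one sense realm (hypothetical model)."""
    count: int
    unit: str  # "MV" for minima visibilia, "MT" for minima tangibilia

    def __add__(self, other):
        if not isinstance(other, Magnitude):
            return NotImplemented
        if self.unit != other.unit:
            # No common unit, so no "one continued sum or whole".
            raise TypeError(f"cannot add {self.unit} to {other.unit}: no common unit")
        return Magnitude(self.count + other.count, self.unit)


# Within vision the sizes sum, as in the 200 MV and 400 MV arrays above.
total = Magnitude(200, "MV") + Magnitude(400, "MV")
print(total)  # Magnitude(count=600, unit='MV')

# Across sight and touch the operation is simply undefined.
try:
    Magnitude(200, "MV") + Magnitude(200, "MT")
except TypeError as err:
    print(err)
```

The sketch captures only the arithmetic side of the claim. That 200 MV and 200 MT occupy no common phenomenal field is a further point that a unit check does not express.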
Berkeley makes his views about the connection between units and hetero-
geneity quite clear in [131]. It is “an axiom universally received that quanti-
ties of the same kind may be added together and make one intire sum. . . .
kinds of quantity being thought incapable of any such mutual addition, and
consequently of being compared together in the several ways of proportion,
are . . . esteemed intirely disparate and heterogeneous. . . . Now let anyone
try in his thoughts to add a visible line or surface to a tangible line or surface
so as to conceive them making one continued sum or whole. He that can do
this may think them homogeneous: but he that cannot, must by the forego-
ing axiom, think them heterogeneous.”
When the unit used to measure both sense realms is the more general sortal,
minimum sensibile, the situation is different. One may add 200 MS (visibilia)
and 200 MS (tangibilia), but the 400 MS total does not characterize size
in either sensory modality. In fact, 400 MS is not the measure of the experi-
enced magnitude of an array in any sensory order. It is not “one continued
sum or whole.” Thus it is questionable what use there would be for such a
tally. Of course, it is possible in principle to compare the number of MS in a
visibile to the number of MS in a tangibile and conclude that the arrays contain
the same or a different number of MS. And were there a need to keep tabs
on the size of tangible arrays using items from the visual field, it would undoubtedly
be more convenient to have larger arrays of MV represent larger
arrays of MT. It would be simpler, and perhaps even more useful, if ratio prop-
erties of the orderings are also preserved. For example, if one tangible array
is twice the size of a second, the size of the visible arrays representing them
should also be in a two to one ratio. These schemes would be pragmatically
fitter than arbitrary correlations. But “fitter” is not meant to imply that 200
MV resemble, match up better with, or have a necessary connection to 200 MT.
Likewise, preservation of ratio relations between these magnitude measures does
not show that arrays of sight and those of touch phenomenally resemble one
another. The two extensions are incommensurable.
In turn, the length or size of a physical object can not be determined by
summing the MS from both sense realms. Adding the MV of the experience of
the right half of a physical rod to the MT experience of its left half does not
give a coherent measure of its length [131]. Nor is there any reason to assume
that the number of MV will be the same as the number of MT. Although the
number of MT of a physical inch may be fixed, there is no unique number of
MV that can be assigned to that physical length. Depending on the viewing
distance and angle, the visually experienced inch may occupy the whole vi-
sual field, a single point, or any number of MV in between these extremes.
Shape
Berkeley’s denial that shape is common to vision and touch and his negative
answer to the Molyneux question are usually thought to be the least tenable
strands of his heterogeneity doctrine. Although comparisons of number,
either as measures of cardinality or size, may not be relevant dimensions
along which to evaluate the thesis, properties of shape seem to be another
matter. From Berkeley’s perspective, however, the difference between shape
properties and size properties is not one to challenge his heterogeneity doc-
trine. For Berkeley correctly points out that “figure is the termination of mag-
nitude” [105, 124].
In principle, shapes can be defined according to the distribution of relative
sizes fixing figure boundaries. For example, a phenomenal array that is both
closed and bounded by three straight lines is triangular. And if the array is visual,
it is then a visual triangle. A visual shape property, of course, can not be equated
with any single set of visual magnitudes. Shape is a structural feature of an array.
Visual arrays having different overall magnitudes can share shape. The same is
true of physical shapes; they too are structural properties and come in all sizes.7
Experienced visual shape does alter with tilt of the physical object off
the fronto-parallel plane or with changes in the observer’s angle of regard.
A tangible triangle will produce differently shaped distributions of MV,
depending on its distance from and orientation to the observer. Conversely,
non-triangular physical items can produce visual field arrays that are trian-
gular in phenomenal shape. Every visual field shape is inherently ambiguous
in that it can be a projection from an unlimited number of distinct tangible
shapes and vice versa.8 The visual shape experiences of a physical object are
variable; its tangible (physical) shape is stable. Therefore, no visual shape
property is identical to a tangible shape property, nor is there any necessary
connection between them.
Geometric operations can not be meaningfully employed on shapes from
distinct sensory realms. A phenomenal visible shape array can not be super-
imposed on a phenomenal tangible shape array to determine if they are con-
gruent or similar. Nor is it possible to place a visible circular array atop a
tangible circular array to produce an experienced figure eight. There is no
single phenomenal space (extension) to house the results of such operations.
Cross-modal combinations of shape make no sense. The fact is, visual and
tangible shape are incommensurable.
This analysis of phenomenal shape, I believe, can serve to defuse the oft-
made charge that Berkeley is inconsistent in his discussion of shape. In [141],
Berkeley says that the visual square is fitter than the visual circle to represent
the tangible square, because it, like the tangible square, contains several dis-
tinct parts—four sides and equal angles. The visual circle does not. He sug-
gests that this difference is why the visual square is fitter than the visible circle
to represent the tangible square. But there is no inconsistency or incompati-
bility here with Berkeley’s heterogeneity doctrine or with the analysis of sen-
sory experience found earlier in NTV.
The ideas “square,” “four sides,” and “four angles” as specifying structured
distributions of magnitude can and do apply to square arrays in both visual
extension and tangible extension. Such shape concordances, though, do not
mean that the phenomenal experiences of the two modalities resemble each
other. As Berkeley says, “it will not hence follow that any visible figure is like
unto, or of the same species, with its corresponding tangible figure, unless it
be also shewn that not only the number but also the kind of the parts be the
same in both” [143]. This, Berkeley argues, cannot be shown, since “no vis-
ible magnitude having in its own nature an aptness to suggest any one par-
ticular magnitude, so neither can any visible figure be inseparably connected
with its corresponding tangible figure” [105]. Sensations of sight and touch
are qualitatively distinct and their incommensurability is made evident by
the fact that mathematical and geometrical operations between items from
different modalities are not meaningful.
A Puzzle
On this reading, major elements of Berkeley’s heterogeneity doctrine are not
only consistent and coherent, but quite reasonable in light of his and
then-current understandings of the nature and operation of sensory systems. His
doctrine of heterogeneity need not depend on obscure or implausible as-
sumptions about the notions of “resemblance,” “kinds,” or “sorts.” Neverthe-
less, a pressing puzzle does remain. How should we interpret Berkeley’s claim
that the heterogeneity of visual and tangible senses amounts to or entails that
“there is no such thing as one idea, or kind of idea common to both senses” [127]?
On a straightforward reading, this latter claim appears problematic, if, as I
have just argued, Berkeley’s own account must allow that even properties of
shape can unambiguously apply to both visual and tangible arrays. A phe-
nomenal line is definable as the shortest path between points in a place array,
a phenomenal circle may be defined as a closed array of sensory places whose
perimeter points are all equidistant from a given point, and we saw above
how to specify phenomenal triangles. These shape ideas correctly character-
ize both visual and tangible arrays.
Berkeley is well aware that applicability of common shape predicates across
modalities will raise questions about his heterogeneity thesis. His answer is
that such dual usage of terms is a result of metaphorical ambiguity. For ex-
ample, Berkeley argues that the term “high” does not have the same meaning
when applied to thoughts and to distances from the earth [94]. It is easy, in
this case, to agree with Berkeley that there are two distinct ideas associated
with “high” and no single sense in which the term applies in both contexts. A
claim of metaphorical ambiguity, however, is more difficult to accept with
the terms “triangle,” “line,” and “circle” as I have defined them. These shape
predicates have the same structural definitions when denoting visual or tan-
gible arrays.
Does this then show that there are unambiguous ideas of shape common to
both senses? If not, why not? Alternatively, if the existence of such shape
properties does not count as a counterexample to Berkeley’s claim that sight
and touch share no ideas, what properties possibly could? Answers to these
questions depend on distinguishing, as Berkeley does, between general terms
and problematically abstract ones. Berkeley admits that we can have the
concept “triangle” and that the term “triangle” can be applied to distinct kinds
of triangles (for example, right, obtuse, equilateral). I have explained why and
how it can, in addition, encompass triangular figures in both sight and touch.
So employed, the triangle concept is neither ambiguous nor abstract. It is
generic. Similarly, the terms “line” and “circle” can function as general terms
when applied either within a sense modality or to arrays in distinct sensory
realms. But such generic use is compatible with the claim that the items de-
noted in one sense realm neither resemble nor are necessarily related to those
in the other.
Berkeley’s use of the concept of minimum sensibile can be cited to bring
home the point. Minimum sensibile is a general idea that unambiguously
denotes both minima visibilia and minima tangibilia. No MV resembles or is
necessarily related to a MT. Yet Berkeley surely does not believe that the
concept of a minimum sensibile is itself inconceivable or unimaginable. It is a
meaningful generic idea applicable to both visual and tangible items that are
themselves neither similar nor related by reason.
Use of the concept “extension” is amenable to similar treatment. When de-
fined as an array of places in a sense order, it is an unproblematic notion, hav-
ing a univocal general meaning applicable to sight and touch experiences.
Were a new sense modality discovered that had a phenomenal place dimen-
sion, the concept “extension” would readily take in these experiences as well.
This generic idea of extension is a property of experience and has sensory
content. Vision and touch have extension by virtue of each having a place
dimension in their sensory orders.
Acknowledgment of a generic idea of extension is not inconsistent with
Berkeley’s so-called inseparability thesis. A single MV or MT has magnitude
and arrays of MV and MT exemplify extension, but no visible place can be ex-
perienced without a color, and no tangible place can be experienced without
a sensation of resistance.9 These properties, nonetheless, are separable from
extension in thought. We can talk meaningfully about sensory places without
assigning them either a color or a felt resistance. The generic concept
minimum sensibile does just this. Berkeley’s inseparability thesis need not be seen
as a deep metaphysical or epistemological doctrine. It is at root a comment on
the character of phenomenal experience and the qualities of sensations found
in different sense realms.10 For Berkeley, there is no comparable question
whether sounds, tastes, and smells are inseparable from extension, since the
experiences in these domains simply have no phenomenal place dimensions.
Berkeley does maintain, and insistently so, that we cannot make sense of an
abstract idea of extension. We can not experience or imagine a visible or tan-
gible extension as it is, bereft of all other sense qualities. There are no such
items as property-less places for the term “extension” to denote. Extension
understood so as to apply to places having no sensible qualities is unintelli-
gible and unimaginable, just as applying the concept “triangle” to a figure
that is not scalene, isosceles, or of any other determinate triangular shape is
incomprehensible. The generic ideas “extension” and “triangle” can be prop-
erly used to describe and denote actual experiences of both sight and touch.
It is when these ideas are employed too abstractly that the terms “extension”
and “triangle” are devoid of empirical content and cognitive meaning.
Conclusion
Berkeley, like most others, assumes that the experiences of each modality are
qualitatively distinct. Sensations of sight, sound, touch, smell, and taste are
not at all like one another phenomenally. Hence, cross-modal linkages can
not be explained in terms of similarity or resemblance of qualities. He also
thinks it implausible that the appropriate connections could be established
by reason. No amount of thinking about the smell of an item will enable you
to determine in advance what the phenomenal experiences of color, taste,
sound, and resistance it affords will be like. Sight and touch, though, both
have place qualities, and each can be ordered with respect to its own place
locations. So it may and did seem obvious to many that nothing should pre-
vent shape ideas from being common sensibles. According to Berkeley, the
problem with this suggestion is that experienced visual extension itself is not
phenomenally like experienced tangible extension. Although both are un-
ambiguously called extensions, the extensions are incommensurable. They
cannot be combined, and arithmetic or geometric operations that attempt
to do so make no sense. There is no common unit of pure extension that can
serve to measure, compare, or unite visible and tangible extensions. Indeed,
the very idea of extension as it is, devoid of any of its accompanying visual
or tangible qualities, is incoherent. It presumes the very kind of abstraction
Berkeley claims is unimaginable.
Crucial to Berkeley’s argument throughout is the distinction between spatial
properties of the environment as determined by a sense and extensional
dimensions and properties of a sensory domain itself. One can, for example,
judge distance to an object by the strength of its odor or the loudness of its
sound, but this does not mean that these sense realms have inherent phe-
nomenal extensions in which their own phenomenal places are ordered.
Berkeley, in fact, notes that it could be and in certain circumstances is possible
to evaluate distance by what he calls faintness—the fainter the visual experi-
ence, the more distant the physical object. Correlating visual faintness with
physical distance is much like evaluating distance by smell or sound. Faintness
can serve as a sign of physical distance. Although faintness may thus pro-
vide distance information, an ordering of visual faintness is not an exten-
sional place order. And the correlation of visual faintness and tangible
distance does not depend on sharing phenomenal properties or having
sensible ideas in common.
Berkeley recognizes that in everyday life it is most difficult to appreciate the
incommensurability of visual and tangible extensions. Entrenched habits of
association and the use of a common descriptive vocabulary obscure the real
situation. In addition, we tend to be misled and conflate the two extensions,
when for instance, we see (or imagine seeing) our hand traverse a circular
path around a dinner plate and conclude that the visual and tangible shapes
are alike. But this is a mistake. In such cases, we are actually comparing or
juxtaposing two visual shapes, not a visible with a tangible.11 If, instead, attention
is focused, as it should be, on the actual tangible sensations of resistance
and motion had during such a traversal, the conviction that circular arrays
of vision and touch are phenomenally identical or similar experiences no
longer seems as obvious. In NTV Berkeley seeks to convince readers that al-
though vision and touch both have inherent place orderings, and the place
orders can be correlated, there is no need to presume an abstract idea of
“extension” or a single extension in which the phenomena of distinct modal-
ities are organized and ordered.12
Notes
* All bracketed section references are to Berkeley’s New Theory of Vision. I wish to thank
Margaret Atherton and Laura Berchielli for comments on an earlier draft. I have also
benefitted from reading some unpublished work of Martha Bolton on these issues.
1. See M. Wilson, “The Issue of Common Sensibles in Berkeley’s New Theory of Vision”
in Ideas and Mechanisms, pp. 257–75. Princeton: Princeton University Press, 1999, and
L. Falkenstein “Intuition and Constructivism in Berkeley’s Account of Visual Space.”
Journal of the History of Ideas 32, 1994, pp. 63–84.
2. See G. Evans, “Molyneux’s Problem” in Collected Papers, pp. 364–99, Oxford: Oxford
University Press, 1985.
3. Elsewhere in statements of his thesis, Berkeley replaces the term “idea” with the
expression “sensory idea.”
4. For many theorists, including Berkeley, differences in their qualities are the basis
for individuating sensory modalities. This topic is explored in a number of papers
in Perception.
5. I leave at present the issue whether physical objects and properties are to be identi-
fied with tangible experiences as Berkeley tends to do in the NTV or whether the notion
of a “physical object” is better understood as a composite of experiential material from
all sense domains as Berkeley seems to hold in his later, more explicitly Idealist, works.
6. Note that physical places on the retina may be occupied, but the stimuli may not be
of sufficient size or strength to trigger visual sensations.
7. I avoid the further complications that arise in the case of shapes that cannot be spec-
ified by a single structural analysis. For example, structurally different arrays may all
fall under the concept of “the letter A.”
8. See chapter 11, figure 11.4.
9. See chapter 3 for possible qualifications in the case of a single MV.
10. After all, we can talk separately and meaningfully of brightness, saturation, and
hue although no color can be experienced without all three.
11. This sort of conflation is surely one reason people so readily assume, as mentioned
earlier, that the ovoid shape of a lemon must be a common sensible.
12. Chapter 5 spells out the implications this essay has for understanding Berkeley’s
answer to the Molyneux question and “man born blind” thought experiments.
Prescript 5
The account of Berkeley’s heterogeneity thesis in chapter 4 stopped short
of squaring the analysis with Berkeley’s well-known negative answer to the
Molyneux problem. This follow-up essay tries to remedy the situation. It
places emphasis on the fact that Berkeley appeals to “man born blind” thought
experiments throughout the NTV, not just in his discussion of figure. Con-
sideration of Berkeley’s arguments in these cases is important in understand-
ing his answer to Molyneux’s specific question. Several alternative accounts
of Berkeley’s goals and position are critically examined.
5 What Berkeley Sees in the Man Born Blind*
Chapter 4, “Heterogeneity and the Senses,” maintains that Berkeley’s treatment
of common sensibles is compatible with ideas of “number,” “distance,”
and “size” applying univocally to sight and touch experience. It also argues
that Berkeley need not have qualms with unambiguous ideas of shape being
applicable to both sense realms. To some, this second claim will seem much
harder to swallow than the first. Berkeley specifically says, in [127] and else-
where, that there are no abstract ideas of figure common to both senses, and
his famous negative answer to Molyneux’s question supposedly underlines
the point. Chapter 4 argues, nevertheless, that Berkeley can recognize shape
predicates as common to sight and touch without undermining his hetero-
geneity doctrine.
The trick is simply to treat shape predicates, along with number, distance,
and magnitude predicates, as general terms denoting the structure of arrays
in visual extension and tangible extension. That an idea is generic does not
mean it is epistemically or ontologically suspect. As long as the items denoted
by a general term are all anchored in sense experience, abstractness is not a
difficulty. Still, accepting this account of figure does seem to pose a problem.
For if Berkeley admits that generic shape predicates are common to sight and
touch, how can he remain so sure that the “man born blind” (MBB) will not
succeed at Molyneux’s task?
A closer look at Berkeley’s actual response to Molyneux in the New Theory
of Vision (NTV) indicates that, in fact, he is not as confident as is usually
thought that the MBB will fail this particular test. He says, “[I] am of the opin-
ion that the blind man, at first sight, would not be able with certainty to say
which was the globe and which the cube whilst he only saw them” [132, em-
phasis added]. Berkeley’s reticence concerning figure stands in contrast to his
firm negative answer to the other MBB thought experiments proposed in the
NTV. Prior to his discussion of shape, Berkeley considers the perception of dis-
tance, magnitude, and orientation, and he appeals to MBB tests in each. In
these cases, the cautionary “with certainty” does not qualify his predictions.
He says in [41] that the MBB’s inability to perceive distance on gaining sight
“is manifest.” In [79], he asserts that “we may safely deduce” that a MBB will
initially fail in his attempts to judge the magnitude of objects placed before
him. And in his account of orientation, Berkeley claims “it plainly follows”
that the MBB “would not at first sight think that anything he saw was high or
low, erect or inverted” [95].
Exploring Berkeley’s treatment of these other MBB thought experiments, I
believe, provides important context for understanding his No answer to the
question Molyneux poses. For it is most unlikely that interpretations and
criticisms of Berkeley peculiar to his treatment of shape perception can get to
the heart of his views. Berkeley’s account of figure is part and parcel of his
overall theory of spatial perception and must find a place within it. Paying at-
tention to the full range of MBB thought experiments in NTV can also help
explain why Berkeley is guarded in his answer to Molyneux.
Necessary Connections and Learning
Throughout the NTV, Berkeley tends to take it for granted that if a connection
between ideas is not necessary, it must be learned and vice versa. Without this
assumption, the probative value of empirical evidence resulting from MBB
experiments is dubious. Yet both opponents and supporters of Berkeley’s the-
ory of vision have held that this critical assumption is not correct.
Many agree with Leibniz, who argues against Locke that ideas may be neces-
sarily connected without reason being aware or immediately able to appreciate
that they are. It can take some thought to figure things out. Mach challenges
the significance of negative Molyneux findings along different lines.1 He
points out that both humans and animals often are unable to recognize two
presentations of a shape as the same if the figure is experienced in different ori-
entations. For instance, people are frequently unaware that the diamond shape
they perceive is a square rotated 45 degrees. Hence, Mach argues, mere failure
to appreciate shape identity does not support a strong heterogeneity doctrine.
Alternatively, Mill defends Berkeley from critics who say that his theory of
vision is refuted by empirical evidence concerning animals, and perhaps actual
MBB experiments. Mill argues that it is not damaging to Berkeley’s overall
thesis that the newly sighted may be able to navigate the environment
without prior experience. After all, a sound might be innately set to trigger an
experience of fear, although the experiences of sound and fear are not alike
and have no necessary connection. Correlations of very distinct ideas can be
wired in at birth, and Mill suggests that the proper explanation of evidence
conflicting with Berkeley’s predictions could be that the correlations are in-
nate. That aspects of sight and touch are correlated at birth does not show
that spatial ideas of the two senses are similar or related by necessity.2
It is not surprising Berkeley did not contemplate the evolutionary possibil-
ity that the experience and fate of past generations can alter the capacities of
their descendants. On the other hand, Berkeley is not in a position to rule out,
a priori, the possibility of these sorts of innate linkages. God could have set
things up so that the language of nature is not only uniform in all environ-
ments, but is given to everyone as a birthright.3 As the history of MBB exper-
iments indicates, though, Berkeley was not the only one to run together issues
of innateness with claims of heterogeneity.
Initial Experience
In [130] Berkeley says “in a strict sense, I see nothing but light and colours
with their several shades and variations.” He says similar things in other
places, sometimes substituting “immediately see” for “in a strict sense see.”
These statements can encourage the view that Berkeley held that, at least initially,
the visual field is without internal organization or that the structure it does
have can not be appreciated. On these assumptions, it would be impossible
for the MBB to judge or navigate his environment on first gaining sight, thus
explaining Berkeley’s negative answers to all the MBB thought experiments.
There are a number of reasons why I do not think this is the correct under-
standing of Berkeley’s position: (1) Berkeley never explicitly says that visual
extension is unorganized or its organizational features inaccessible at any
stage of development, and I do not believe quotes like [130] indicate that he
endorses such positions. (2) The assumption that the visual field of the MBB
(or a newborn) on gaining sight is unorganized or its ordering of no useful import
does not accord well with Berkeley’s and other visual theorists’ characterization
of the problems of spatial perception. Nor would such an explanation of
the MBB’s failure help Berkeley support his own account of these issues. (3) It
does not explain why Berkeley is more reticent in the case of shape than in his
answers to the MBB thought experiments for distance, magnitude, and ori-
entation. If at test time the MBB’s visual field is without discernable structure,
why should Berkeley be more cautious about figure than he is with other spatial
properties? I assume objection (3) needs no defense; (1) and (2) do, and I
will address each in turn.4
Immediate Perception
Although Berkeley does say in several places that all we immediately see is
light and color, in other passages he is not so limiting in his characterization
of immediate perception. For example, in Theory of Vision Vindicated [TVV,
44] he maintains, “The proper immediate object of vision is light, in all its
modes and variations, various colours in kind, in degree, in quantity; some
lively, others faint; more of some and less of others; various in their bounds or
limits; various in their order and situation.” (Emphasis added.) Later he explains,
“These immediate objects [of sight] are the pictures. These pictures are some
more lively, others more faint. Some are higher, others are lower in their own
order or peculiar location . . .” [TVV 54, emphasis added]. What’s more, there is
a perfectly good interpretation of statements like [130] that does not have the
implication that the visual field is initially, or for that matter ever is, without
appreciable phenomenal order.
In discussing the nature and function of sensory systems it was quite cus-
tomary (and to some extent remains so) to individuate modalities in terms of
the qualities they present. Strictly speaking, the phenomenal product or object
of our auditory system is sound in all its variations (loudness and timbre);
that of the palate is taste, that of olfaction is smell, that of touch is pressure,
and that of sight is light and color. Theorists from ancient times on, includ-
ing those committed to common sensibles, were quite willing to characterize
the immediate objects of perception in just this way. Light and color are the
experiential objects or qualities that constitute and differentiate the sensory
domain of vision.5
There is nothing, however, in this standard specification of the proper ob-
jects of the modalities that precludes the products of sense from having an ex-
perienced internal phenomenal ordering. In particular, it does not mean that
visual extension and tangible extension, of either the MBB or infants, are orig-
inally without useful structure. Indeed, Berkeley would be especially hard put
to get his motor theory of vision off the ground if the fact that felt pressure is
the proper object of touch means that tangible experiences are initially unor-
ganized and bear no place relations to each other. Berkeley does maintain that
vision and touch are special in having phenomenal place orders. Other sense
organs may be employed to evaluate spatial relations indirectly, but these
modalities, unlike sight and touch, do not have extensions of their own.
I believe an ambiguity in the notion of “strictly see” or “immediately see”
is a source of some of the confusion in discussions of this issue. Presented with
a stimulus that triggers a circular yellow visual array, people who do not have
the concept “circular” will not judge or describe the array as circular, and they
may have no reason to segregate the circular array from adjacent parts of the
visual field. Nonetheless, if all points on the perimeter of a solid array of
yellow are phenomenally equidistant from a point in the center, the yellow
patch has a circular shape in visual extension.6 We see a circular array, al-
though we do not see it as being circular and may have no reason to separate
or discriminate the figure from its phenomenal surroundings. Failure of the
subject to conceptualize the array as a circle does not prevent figure/surround
type descriptions from being applied to the visual field.7 In addition, if asked
or tested, a subject may have no difficulty distinguishing the yellow colored
array from, say, the black array that borders it.
Spatial Perception
In presenting the MBB thought experiments, Berkeley writes as if there is no
question that at test time the visual field of the MBB has recognizable structure.
In discussions of figure, he allows that the newly sighted can experience
visible circles and squares. In his account of inversion, Berkeley admits
that on gaining sight a visible man’s legs will appear to the MBB next to the
visual ground, and the man’s visual head will appear closer to the visual sky.
These claims would not make sense if the MBB’s visual field is without inter-
nal organization.
Berkeley’s adoption of this conception of the problem is not difficult to ex-
plain. Well into the twentieth century, it was quite common to assume that
vision was a two-stage process involving a transition from sensation to per-
ception (or in Berkeley’s terms, from immediate to mediate perception). Bar-
ring injury, physical fatigue, drugs, and the like, the outputs of sensory systems
were immediate and fixed. The qualitative natures of sensations did not de-
pend on learning, nor could they be altered as a result of learning or thought.8
The qualities of sensations were wholly determined by the physiology of the
sense organs and the stimulus properties to which they were responsive. Perception
required interpreting this immediately given sensory core.9 We automatically
“read through” sensations to their perceptual meanings, and were
often unaware of the actual phenomenal qualities of the triggering sensa-
tions. Theorists did differ in their accounts of the origins and kinds of pro-
cesses that led from sensations to perceptions. Many stressed the influence of
past sensory experience and cognition; others appealed to a priori knowl-
edge, biases, or constraints. The need for some sort of distinction between im-
mediate sensation and mediate perception itself went largely unchallenged.
It was also widely held that there are correspondences between properties
of the retinal stimuli and the sensations they trigger. As an object moves
closer, the size of the physical image projected on the retina grows larger. This
results in an increase in the magnitude (the number of visible points) of the
visual array. Parallel railroad tracks receding from view, however, project
converging lines on the retina and so there are fewer visible points between
them in visual extension. Likewise, a circle projects a different retinal image,
and hence causes a different sensation, when tilted than when on a fronto-
parallel plane. On a fronto-parallel plane it has a circular appearance; off this
plane it appears elliptical. Movement of either perceiver or object usually al-
ters both the projected retinal images and the sensations they trigger.
Within this correspondence framework, a central problem for a theory of
spatial perception is to explain how a world of persisting objects with stable
spatial properties is derived from a stream of visual sensations constantly
changing with the movement of either the observer or the observed. Or to put
the issue in more contemporary terms, given the constancy hypothesis (sen-
sations bear a correspondence to the retinal image), how is it that we perceive
the world with constancy (in other words, the correct constant size, shape,
and orientation of things in the environment)?10 By all accounts Berkeley de-
veloped his theory of vision within this paradigm. Indeed, in TVV he offers a
version of the constancy hypothesis, and explains in detail how the visual
field is proportional to the retinal image [sects. 53 ff].
Throughout the NTV, Berkeley’s discussion of spatial perception is couched
in constancy-like terms. His account of magnitude in sections 52–87 of the
NTV, for example, leaves no room to doubt that this is how he understood
the situation. And Berkeley’s description of the moon illusion is a striking ex-
ample of just how dependent his analysis is on a version of the constancy hy-
pothesis. Although the moon looks bigger on the horizon than at its zenith,
Berkeley insists that what is immediately seen is the same size in both loca-
tions. The sensations that prompt the illusion do not change in magnitude,
because the size of the retinal image the moon projects remains constant. The
moon illusion is a perception. We read through the constant sensation to an
illusory perception.
However other theorists conceive the MBB’s initial visual experiences, Berke-
ley assumes they bear a proportional relation to retinal image stimuli. The
MBB’s task is not conceived to be a practical impossibility, as it would were the
MBB unable to tap any structural features of visual extension. Berkeley’s
argument is on a more theoretical plane. In the Molyneux experiment the
question put to the MBB is not “Do you discern any pattern at all in your visual
experience?” Instead, he is asked whether he can see “which is the globe,
which the cube” [132]. That is, the challenge for the MBB is to determine which
array in his visual field is of the tangible globe and which of the tangible cube. The
question seems to suggest that the MBB gives some content to the demonstra-
tive elements embodied in the asking. True, Berkeley does say that on gaining
sight the MBB is likely to be somewhat baffled. He attributes this to two fea-
tures of his test situation. First, Berkeley believes that initially the MBB would
not perceive anything as being anywhere but in his own mind. Second, the
MBB will not have any good reason to separate or draw figure/surround bound-
aries one way or another. Nothing in principle, though, prevents their being
salient. Berkeley does not base his MBB predictions on these factors, and re-
moving such sources of confusion will not ensure passing the test.
More significantly, should the MBB’s failures be due to either confusion or
a lack of order in his visual field, the MBB thought experiments would be of
less use to Berkeley. Both Berkeley and his critics agree that the MBB will ac-
quire the visual ability to discern physical figures and will adopt the standard
spatial vocabulary to describe them. The difference is that one party to the de-
bate attributes these accomplishments to resemblances or necessary connec-
tions. Their Berkeleian opponents reject phenomenal similarity or necessary
connections as the explanation.
Berkeley realizes people have strong intuitions that visual figures can and
do resemble their tangible figure counterparts, and he understands the rea-
sons for their view. Acceptance of the constancy hypothesis promotes the at-
titude, as does the fact that we automatically read through visual sensations
to their tangible meanings. The use of the same terms to describe properties
in both sense realms also has a major influence. And a penchant for con-
fusing the visual perception of tactual exploration of space with the tan-
gible sensations experienced during tactile exploration is another source of
the conflation.
Berkeley sees the need to address these mistaken views. His goal is to show
that visual and tangible shape experiences are distinct in spite of the fact that
arrays of visible extension and tangible extension have discernable figures.
Once the visual and tangible realms become correlated, however, it is more
difficult to appreciate that they are related neither by resemblance nor by reason.
The MBB thought experiments are meant to help overcome these prejudices
that come along with the acquisition of visual skill and linguistic sophistica-
tion. But it is important to keep in mind that Berkeley’s ultimate goal is to
prove that, despite indications and intuitions to the contrary, sight and touch
are always heterogeneous. They remain distinct after, as well as before, an
infant or MBB coordinates visual and tangible extensions and acquires visual
skill. The MBB experiments are germane to this overarching goal only on the
assumption that what the MBB immediately sees is essentially the same as
what the sighted strictly sees.11
Learned Organization
Of course, this conception of the problem of spatial perception does not, by it-
self, rule out the possibility that visual extension initially has no (appreciable)
structure. The MBB (or newborn) may first have to put visual extension into a
usable form. Only after this has been accomplished can learning of sight and
touch correlations take place.12 Although this developmental scenario is a
possibility, it is not one that Berkeley could readily accept. Berkeley’s and his
opponents’ descriptions of the thought experiments require that the MBB make
his judgments on first gaining sight. There is no time available for the postu-
lated internal organizational process to occur. Moreover, were this objection
finessed, another puzzle arises. The only resource that seems available for the
MBB to use in putting his visual field in order is correlating it with touch. But
if this is the story, it then becomes questionable whether the various feats of
associative learning Berkeley says the MBB must undertake would be needed.
Work that Berkeley says lies in the MBB’s future would be accomplished as a
result of bringing this initial structure to his visual field. This last point may
be more transparent in the following discussion of perceptual orientation.
The Inverted Image
For centuries, attempts to determine the physical optics of vision were stymied
because the retinal image is inverted. This was assumed untenable, since the
world does not visually appear upside down. Once Kepler convinced the sci-
entific community that retinal inversion is actually the correct account of the
optics, theorists felt an urgent need to explain how it is, then, that we see
things upright. Vision scientists devoted much time and effort attempting to
find the answer.
Berkeley examines the inverted image puzzle in the sections of the NTV de-
voted to the perception of orientation. He says that understanding his views
on this topic is key to understanding his theory of spatial perception in gen-
eral. Berkeley’s celebrated proposal for dealing with the inverted image puzzle
is to claim that it is bogus. The assumption that the retinal image must some-
how be re-inverted is misguided. It is another case where a conflation of visual
and tangible extensions hampers appreciation of the actual situation. With
proper attention to these matters, the inversion puzzle cannot get off the
ground. The retinal image, being a physical display, is inverted with respect to
our physical body. So Berkeley claims it makes no sense to compare the direc-
tion of the tangible retinal image with arrays in the phenomenal visual field.
Therefore, there is nothing to reconcile.13
Visual extension and visual arrays do not have any location or orientation
in environmental space, neither at the start nor later in life. It is simply a con-
fusion to imagine that the extensions of the two sensory realms are continu-
ous, contiguous, or can share a phenomenal space. It is impossible to combine,
superimpose, or align visual and tangible arrays and compare their relative
orientation. The visual field does not sit atop a background of physical space
that either can determine or provide a fixed point to set its direction. Visual
field arrays have no physical orientation whatsoever. Visual legs are next to
visual earth, but this nextness ordering cannot be characterized in terms of
the physical properties of right, left, up, or down.
Explaining how the visible and tangible realms become coordinated re-
mains a genuine problem. It is a problem, however, that arises independently
of the optical inversion of the image on the retina. Berkeley himself has a
story to tell about how vision and touch become coordinated. The correla-
tions are learned.14 Neither a newborn nor the MBB could at first judge the
environmental orientation of what they initially see.
Berkeley has no qualms accepting the idea that on gaining sight, the MBB
immediately sees what those with developed visual skills in a strict sense see.
He never says otherwise, and his talk of relations among visual legs, heads,
earth, and sky assumes this is so. Once again, Berkeley’s argument is that in
spite of having their own directional orderings, visual and tangible extensions
are incommensurable. Berkeley’s position is sometimes obscured by his claim
that the MBB could not use number information to aid his cause—for ex-
ample, that two visible legs go with two tangible legs. His point here is that
cardinality measures presuppose a unit of counting. The question “How
many?” cannot stand alone. As he says in NTV, a window, a chimney, a house,
and a city may each be called one, and a picture surface may feel like a single
uniform surface, yet contain many painted shapes in many colors. (See chap-
ter 4.) The MBB, however, has no basis for segregating leg-shaped visual arrays
from the rest of his visual field and no inclination to use “a visual leg-shaped
figure” as a unit of measure.15
If the MBB is assumed to confront the orientation test with an organized
visual field on hand, there could be only two explanations for this initial or-
ganization. Berkeley’s choice among them is clear. The order of visual exten-
sion, like other inherent orderings of sensations, is fixed by the nature of the
sense organs. The alternative account—that useful structure is acquired—is
not a viable option for Berkeley. The MBB has no time to accomplish the task
prior to his gaining sight. And if this objection is skirted, a puzzle still re-
mains. The initial ordering of visual extension would have to be achieved via
correlations with motion and touch. But once these visual and tangible con-
nections are on hand, central aspects of physical directionality would be too.
Thus the MBB would have already acquired directional skills that Berkeley
says he still needs to acquire.16
A Non-Berkeleian Resolution
Gareth Evans’s essay “Molyneux’s Question” is one of the most discussed ar-
ticles on the topic.17 Evans’s paper provides an excellent overview and com-
mentary on assorted versions of the problem and attempts to solve them. He
separates Berkeley’s heterogeneity thesis from claims of innateness and he as-
sumes, with Berkeley, that the blind can have an idea of space as a simultane-
ous whole. He also assumes that at the time of testing the MBB can experience
visual figure, and that figure/surround difficulties, if present, are not the
central issue. According to Evans, the position of his representative Berkeleian,
“B,” is that in spite of the MBB being able to appreciate shapes in visual exten-
sion, he will fail.
After this ground-clearing, Evans goes on to argue that the best way to
bring Berkeley’s real concerns into focus is to reformulate Molyneux’s ques-
tion along the following line: Could a person master shape concepts in the
tangible domain, yet fail to be able to apply them to shapes found in visual ex-
perience?18 According to Evans, if the answer is yes, Berkeley’s position is sus-
tained. If it is no, Berkeley’s negative response to Molyneux is a mistake.
Evans’s argument, in the end, is to challenge the coherence of the claim
that the MBB can be said to see visual figure, yet cannot apply tangible shape
ideas to certain figure-relevant features of his visual arrays. There is, Evans ar-
gues, a conceptual connection between the ability to orient in physical space
and the mastery of visual shape concepts. In particular, upon gaining sight
the MBB cannot be said to appreciate visual figure, unless his new sight expe-
riences are coordinated with appropriate behavioral dispositions or informa-
tion about direction in his immediate physical environment. Without such
visual and behavioral correlations, Evans maintains, the idea that the MBB
has experiences of visual figure is otiose. Evans’s answer to his own version of
the Molyneux question is that Berkeley is not entitled to assume that the MBB
has experiences of visual figure without also admitting that the MBB can as-
sign tangible spatial direction to the visual shape boundaries. This suppos-
edly raises a problem, because Evans is convinced Berkeley does assume that
on gaining sight the MBB experiences visual figure.
Evans, however, does not challenge Berkeley’s full-blown theory of spatial
perception. He allows that it is not necessary for specific distances or depth re-
lations to be in place in order to attribute concepts of visual figure. The newly
sighted MBB may, as Berkeley claims, lack the ability to judge spatial distance
or depth by sight. So the MBB may not actually be aware that the boundary
points of an experienced visual figure lie on a single plane in physical space.
To experience visual shape in Evans’s minimal way, it is only necessary to be
able to assign visual arrays appropriate egocentric direction. The perception
of visual shape requires encoding or representing the egocentric direction of
boundary points in the visual field. Such appreciation of direction in behav-
ioral space, he maintains, is constitutive of the very notion of having visual
shape experience.
Evans offers an analogy. Consider, he says, what it would mean to attribute
mastery of auditory concepts of spatial properties. The test would be whether
the person can employ experiences of sound to guide behavior. The person
need not demonstrate a full-blown understanding of physical space, but having
a certain body-centered egocentric appreciation of the immediate environment
is essential. For example, an auditory perception of direction must
manifest itself in the person’s knowing how to orient or point toward the
source of sounds. As a consequence, someone cannot be said to have mastered
auditory spatial direction unless he or she has a disposition to behave
appropriately with respect to egocentric physical space.
A Berkeleian Response
If the interpretation I have offered of Berkeley’s project and position is correct,
he would be unsympathetic to several aspects of Evans’s analysis. Berkeley
would agree with Evans about auditory spatial concepts. Sensory apprecia-
tion of physical or tangible properties, in general, is demonstrated by being
able to use experiences from a modality to guide one’s activities in the envi-
ronment. This is at the heart of Berkeley’s motor theory of perception, and au-
ditory concepts of space are no exception. For Berkeley, however, there is an
important difference between sight and hearing. Vision has its own exten-
sion and extensional properties; auditory experience has no inherent phe-
nomenal places. There is no audible extension. Audibilia are ordered with
respect to loudness and timbre, not location. Only visual and tangible experience
have sense-specific extensions, and Berkeley argues their extensions
are distinct and incommensurable. It is unlikely, therefore, that Evans’s audi-
tory analogy would move Berkeley.
Berkeley has another reason to oppose Evans’s solution to Molyneux’s
problem: Berkeley offers several arguments intended to show that there are
no necessary connections between sight and touch. If Evans’s notion of
“conceptual connection” is tantamount to there being such a link, Berkeley
believes he has empirical and theoretical grounds for denying the claim.
Berkeley would not be impressed that Evans and others have strong intu-
itions about conceptual necessities. From the start, Berkeley saw a need to un-
dermine these entrenched convictions about the nature of spatial perception
and spatial properties.
In addition, Evans is unclear with regard to what appropriate behavioral
know-how is needed to meet his egocentric behavioral criterion for attribut-
ing visual figure. If Evans requires that this behavior correspond to the actual
physical environment, the criterion looks too strong. Consider, for example,
the following proposal for coping with the inverted image problem Kepler ex-
posed. The initial visual experience of infants or the MBB has things looking
upside down, and spatial behavior is ill-suited to the environment. Subse-
quent experience establishes visual/tangible correlations that provide the
wherewithal both to invert the way things look and navigate space success-
fully. The visual field has structure from the start—visual legs on the ground,
visual head skyward; nonetheless, initially behavioral responses will be mis-
guided. Studies of people wearing glasses that invert the visual image on the
retina do indicate that something like this is what happens when they are
first put on.
In contrast, behavior may be appropriate to egocentric space, although
visual experience does not jibe with the physical layout. For example, have
someone move her hand up and down the edge of a door. While doing this,
have her don glasses that curve the image on the retina (a straight line proj-
ects a C shape on the retina). Often a subject can continue to move her hand
according to instructions, keeping in touch with the straight door edge, yet
she will report that the door edge looks visually curved. Moreover, sight tends
to dominate touch, and subjects report that their hand tangibly feels like it is
moving along a curved path.
An examination of the literature on perceptual adaptation reveals a host
of fascinating phenomena that are hard to describe, let alone explain. Might
such mismatches between behavior and visual phenomena cause problems
for Evans’s conceptual connection claim? I am not sure. Evans is aware of such
psychological studies of perceptual adaptation and the empirical and theoretical
puzzles they raise.19 Evans acknowledges, too, that the issues need more
study. Lacking a fuller statement of Evans’s position on adaptation, I am re-
luctant to push the argument further.
Finally, it is worth noting that Evans’s solution to the Molyneux problem
does not dispute Berkeley’s claims about seeing distance and size. Experiencing
figure in Evans’s sense does not require getting these spatial properties right.
So questions arise whether Evans’s account of figure can be applied to other
aspects of spatial perception and to the other MBB cases Berkeley discusses.
Why is it, though, that Berkeley is so willing to believe that the MBB does
experience figured visual arrays? I think the answer lies in Berkeley’s accept-
ance of a version of the constancy hypothesis. Everyday experience and sci-
entific study seem to reveal that there is a proportionality between features of
the retinal image and features of visual experience. Give or take a little, if the
tangible image projected on the retina is straight, the visual array experienced
is a straight line in visual extension. If the retinal image is curved, the visual
array shape changes accordingly. These properties of sensations, Berkeley as-
sumes, are fixed by the sensory system. So if the MBB’s visual system is at the
start in normal working order, Berkeley does not feel it necessary to defend
the claim that the MBB can immediately experience a figured phenomenal
visual field.
My Picture
I also think Evans’s reformulated version of the Molyneux question does not
capture what is primarily at stake for Berkeley. Recall, in my interpretation,
Berkeley can and should accommodate the possibility that the MBB, prior
to being tested, may have generic figure ideas that apply to sight and touch.
In principle, then, the MBB on first gaining sight might be able to apply shape
terms to arrays in both modalities. Nevertheless, figures in the two senses are
experienced as phenomenally distinct sensory ideas. Conflation of visual
experience with tangible experience often misleads. It is very easy to fall into
the trap of taking the comparison of two visual experiences for a comparison
between a visual and tangible experience. We fail to distinguish properly the
visual experience of tangible movement from the tangible sensory content
of the movement itself. For example, we observe someone, perhaps ourselves,
running a hand around the perimeter of a dinner plate. We notice the path
the hand takes is circular and conclude that the tangible and visible experi-
ences are qualitatively alike. But this is a conflation. We are not actually com-
paring visual experience to tangible experience. We are comparing visual
experience of a circular object with the visual experience of a hand tracing the
object’s perimeter.
Should the MBB possess generic ideas of phenomenal shape, as argued
above that he may, his passing the Molyneux test cannot be ruled out with
certainty. His judgments, though, will depend on considerations of fitness,
not resemblance or necessary connections. “Square,” “circle,” and other ideas
of figure can be given generic definitions that make them conceptually appli-
cable to sight and touch. If the MBB pays attention to these abstract ideas,
they can influence his psychological intuitions of fitness. Two arrays that fall
under the same label may seem more suited to one another than arrays that
do not. Berkeley’s qualified answer to Molyneux in NTV suggests that he is
aware that the MBB may have this sort of conceptual help available.
The Berkeleian picture I have sketched holds that visual extension and tan-
gible extension are incommensurable. Both are ordered in accordance with
a phenomenal “nextness” relation—Berkeley calls it “adjacency.” These or-
derings specify which sensible points are adjacent to which others. Nextness,
here, is not to be understood univocally as “next to” in physical space. Next-
ness is a generic concept holding between phenomena that are adjacent in a
sensory order. Nextness relations can be used to determine distance measures
within each sense realm. As Berkeley pointedly says [112], “those things only
are compared together in respect to distance which exist after the same man-
ner, or appertain unto the same sense. For, by distance between any two
points nothing more is meant than the number of intermediate points: If the
given points are visible the distance between them is marked by interjacent
visible points: If they are tangible, the distance between them is a line con-
sisting of tangible points.”
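Berkeley’s definition in [112] can be read almost as a counting rule. The toy model below is a hedged sketch, not anything found in the NTV: it assumes that points can be indexed along a single line of adjacent points within a sense, and the names SensePoint and distance are hypothetical. Distance within one sense is the number of intermediate points; across senses the function declines to return any value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensePoint:
    sense: str   # "visible" or "tangible" (labels assumed for illustration)
    index: int   # position along one line of adjacent points (assumed)


def distance(a: SensePoint, b: SensePoint) -> int:
    """Number of intermediate points between a and b within one sense."""
    if a.sense != b.sense:
        # No count of visible or of tangible points could serve here.
        raise ValueError("no distance is defined between points of different senses")
    # Adjacent points, or a point and itself, have no points between them.
    return max(abs(a.index - b.index) - 1, 0)


# Points 2 and 7 on a visible line have points 3, 4, 5, and 6 between them.
visible_gap = distance(SensePoint("visible", 2), SensePoint("visible", 7))
```

Nothing in the model says how many tangible points correspond to a given number of visible points; any such mapping would have to be added from outside, which is the role Berkeley assigns to learned correlation.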
Phenomenal adjacency relations in visual extension are independent of
nextness relations in tangible or physical space. Properties of arrays in visual
extension can be coordinated with properties of tangible extension, but the
arrays so linked remain heterogeneous. The two extensions are entirely sepa-
rate and their orderings are, in fact, characterized by different geometries. In
[112], Berkeley continues, “if they [the two points] are one tangible and the
other visible, the distance between them neither consists of points perceiv-
able by sight nor by touch, i.e. it is utterly inconceivable.”
Berkeley’s Reticence
I have indicated why Berkeley has reason to be somewhat reticent in his an-
swer to Molyneux. But why is Berkeley not similarly cautious in his other
MBB predictions? I think an explanation of the difference can be found in a
distinction between shape concepts and concepts of distance and magni-
tude, mentioned in chapter 4. Figure is a structural property. Distance and
magnitude, per se, are not. Structural properties of arrays, though, may aid in
cross-modal tasks.
Berkeley says that a visible square may be fitter than a visible circle to rep-
resent a tangible square. It is fitter, because the generic definitions of “square”
and “circle” apply to arrays in both domains, and relations among their parts
are structurally akin. Nevertheless, phenomenal square experiences of vision
(color and light) and phenomenal square experiences of touch (pressure)
neither resemble nor are necessarily connected. They are incommensurable.
Square visual arrays cannot be moved next to square tangible arrays
and compared for shape. We have no idea in either thought or imagina-
tion what it would be to experience a unified figure combining them both. It
is inconceivable.
On the other hand, distance and magnitude in visual and tangible arrays
are not structural properties. The “one point argument” [2] entails that a dis-
tance in visual extension can be a reflection of any distance in physical space.
Similarly, there is no fixed correlation between visual and tangible magni-
tudes. An inch-long object can be experienced as a single minimum visible or
as occupying the whole visual field. This is the problem faced in going from
the flux of sensations to stable perception. Absolute size measures in the vi-
sual array do not support or favor any judgment of physical or tangible mag-
nitude and vice versa.
The situation is different if the task involves relative size estimates. Al-
though arrays of MV and arrays of MT are incommensurable, relational con-
siderations may favor certain cross-modal associations. Confronted with a
pair of objects differing in physical size, it is fitter (psychologically simpler) to
have the tangibly bigger array represented by the larger of two visual arrays.
An appreciation of this fitness can influence the MBB’s decision. The MBB’s
judgment, of course, is not certain. There is no qualitative resemblance or
necessary connection to ensure or underwrite his decision.
Herein, I think, lies the reason Berkeley is more guarded in the case of fig-
ure than he is with other features of spatial perception. When discussing dis-
tance and magnitude in NTV, Berkeley is not concerned with comparative
judgments, where relational facts may influence judgments of fitness. In the
case of shape, structural considerations cannot be set aside. Armed with a
generic concept of shape, the MBB might intellectually come to appreciate
that the visible square and the tangible square are structurally similar. This
may bias the MBB’s answer to the Molyneux question in a manner that does
not apply to MBB thought experiments that do not depend on internal rela-
tional properties.
Still, all claims that these structural relations can help with cross-modal
tasks depend on the assumption that the physical items presented are at the
same distance and slant from the perceiver. Altering the distance or spatial
orientation of a physical object will affect its magnitude and figure in visual
extension. Depending on the angle of regard, the visual array of a physical
circle may be elliptical or even a straight line. A tangible square may appear as
a range of visual polygons, as well as a straight line array. And if removed far
enough away a circle or square may trigger no visual experience or visual
experiences that are phenomenally indistinguishable, say two or three MV
each. In discussions of the Molyneux problem it is usually assumed that the
circle and square are both on the same fronto-parallel plane and reasonably
close to the subject.20 Any advantage “fitness” considerations offer depends
on making such assumptions about the location and orientation of the phys-
ical objects being observed. Clearly, there are no conceptual connections that
can a priori assure the MBB of these facts about the environmental layout.
Conclusion
The goal of this paper has been to explicate Berkeley’s views, not defend
them. I do not deny that his heterogeneity doctrine faces difficulties. Set-
ting Berkeley’s work in the context of both historical and contemporary is-
sues in the theory of vision can shed light on points of contention found in
commentaries on his position. What I hope to do in subsequent work is
show how the interpretations presented in chapter 4 and elaborated here
comport with Berkeley’s Idealism and related epistemological and metaphys-
ical theses.21
Notes
* Unless otherwise noted the numbers in brackets are to the sections in Berkeley’s New
Theory of Vision.
1. E. Mach, The Analysis of Sensations. New York: Dover, 1959, pp. 135–7.
2. J. S. Mill, “Bailey on Berkeley’s Theory of Vision” in Dissertations and Discussions,
Vol. 2. New York: Haskell House, 1973, pp. 84–119. The question of innateness is more
closely related to another major issue associated with the Molyneux problem, namely
finding the source of the idea of “space.” The main options were said to be the Ratio-
nalist position that the idea is innate, the Empiricist account that it is a construct of
experience, and the Kantian view that space is an imposition of mind on the form of ex-
perience. See E. Cassirer, The Philosophy of the Enlightenment. Princeton: Princeton Uni-
versity Press, 1962, pp. 108–15.
3. M. Atherton points out (Berkeley’s Revolution in Vision, Ithaca: Cornell University
Press, 1990) that this is a position that Malebranche, for example, adopted.
4. Later I will offer an explanation of Berkeley’s reticence in response to Molyneux.
5. See, for example, J. Mueller’s classic statement of the position (excerpted in Percep-
tion). Mueller takes it for granted that the defining qualities of vision are sensations of
color, light, and darkness, although he also maintains that extension is perceivable by
all the senses.
6. Distance here is not physical distance. Distance in visual extension is a measure of
the number of visual places between two points in the visual field. (See chapter 3 of this
book and Berkeley’s [112]).
7. I use the term “figure/surround,” not the more common “figure/ground,” in order to
avoid the concerns about three-dimensionality the latter raises. Although consideration
of figure/ground issues does play a role in many accounts of the Molyneux problem,
I do not think it crucial to understanding Berkeley’s own views about the MBB. For fur-
ther discussion of the role of conceptualization in early discussions of these matters, see
M. Bolton, “The Real Molyneux Question and the Basis of Locke’s Answer,” in Locke’s
Philosophy. G. A. J. Rogers (ed.). Oxford: Oxford University Press, 1994, pp. 75–99.
8. One might say they are cognitively impenetrable.
9. See G. Hatfield and W. Epstein, “The Sensory Core and the Medieval Foundations
of Early Modern Perceptual Theory.” Isis 70, 1979, pp. 363–84 and R. Schwartz, Vision,
Oxford: Blackwell, 1994.
10. These assumptions were eventually challenged by Gestalt psychologists and then
J. J. Gibson.
11. Opponents’ responses to empirical findings confirming Berkeley’s negative
predictions are instructive. Critics tend to accept the confirming evidence, but dismiss its
significance. They argue that on gaining sight the MBB’s visual system is defective, and
these defects prevent the MBB’s initial visual experience from having the phenomenal
structure it would otherwise have. So they assert that negative MBB test results are not
relevant to questions about the similarity or spatial commonality of ordinary visual
and tangible experiences.
12. R. Lotze’s theory of local signs is often read to be an account of the process by which
an ordering is acquired through experience.
13. See Atherton, op. cit. For a critique of Berkeley’s position, see L. Falkenstein, “Reid’s
Critique of Berkeley’s Position on the Inverted Image.” Reid Studies 4, 2000, pp. 35–51.
14. Again, questions of innateness and necessary connections are run together. Test-
ing Berkeley’s claims about orientation and learning was a major spur for experimen-
tation with lenses that invert or distort the image.
15. Notice, too, that were visual extension to have no appreciable order, there would be
nothing special about the inversion of the retinal image. An un-inverted retinal image,
like images with other orientations on the retina, would pose the same problem.
16. This concern arises again in the next section.
17. G. Evans, “Molyneux’s Question” in Collected Papers. Oxford: Oxford University
Press, 1985, pp. 364–399.
18. Evans, correctly I think, denies that Berkeley’s negative answer to the Molyneux
question depends crucially on the fact that the original task involves distinguishing a
globe from a cube, rather than a circle from a square. In his analysis, Evans sticks to two-
dimensional shapes.
19. He cites I. Rock’s The Nature of Perceptual Adaptation. New York: Basic Books, 1966,
which provides a penetrating analysis of these issues.
20. The situation is somewhat different with a sphere and cube, since a sphere will
project the same visual array from all orientations. This difference plays a role in vari-
ous accounts of the Molyneux problem, but I do not think it is a major consideration
of Berkeley’s.
21. This requires a treatment of issues removed from those of specific concern to the-
ories of vision.
II Inference
Prescript 6
Chapter 6 surveys the results of a study of perceptual inference; a more detailed
analysis is found in VVTB. Examination of the historical development
of the issue reveals that there is no common understanding of what consti-
tutes “inferential processes,” and hence what makes a visual theory an infer-
ence model. There are not only a wide number of conceptions bandied about,
but they are often in conflict. Even when the conceptions are not strictly in-
compatible, they speak to a diverse range of concerns, relying on diverse cri-
teria to distinguish inferential and non-inferential theories. The addendum
is from VVTB. It provides further exploration of criterion 4, the notion of
inference perhaps most prominent in recent discussions.
Setting the ongoing controversy in the context of its history in vision the-
ory helps explain how the issues have become so unclear and entangled. But
further clarity and untangling of positions alone will not put the debate
on firm footing. Present disputes often depend on a mix of old ideas and as-
sumptions that, though reasonable in their time, are no longer tenable. As a
result, the discussion can degenerate to a point where nothing much of sig-
nificance is at stake. For example, it has been argued that the claim that vision
is direct and non-inferential should be rejected for a number of different reasons:
(1) Light is the retinal stimulus for the perception of objects, not the objects
themselves. Thus, the perception of objects and their properties must be
derived from properties of light. (2) In the case of reversible figures (such as the
Necker cube or the duck/rabbit picture), the same stimulus gives rise to two
distinct percepts. Such percept differences, therefore, cannot be explained in
terms of properties of the light or the world. (3) In experiencing apparent
movement or subjective contours, the movement or contours experienced
are nowhere to be found in the stimulus. (4) Hallucinations or visions trig-
gered by direct stimulation of the brain can be indistinguishable from those
due to objects in the environment. So ordinary perception has content that
goes beyond what can possibly be given visually. Now, it is difficult to con-
ceive of Gibson being unaware of these phenomena or finding them serious
challenges. On the other hand, if the thesis of direct perception can founder
on these shoals, its empirical and theoretical substance should be put in seri-
ous doubt. But then the view that perception is indirect or inferential loses
much of its significance as well.
6 The Role of Inference in Vision*
The question whether perception depends on inference is a very old one that
simply will not go away. I think that a major reason for the persistence of this
controversy lies in the fact that the notion of inference has so evolved in the
study of vision that there is no single idea or empirical position associated
with the claim that perception is inferential in nature. I cannot, today, review
the tangled history that has led us to this stage; rather, I would like to sketch
out five broad theses that have come to be equated with the claim that per-
ception depends on inference.
The alternatives that I have in mind are the following:
1. Sensation/Perception—There are two kinds of visual states, sensations
and perceptions, and perceptions are derived from sensations. On this account,
what we are initially aware of in vision is a sensory core that maps
rather directly the spatial and light properties of the stimulus. Our percep-
tions of objective distance, shape, size, etc. are based on this prior sensory
state. This model makes two different empirical claims: first, that we have or
experience sensations as well as perceptions, and second, that sensations
cause, or lead to, perceptions. These claims are distinct. It is possible to accept
the existence claim, that there are these distinguishable visual states, but
deny that our perceptions depend on sensations. Gibson, for example, took a
line somewhat like this in his first book The Perception of the Visual World. *[In
this book (Boston: Houghton Mifflin, 1950), Gibson distinguishes the visual
world from the visual field, the latter his stand-in for sensations. In later work
he drops appeal to the notion of a visual field. In so doing, he solidifies his
own position. At the same time, this leaves him and his followers open to the
criticism that they have abandoned consideration of the actual phenomenal
experience of vision.]
2. Learning—Perception depends on learning. The main point of the analogy
between processes of vision and those involving more ordinary cases of
inference is the idea that they both are based on inductively established
habits. For example, the fact that past instances of As have been experienced
or found to be Bs, leads us to assume that a new instance of A is also B. A visual
phenomenon is said to be inferential if it depends upon memory traces laid
down by previous experience. If we were physiologically endowed or in-
nately wired so that stimuli of a certain sort resulted in our seeing things the
way we do, without the input from past experience, there would be no reason
for thinking of such cases as analogous to our more standard cases of intel-
lectual or verbal inference. *[For example, H. Helmholtz, oft cited as the
founder of inference models, stresses the centrality of learning.]
On this account, the question of perceptual inference is intimately linked
to the debate over Empiricism versus Nativism. Although the claims have
often been linked, this learning hypothesis is separate from the sensation/
perception criterion of inference considered above. It is possible for some
phenomenon to depend on learning but not to involve two distinct visual
states, and it is possible for some phenomenon to be innately fixed and yet be
a two-stage process.
3. Impoverished Stimulus—The stimulus or the information contained in
the stimulus is not rich enough to account for the perception. Vision requires
inference whenever it must elaborate on an impoverished input. Cases where
the stimulus is not lacking in this way are not inferential, since we can per-
ceive the layout veridically simply by attending to the information contained
in the stimulus.
To maintain this sort of visual inference claim it is necessary to establish
that the stimulus for the perception is impoverished. This requires, however,
that we have a reasonably clear understanding of what it means for a stimulus
to be impoverished. But this is a problematic notion. On the one hand, on
just about all definitions of the notion of a stimulus, the stimulus is not iden-
tical with the visual experience or judgement it gives rise to. Hence, it is not
sufficient, in and of itself, to account for the perception. On the other hand,
given the state of the organism at the time, the stimulus is sufficient to cause the
perception to occur, and, in that causal sense, it is adequate to account for the
perception. Without some interpretation of “impoverishment” that lies in
between these extremes the impoverished stimulus version of inference is
either trivially true or trivially false. Unfortunately, there is no single mid-
ground construal of the notion that is widely accepted. Nor do most of them
readily match up with the versions of inference cited in either criterion 1 or
criterion 2.
4. Mental or Psychological Operations—Perception involves processes that
are distinctively mental or psychological in nature. Certain visual processes
are to be distinguished from the types of operations involved in the doings of
other organs, such as our heart, lungs, or liver. These visual processes are to be
described in terms of the interplay of ideas or mental states, whereas the lat-
ter cannot be appropriately characterised using intentional notions. They are
to be explained physiologically, not psychologically.
The problem facing this criterion is that there is not an agreed upon inter-
pretation as to what makes a state, or operation on that state, distinctively
mental or psychological. In early works on vision the notion of the mental or
psychological was usually explicated in terms of the manipulation of conscious
ideas or in terms of learning, i.e. criterion 4 amounted to a version of
either criterion 1 or 2. More recently, with the rise of cognitive psychology and
computer models of cognition, the notion of the mental has widened so as
not to depend essentially on consciousness or learning. At the same time, it
has become less clear just what is to serve as a mark of the mental and where, if
anywhere, it may be possible to draw a principled line between psychological
versus purely physical or organic states. In any case, each expansion of the
notion of the mental or psychological underwrites a wider or different class
of operations that are inferential under criterion 4. It is not apparent, more-
over, that these alternative construals of cognitive states capture what it was
that originally made many proponents of inference models claim that visual
operations were analogous to what goes on in ordinary cases of inference.
5. Epistemological Approaches—Of the things we find out about by vision,
only some of them are really seen in an epistemologically pure sense. There are
significant limits, then, to what we can really (simply, directly, immediately)
see. All the rest of what perception tells us about the world must be inferred.
But what are the limits on what can be “really” seen?
According to some theorists “real” seeing is restricted to an appreciation of
our subjective experiences or to sense data. Nothing we find out about the
external world is a matter of simply seeing, since, in principle, we can always
be deceived. For others, all that we can really see is color and light. For others
still, it is maintained that we can really see things in the environment, but
the class of items said to be seeable in this way differs widely on the various
accounts. *[See chapters 8 and 15.]
I believe this brief review of some of the competing interpretations of the
notion of visual inference can help explain why the problem of visual inference
has seemed so resistant to a solution. The reason is that the claim itself is
multiply ambiguous, as well as often relying on distinctions that are vague
or lacking in specific empirical content. Thus evidence and arguments that
might count in favour of one version might count against another, and be
totally irrelevant to still others.
How then are we to deal with the problem of visual inference? My sugges-
tion is that we abandon it. There is nothing to be gained in attempting to
answer the question. Instead of trying to resolve the problem, we would be
better off refraining entirely from using the concept of inference in our theo-
ries of vision. For not only is the notion of inference ambiguous and unclear,
focusing on the question tends to distract us from the real empirical and the-
oretical problems that do face the study of vision. Even more perniciously,
perhaps, the controversy over inference often makes it seem as though there
are serious substantive issues at stake where there are none.
Many, I am sure, will find my suggestion to dissolve or let go of the issue
of visual inference most unsatisfactory. The reason, I think, is that there is a
widely held assumption, embedded in each of the versions of inference re-
viewed above, that has a powerful hold on people. This is the idea that some-
thing in vision must be given to our senses before the mind goes to work on
it. The given is the data or starting point. All else requires, in Jerome Bruner’s
words, that we “go beyond the given.” In turn, this distinction, between what
the mindless world thrusts upon us and what we intelligent beings add by in-
terpreting this evidence, is thought to have important implications for deep
philosophical and psychological doctrines about Mind and Reality, and
about whether we can know or be in direct contact with Reality.
Although this assumption of a distinction between the given and going be-
yond is pervasive, I believe it is not a distinction that can be made in the hard
and fast way needed to support any of these grand philosophical and psy-
chological doctrines. This lack of any firm basis for singling out as given one
particular stage in the chain of states that lead to perception is simply a reflec-
tion of our earlier problem of settling on a unique sense for the notion of
visual inference. For one way to look at the differences among the five criteria
outlined above is in terms of what each takes as given.
According to the sensation/perception criterion, what is given are sensa-
tions. On criterion 2, what is given are those visual phenomena that show no
influence of learning. On the third criterion, the given is identified with some
particular characterisation of the stimulus or the information contained in
the stimulus. On the mental operations criterion of inference, the given is the
first state in the process that is deemed to be psychological, as opposed to
being simply physical or physiological in nature. With the epistemological
criterion the given is what can be “really” seen. Each criterion, then, distin-
guishes between something given and that which goes beyond or is inferred.
The accounts differ over where to draw the line as to what counts as the data
to the visual system, but they each assume there is a unique line to be drawn.
I, however, see no principled way to make such a distinction, no way, that
is, to draw a principled distinction between what is given to us and what is our
contribution, a result of our supplementation. For the notion of our supplementation,
like the notion of the given, is neither firm nor fixed. Indeed, each
of the inference criteria we considered can be seen as spelling out a different
understanding of what constitutes our supplementation. On the first crite-
rion, there is supplementation when one idea triggers or otherwise leads to
another. On criterion two, supplementation occurs when the perceptual phe-
nomenon is the result of learning. With criterion three, supplementation is
what we provide over and above what is contained in the impoverished stim-
ulus. According to the fourth criterion, supplementation is a matter of opera-
tions on mental states or representations. Finally, the epistemological criterion
considers any perceptual judgement or experience to involve supplementa-
tion whenever it does not come up to the theorist’s particular standards of
epistemological purity.
The ideas of the given and supplementation march in tandem. What is given
is that which does not require our supplementation, and what is supple-
mented is that which we are not given. The problem is there is no one correct
way to draw these boundaries. In different contexts, for different purposes,
and to highlight different contrasts, it may be useful to settle on one inter-
pretation rather than some other. From the standpoint of the empirical study
of vision, however, we can make no general, non-arbitrary sense of the idea of
the input or the data of vision.
The question of visual inference resists dissolution, in part, because of the
lingering assumption that there must be some correct way to draw the line
between the world’s contribution and our own. Once the relativity of this
dichotomy, between the given and going beyond the given, is recognised, I
think it should be much easier to accept my earlier suggestion to abandon the
concept of visual inference. The notion of visual inference, in all its guises,
depends on a purported distinction between the data given, the premises,
and the perception or hypothesis achieved, the conclusion. The problem is
that these boundaries can with equal justification be drawn in a variety of different
places. There is, therefore, no one right way to distinguish the data from
the inferred. And if this distinction comes to be seen as optional, perhaps the
heated philosophical and psychological debates over whether perception is
direct or non-direct, Realist or anti-Realist will also lose their attraction. *[This
attitude is part and parcel of the pluralism urged especially in chapters 8, 11,
14, and 15.]
Mental or Psychological Operations
According to this interpretation, to claim that vision involves inference is to
claim that vision depends on distinctively mental or psychological operations
and that it is not due to (or solely characterizable in terms of) purely physical
or organic processes. It is assumed, on this account, that everyone more or
less agrees that the end-state, the visual phenomenon or judgment, is itself a
mental state. The further claim is that the processes that bring about this end
state are themselves psychological. The processes of vision are thus to be dis-
tinguished from the operations involved in the functioning of our hearts,
lungs, and kidneys. These latter processes may be as, or even more, complex
than those underlying vision, but they do not involve mentality. Unlike vision,
they are not to be characterized in terms of the manipulation of thoughts,
ideas, or other mental states with intentional content.1
The reasons why this fourth criterion has led to a proliferation of positions
on inference are not hard to find. First, there is no clear, agreed upon under-
standing of what makes an operation mental or psychological. Second, some
theorists who adopt this criterion take it to be both necessary and sufficient
for inference, while others see it as only a necessary condition. I will look at
each of these issues in turn.
What does it mean for vision to involve operations that are distinctively
mental? In early works on vision this notion was often cashed in either in
terms of the manipulation of conscious ideas (such as sensations leading to
perceptual states) or in terms of learning. In more recent times, especially
with the rise of cognitive psychology and the development of computers and
computer models of cognition, the push to identify the mental with con-
sciousness or learning has largely diminished. But willingness to widen the
concept of the “mental” has only led to further complications in character-
izing the notion of “visual inference.” For as vague as these earlier ideas
may have been, nothing as circumscribed as consciousness or learning has
emerged to take their place as marks of the mental.2 What is more, if inference
is equated with mental operations in general, rather than with some specific
type of mental processing, then each widening of the notion of the “mental”
automatically generates an additional construal of “visual inference.”
Less obvious, but perhaps more significant, once the notion of the “men-
tal” is freed from its anchor in consciousness and learning, the very sorts of
intuitions that originally led many theorists to equate inference with mental
operations tend to be undermined. For the important point that these theo-
rists wished to make (or reject) was that vision involved higher-level, thought-
like states and processes, or that vision was affected by past experiences and
memory traces in the very way in which thought was supposed to be in-
fluenced. Vision, that is, involved the mind and mind-like intentional or
experiential states. The problem is that the extended characterizations of psy-
chological processing that have grown out of work in cognitive and com-
puter science often do not match up readily with these older conceptions of
what mental participation is taken to involve.
The issue emerges clearly in Shimon Ullman’s influential paper “Against
Direct Perception.”3 In this paper Ullman argues that we should consider
perception direct or immediate (and hence not inferentially mediated) if the
processes that transform stimuli into percepts can only be elaborated or ex-
plained in physiological terms. “If the extraction of visual information can be
expounded in terms of psychologically meaningful processes and structures,
then it can not be considered immediate.”4 Now although he gives no precise
specification of what constitutes decomposition of an operation into psy-
chological, as opposed to physical, constructs (other than that the character-
ization uses concepts found in psychology, not physiology), he is clear that
his notion of “psychological” processing is to be distinguished from what he
takes to be the more traditional views of mental operations. These psycho-
logical states and processes, he says, need not be conscious or accessible to
introspection or affected by experience or memory traces. Ullman seems to
suggest that it is enough that the operations involve computations on states
that can reasonably be construed or interpreted as symbolic or representa-
tional in nature. In fact, an example he uses throughout the article, as a para-
digm case of a kind of processing that can be fruitfully decomposed and
understood in psychological terms, is that of the workings of a simple calcu-
lator. For, he maintains, “certain events and components within the calcula-
tor can consistently be interpreted as having their meaning in the domain of
numbers and operations on numbers.”5
But if this is all that is required for an operation to be non-direct, not only
does it match up poorly with older traditional notions of mind, but it is diffi-
cult to see how it has anything to say about what makes such operations dis-
tinctively mental or psychological. Few, for example, might be tempted to
credit the pocket calculator with a mind or human-like intentional states
merely on the grounds that its internal states may be symbolic or semanti-
cally evaluable. More important, it would appear likely that the mechanisms
underlying the functioning of the heart, kidneys, and liver could also be char-
acterized fruitfully in representational and computational terms. At some
level of abstraction, a description of the workings of the kidneys may talk of
representations of volume, pressure, and electrolyte concentrations, and of
computations over these values. So unless the notion of what constitutes a
“psychological decomposition” is more strictly delimited, the intuitions about
mental versus organic operations that often underlie appeals to this criterion
of inference play no role.6
One way to salvage something of the original intent of this criterion would
be to distinguish the “symbolic” doings of calculators, kidneys, and livers
from those symbolic transactions that, although also not conscious, introspectible,
learned, or dependent on public language and social practice, are,
nevertheless, not purely physical. The criterion of mental or psychological
operations could then be extended to include any processing that involved
these “subpersonal,” “subdoxastic,” quasi-representational states. I do not
wish to get embroiled here, however, in the voluminous debates over where
and how to draw the lines between these various grades of representational or
intentional involvement, lines which I doubt can be drawn in any sharp and
useful manner.7 What should be apparent is that consideration of these sorts
of issues only serves to complicate further and to proliferate construals of the
claim that vision involves inference.
The second broad problem with criterion 4 is that although some theorists
consider the dependence on mental operations as both necessary and suffi-
cient for inference, others require more. Berkeley, remember, argued that not
all processing of mental items should be thought of as inference. For him, in-
ference was to be distinguished from suggestion, the simple triggering of one
idea by another. Similarly, an important question remains for those other
proponents of criterion 4 who do not wish to equate an inference model with
any kind of mental operation whatsoever. The question is this: Assuming
one’s favorite construal of the notion “mental operation,” what additional
features must a visual process display for it to be not only a mental operation,
but, specifically, a case of inference?
As best I can tell, little attention has been given to answering this question
in an explicit, detailed manner. The gloss usually found in the literature is
that a certain visual process deserves to be thought of as inference because
it is like everyday standard cases of intellectual inference. This latter claim,
though, does not provide much in the way of clear and concrete guidelines
for distinguishing among visual operations. We distinguish deductive from
inductive inference, and apply the term “inductive inference” all over the
place, to drawing generalizations on the basis of instances, confirming gen-
eralizations already drawn, reaching conclusions about an individual item on
the basis of other similar instances, coming up with the “best explanation” in
light of the totality of our evidence, assigning probabilities to singular or gen-
eral statements using any of a wide variety of sampling and statistical tech-
niques—indeed, to any sort of reasoning that is not taken to be deductive or,
in its widest use, to any activity that leads to an empirically established non-
necessary belief. The claim, then, that some visual operation is importantly
like intellectual inference is vague and ambiguous.
There is, in addition, an ambiguity in the idea that a visual process resembles
the process of intellectual inference, even when one particular type of intellec-
tual inference is singled out for comparison. In saying that an operation is like
intellectual inference of a given type K, we can mean something psychologi-
cally weak; namely, that the rules or principles that characterize valid verbally
articulated inferences of kind K can be used at an interesting level of abstrac-
tion to specify what the visual system accomplishes or attempts to accomplish.
Or we can make a psychologically stronger claim and assert that operations
analogous to those that actually go on in our heads when we make inferences
of type K also take place in visual processing. An example may help clarify
the point. Suppose K is deductive intellectual inference. In describing such
mental activity, we normally distinguish between using the rules of logic to
characterize certain formal relations between premises and conclusion and
characterizing the actual steps and operations that transpire in the person’s
brain/mind when drawing deductive conclusions. Usually, in this case, we do
not assume that the steps in the formal derivation describe real-time stages in
the mental derivation. The rules of logic are not employed to make a strong
psychological claim about processing. In order to evaluate a claim that vision
is similar to type K inference, then, it is necessary to know whether it is a weak
or a strong comparison that is being made.
Criterion 4 thus offers no one simple interpretation of the claim that vision
involves inference. First, it awaits a principled account of what makes an op-
eration distinctively mental or psychological. Second, if merely being a men-
tal or psychological operation is not sufficient for a claim of inference, it
becomes necessary to be more specific about what kind of inference is being
held up as a model, and whether the claim is one of weak or strong psycho-
logical characterization, or something in between. Finally, if, as seems to be
the case in a lot of the literature, the claim is one of strong description, a state-
ment about actual processing, then it probably makes sense at this stage of
our understanding of cognitive activity to abandon the idea that what makes
it appropriate to call a visual operation inferential is that it resembles what
goes on in intellectual reasoning. For if we mean by this that the visual oper-
ations are significantly like the operations underlying these intellectual func-
tions, then evaluation of the claim will have to await our having reasonably
good theories about how these intellectual feats are performed. The problem
is that, at present, our understanding of the visual system is probably on a
firmer footing than our understanding of the mechanisms that mediate in-
tellectual reasoning.
Notes
This paper is based on a much larger work on perceptual inference. In order to fit within
the time allotted, I am going to have to skip many of the details and much of the sup-
porting arguments. What I present here are just the main themes of that longer work.
*[VVBT.]
1. Although I tend to use the terms “mental” and “psychological” interchangeably, the
concepts are not equivalent for all theorists.
2. Various of my subsequent points about the lack of fixity of the notion of “visual
inference” are related to the current discussion regarding consciousness and “the”
time and place of conscious events (see Daniel Dennett, Consciousness Explained. Little,
Brown: Boston, 1991). Tracing these connections would take us far afield from the
present study.
3. Shimon Ullman, “Against Direct Perception,” Behavioral and Brain Sciences, 3 (1980),
pp. 373–415.
4. Ibid., p. 374.
5. Ibid.; see around pp. 375 and 380.
6. Ibid., p. 380. Ullman’s suggestion (ibid., p. 374) that the distinction between what
can and cannot be decomposed may be “relative to the system under investigation”
and “expresses a point of view” about “one’s domain of interest” would seem to fit with
views I develop concerning the optionality of the inference/non-inference dichotomy.
7. See my article “The Problems of Representation,” Social Research, 51 (1984), pp. 1047–
64. The issue has become even more otiose with the development of connectionist
models of cognition and debates over whether these models appeal to “real” represen-
tations. See Paul Smolensky, “On the Proper Treatment of Connectionism,” Behavioral
and Brain Sciences, 11 (1988), pp. 1–74, and the subsequent criticisms, countermoves,
and counter-countermoves.
Prescript 7
Since ancient times occlusion has been considered a definitive indicator of
relative depth, and it is still cited as an important depth cue in papers and textbooks
on perception. It does not require elaborate theories of vision or optics
to appreciate that when one object blocks off another from view, the occlud-
ing surface is nearer to the observer than the part of the object being occluded.
Chapter 2 calls attention to overlooked features of Berkeley’s account of
size perception. Berkeley argues that it is misleading to think of size being de-
termined on the basis of distance estimates, because distance and size cues are
one and the same. In VVBT a case is made for an even tighter connection between
estimates of size and distance. Not only are the cues the same, measures
of size and distance themselves are inextricably entwined. Hence, the
perception of size and distance should not be treated independently as they
typically are.
Chapter 7 develops similar themes with respect to occlusion and depth
evaluations. It explores the implications of collapsing the assumed difference
between cue and conclusion, between the given (occlusion) and the taken
(relative depth). Consequences of this analysis for the proper conception
of related phenomena of visual supplementation are considered. Although
“cues” and “supplementation” are notions usually associated with inference
theories of perception, the issues raised here can be explored without being
drawn into suspect debates over visual inference.
7 Making Occlusion More Transparent*
Near objects may partially obscure far objects; the converse is never true. Hence the
mind seizes upon the interruption of one object at the boundaries of another as a criterion
of the relative distance of the two objects. The interrupted object is farther away.
The circumstances attending the discovery of this principle are lost in antiquity.
Boring (1942, p. 264)
Interposition—the cutting off of part of the view of one object by another—is an ex-
traordinarily potent cue to relative distance. The partially occluded object is always
seen as behind the nearer object.
Kaufman (1974, p. 230)
When one object partly occludes another, the occluding object is perceived as closer
and the occluded object as further.
Palmer (1999, p. 236)
If an opaque body intercepts a line of sight, it prevents light rays from any-
thing behind it reaching a viewer’s eyes. Given minimal assumptions about
light taking a straight path, it follows that any item so occluded must be far-
ther from the viewer than the interposed opaque body itself. Thus occlusion
(also referred to as interposition, superposition, or overlap) seems to carry im-
portant and unequivocal information about the spatial layout. Moreover, it
seems to provide this unambiguous depth information in any direction and
over any distance in which visual perception functions (Cutting and Vishton
1995). Whether near or far, straight ahead or off to a side, it is always the case
that if the occluding object (O) actually occludes an object (A) from a subject’s
(S) view, A is farther from S than O. It is not surprising, then, that occlusion
has long been taken to be a major cue for depth perception. Nor is it surpris-
ing that occlusion has been thought to be one of the artist’s most effective
tools for rendering depth pictorially. If you wish to indicate A is behind O, a
depiction that shows O occluding A cannot fail to make the point.
There is no gainsaying the geometry and optics of the situation. If O ob-
structs A from S’s view, the obstructing part of O must be nearer to S than the
obstructed surface of A. Still, I think the role and significance of occlusion as
a distance cue is more problematic than is often assumed.1 When considering
these issues, however, it is important to keep in mind the information occlusion
does not make available. Occlusion provides no metric or absolute distance
information. It is not a cue to the specific distance that either O or A is
from S. All it can indicate is relative distance. If O occludes A from S’s view, A
must be behind O. It does not indicate how far A is behind O. The occluded
A may be flush against the back of O or miles behind.2 These limits on the
information interposition makes available are well known, and are not my
interest here. My concerns have to do with: (a) should occlusion be charac-
terized as a cue, (b) can occlusion actually provide useful depth information,
and (c) what do answers to these questions imply about the nature of the
relationship between visual interpolation, or as it is sometimes called, “a-
modal completion,” and occlusion.
The Problem of Circularity
By definition O occludes A when an opaque O stands between the subject S
and the object A, so as to block the light rays from A reaching S. In order for
S to employ occlusion as a cue to depth, then, S must take heed and register the
fact that O is interposed between S and A. S, that is, has to make use of the in-
formation that O occludes A. But this cannot be right, since the judgment
that O is interposed between A and S itself constitutes an evaluation of the
depth relation. To perceive that O occludes A is to perceive that O is located in
front of A. Hence, occlusion is not serving as a cue to the relative depth of A
and O, but rather a judgment of occlusion is an evaluation of a depth relation
between O and A.
Similar difficulties affect the application of other characterizations of the
cue. For example, Cutting and Vishton (1995) say “Occlusion occurs when
one object hides or partially hides another from view,” while Levine and
Shefner (1991) talk of interposition in terms of one figure blocking another.
But to determine that O hides or blocks A from view requires or presupposes
a decision that O comes between S and A. Once again, a judgment of occlusion
is tantamount to an assignment of relative depth. Taking O to block or
hide A is to presume that A is behind O. So the circularity problem remains.
J. J. Gibson made these points some time ago. In his groundbreaking book,
The Perception of the Visual World, Gibson argued that “The covering of a far
object by a near one . . . cannot explain depth perception . . . since it presup-
poses the phenomenon which it seeks to explain—one object behind an-
other” (p. 137).3 Gibson maintained, therefore, that it is a mistake to think of
occlusion as providing relative depth information. Instead, he said, the fun-
damental question is “How do we see depth at a contour so that one side of
it appears near the other far?” (p. 137). He goes on to suggest as a principle that
“the more complete, continuous, or regular outline tends to be the one which
looks near” (p. 142). For Gibson the appreciation of occlusion—the percep-
tion of one object as obscuring or hiding another—is something that needs
to be explained. Occlusion is not a cue to distance and should not be assumed
to explain depth perception. Perceived occlusion is a description of how the
layout looks. Toward the end of this same book Gibson summarizes his views
on the matter. “The visual superposition or overlapping of surfaces . . . is an
important type of depth perception, not a cue for depth perception” (p. 228).4
In spite of Gibson’s insightful analysis, occlusion is still cited prominently
in texts and papers as a cue to depth, albeit with an occasional nod in the di-
rection of the issues raised above. A growing number of theorists, though,
have taken up Gibson’s challenge and have focused their research on his fun-
damental question. They have attempted to explain how edges, boundary
contours, and other information help determine which, if either, of two ob-
jects is perceived to occlude the other. (See, for example, Kellman and Shipley
1991 and for an update, Palmer 1999.) Nevertheless, in this work, as in Gib-
son’s, the idea seems to remain that occlusion borders do have at least a deriv-
ative role to play in determining depth relations. When considerations of
completeness, continuity, or regularity of outline determine that it is O that
occludes A, it also sets their relative depth ordering. As Boring says, “The in-
terrupted object is farther away.” But is this really so?
The Dilemma
If O actually occludes A from S’s view, O is nearer to S than A. This is not in
doubt. Less clear is what follows from this optical and geometrical observation.
Closer examination of several spatial layouts will highlight the issues.
Consider the most trivial case, where A is small enough and so located that O
occludes it completely. In this circumstance, there will be nothing of A for S
to see, and no O/A contour information for S to register and use in reaching
an occlusion judgment. So unless there is some other source of information
to indicate A’s presence, O’s occluding A will prevent S from seeing or being
visually aware of A. Total occlusion is obviously more a hindrance than an aid
to relative depth perception.
Next, consider an effect interposition may have when O occludes only part
of A, leaving the rest visible. As figure 7.1(a) shows, occlusion of A by O may
lead to A’s being perceived further from S than when it is not occluded. But as
figure 7.1(b) shows, occlusion may cause A to be seen nearer to S than before.
These bidirectional effects on the perception of A’s distance need not be considered
a problem, of course, since occlusion is only claimed to furnish ordinal
depth information. Phenomena like those figures 7.1(a) and (b) exhibit do
not challenge the idea that the occluding object itself is always nearer than
the object occluded.
Figure 7.1 [panels (a) and (b); labels O and A]
But is it true that the occluding object is always nearer than the object occluded?
The apparent a priori status of this claim trades on an ambiguity
in the specification of A. Take a standard case of occlusion: A and O share a
boundary, part of A is visible, and contour information makes it appear that
it is O that obstructs A. What does this tell us about the location of A with re-
spect to O? The answer is that it depends on which parts of A and O are being
considered. The depth information occlusion provides strictly applies only
to the part of A actually occluded, the part that is literally out of sight. The
most occlusion entails is that the part of A we cannot see is more distant than
the occluding part of O we can see. As for the location of the visible parts of A,
occlusion, in and of itself, is non-committal. Contour information indicating
that it is O that obstructs part of A is compatible with the observable sections
of A lying in front, in back or alongside of O.
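The optical and geometrical point can be checked with a small computation. What follows is only an illustrative sketch, not part of the analysis itself. It assumes a two-dimensional, top-down layout with S at the origin, O an opaque line segment, and A represented by a handful of sample points. The coordinates and the function names (occluding_point, nearest_distance, survey) are hypothetical, chosen solely for the illustration.

```python
import math

def cross(a, b):
    """Z-component of the 2-D cross product of vectors a and b."""
    return a[0] * b[1] - a[1] * b[0]

def occluding_point(p, o1, o2):
    """Return the point of the opaque segment o1-o2 that blocks the line of
    sight from the viewer S (at the origin) to p, or None if p is visible."""
    e = (o2[0] - o1[0], o2[1] - o1[1])
    denom = cross(p, e)
    if denom == 0:               # sight line parallel to O: treated as unblocked
        return None
    t = cross(o1, e) / denom     # where the crossing falls along S -> p
    u = cross(o1, p) / denom     # where the crossing falls along o1 -> o2
    if 0 < t < 1 and 0 <= u <= 1:
        return (t * p[0], t * p[1])
    return None

def nearest_distance(o1, o2):
    """Distance from S (the origin) to the nearest point of segment o1-o2.
    Assumes o1 and o2 are distinct points."""
    e = (o2[0] - o1[0], o2[1] - o1[1])
    s = -(o1[0] * e[0] + o1[1] * e[1]) / (e[0] ** 2 + e[1] ** 2)
    s = max(0.0, min(1.0, s))    # clamp the projection onto the segment
    return math.hypot(o1[0] + s * e[0], o1[1] + s * e[1])

def survey(points, o1, o2):
    """For each sample point of A: (point, is_occluded, distance from S)."""
    return [(p, occluding_point(p, o1, o2) is not None, math.hypot(*p))
            for p in points]

# Hypothetical layout: O lies two units in front of S; A starts behind O,
# extends past O's right-hand edge, and then comes forward toward S.
O1, O2 = (-1.0, 2.0), (1.0, 2.0)
A_POINTS = [(0.0, 3.0), (0.6, 3.0), (1.8, 3.0), (1.8, 2.0), (1.5, 1.0)]

ROWS = survey(A_POINTS, O1, O2)
O_NEAREST = nearest_distance(O1, O2)

for p, occluded, dist in ROWS:
    status = "occluded" if occluded else "visible"
    print(p, status, "distance from S:", round(dist, 2))
print("nearest point of O is at distance", round(O_NEAREST, 2))
```

Under these assumed coordinates, the two samples of A lying directly behind O come out occluded, and each is farther from S than the point of O that blocks it. The sample at (1.5, 1.0), by contrast, is visible and nearer to S than any point of O. Moving an occluded sample farther back along its line of sight leaves it occluded, which is the sense in which occlusion carries no metric information.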
The point, here, is not that occlusion furnishes only ordinal information.
It is rather that without other assumptions, a judgment of occlusion does not
entail relative depth information about the location of those parts of A that
are not specifically occluded from sight. It is simply not true that all of the in-
terrupted object must be farther away than the interrupter. The visible part of
A, including that immediately adjacent to the occluding border, may be at the
same distance or closer to S than O. Such cases are not mere exceptions to an
otherwise valid rule. They are to be found everywhere. In most natural envi-
ronments parts of occluded objects are nearer to observers than those doing
the occluding, and typically people have no trouble perceiving these arrange-
ments correctly.
We see, for example, our friend Corin leaning out of her car window. Her
head, arms, and upper torso are visible and are perceived as being closer to us
than the part of her body that is occluded by the car door. We notice, too, that
she is wearing an attractive necklace whose clasp and adjacent links are oc-
cluded from view. Nevertheless the visible remainder of the necklace is seen to
lie in front of the occluding edge of her neck. And the story can be readily extended.
As we sit across from Corin at lunch, she strikes many poses, assuming
a variety of depth relations to the table, the objects on it, and her chair. Some-
times the visible surfaces of the occluded objects (Corin, the furniture, table-
ware) are nearer than the occluding surfaces, sometimes further, and these
depth relations are readily perceived. After lunch, Corin poses for a photo in
front of a tree. Its occluded trunk is perceived behind her, yet many of the tree
branches are and are seen to be closer to us than Corin. Corin’s dog stands
nearby, his body obstructed by the tall grass. His head, though, peers forward
over the grass top, and it is perceived as nearer to us than the occluding grass.
Changes in an observer’s angle of regard with respect to O and A can also affect
relative depth and its perception.5 When S stands squarely in front of a painting
hung on a wall, both the section of the wall the painting occludes and the
observable sections of the wall on either side of the painting are further from
S than the painting. If S moves enough to one side, however, the wall on that
side may be closer to S than the painting, and can be veridically perceived as
such. Or consider a knife stuck in an opaque object. The tip of the knife is oc-
cluded by the embedding surface. Depending on S’s angle of regard, the vis-
ible knife handle may be and will usually be seen by S to be closer than the
occluding surface. More generally, surfaces of attachment provide constant
obvious examples where the visible parts of A are and are perceived to be
closer to S than O. Viewed from in front, Corin’s house occludes portions of
the ground immediately behind it. The ground surface lying immediately in
front of the house, nevertheless, is perceived as nearer than the occluding
edge of the house.
Whatever limitations edges or contour boundaries have in supplying
depth information about the visible part of A, it may seem safe to assume that
they provide definitive depth information about the part of A lying within the
occluding border. Obviously, this claim, too, must be tempered. The infor-
mation occlusion borders make available is entirely local to the boundary. The
most such contours entail is that if A continues on, A is behind O at that very
point of superposition. Beyond that, occlusion at an edge does not imply any-
thing about the location of the remaining parts of A, within the boundaries of
O. They may emerge, and may be seen to emerge, at any place through, above, or below
O. In summary, environmental layouts where the visible parts of A are
closer than O is to S are ubiquitous, and people tend to have no trouble seeing
the relative depth relations correctly. Alternatively stated, the visible part of
the incomplete, non-continuous, irregular outlined A is often closer to S than
the complete, continuous and regular outlined O and will be so seen.
Responses
I have offered a perhaps overly large number and variety of instances of S, O,
and A arrangements in order to stress how common the situation is, and how
pervasive is the problem it poses for blanket claims about completeness or oc-
clusion information being a source of relative depth judgments. For it can be
quite tempting to think that layouts in which visible parts of A are nearer to S
than O are aberrations or exceptions to an otherwise sound principle. But as
the cases cited highlight, there is nothing very special or peculiar about such
layouts, and the relevant optical and geometrical considerations are straight-
forward. In turn, solutions to the problem that focus on a single kind of case
will be hard to generalize and do not seem to get to the heart of the issue.
A related reason for my plethora of trivial examples is that vision scientists
currently are inclined to think of spatial cues in probabilistic terms. Cues are
not all or none indicators. Their influence on perception is a function of the
probabilities of co-occurrence in the environments most usually encoun-
tered. On this account, as long as layouts where the visible parts of A are
closer to S than O are of low probability, there will be no difficulty accommo-
dating them. The accepted understanding of occlusion information might,
then, require a little statistical tweaking, but it would not be in need of radi-
cal rethinking.
Given the pervasiveness of layouts where parts of A are nearer than O is to
S, I doubt appeals to probability can be the answer. I am, however, not in a po-
sition to prove my case with data based on a representative sampling of envi-
ronments. Moreover, it is not at all clear how the relevant probabilities are to
be characterized and computed, unless the space over which they are defined
is severely constrained. One might, for instance, restrict the role of occlusion
information to cases where O and A are both far off in the distance and are
both on a fronto-parallel plane. These restrictions will make a probability
analysis more tractable. Unfortunately, they will also severely narrow the use-
ful scope of occlusion information, and they are likely to render any available
information redundant.
An alternative way to restrict the damage the counterexamples pose would
be to appeal to the notion of an object. Occlusion information plays a role,
but only within the confines of single objects. For example, layouts involving
the ground, ceiling, walls, and other surfaces of attachment, are not relevant,
because these surfaces are not “objects.” This approach, of course, presup-
poses an acceptable, relevant notion of “object,” and I am not sanguine that
there is one. (See chapter 12.) In addition, the approach cannot go very far in
eliminating counterexamples. If, say, the ground is not an object, a broom
handle lying on the floor and sticking out from under a couch can serve to
make the same point. Moreover, only a very ad hoc conception of an object
could rule out cases involving necklaces, tables, trees, dogs, knives, and the
like as discussed above.
Another idea a reader floated is to claim that occlusion can and does pro-
vide useful depth information, but only when (1) A is at a significant distance
behind O at the occluding border and (2) the information occlusion affords is
limited to those visible parts of A not far from that border. Now the optics and
geometry of (1) and (2) do ensure that this claim is correct or at least proba-
bilistically correct—cases of transparency, discontinuous objects, and non-
generic alignments are the exceptions. The problem with solutions such as
this is that they involve a circularity similar to the one Gibson warned of.
What evidence can S have for assuming that A is a significant distance be-
hind O at the occluding border? By definition, the occluded part of A is out of
sight. So it cannot be a source of information that A is far behind O at this
point. It is the visible part of A that must play the role. To serve its purpose,
S has to determine visible A’s depth with respect to O. But then S will have
already discerned the depth relations in question (the relative depth of O
and the expanse of A that can be seen) independent of information gleaned
from occlusion.
Analysis
The reason interposition effects on depth perception are varied need not be a
mystery. Placing O in a position to occlude A has a range of consequences. It
alters the availability and interpretation of information coming from other
stimulus variables (for example, height in field, texture gradients, attachments,
slant indicators, etc.) that are relevant to perceiving depth. (See figure 7.1.) In
turn, the effects occlusion has on the perception of spatial relations will
neither be uniform nor unidirectional.
The physical occlusion of parts of one object by another is to be found
everywhere we look. Indeed, every three-dimensional opaque object hides all
but its own facing surface. Therefore, the visual system constantly interpo-
lates, a-modally completes, rounds out, and fills in its visual world. Contour
boundary information is one significant goad or stimulus for such supple-
mentation. It is misleading, however, to think that simply distinguishing the
occluder from the occluded provides a unidirectional indicator or source of
information about the relative depth relations of their visible parts.
It goes without saying that visually supplemented content must be placed
or situated somewhere. When contour information prods the visual system
to supplement the scene, the relative depth of the a-modally completed part
of A to the occluding O cannot be left undetermined. It comes along for the
ride. Perceptual construction must assign it a location. Hence, it is tautologi-
cal that a supplemented occluded item is perceived to be behind its “occlud-
ing” O. Where else could it be?
Supplementation, though, can have the opposite effect on perceived depth
relations. When what is supplemented is seen and opaque, it will be an oc-
cluder and not the occluded. For instance, in cases of apparent motion or sub-
jective contours (figure 7.2), the seen interpolated perceptual content often
does the occluding. That a supplemented visible surface is not itself occluded
goes without saying. This claim, too, is tautological.
Considerations such as these would seem to indicate that it is better to think
of edges and contour boundaries as stimuli for supplementation rather than as
providing independent information for judging depth. The depth relations,
after all, are of necessity determined by the nature of the supplementation. In
suggesting this coupling of depth, supplementation, and occlusion, I do not
wish to suggest that there is a causal order among them or that they are separate
phenomena. The phenomena are two sides of the same coin. Figure 7.3 pro-
vides an illustration of what I have in mind. If line (a) is perceived as lying on
the frontmost plane, it occludes (b) and (b) is a-modally completed at the point
of intersection. If the perception switches and line (b) is seen on the frontmost
plane, it occludes (a) and (a) is supplemented at the place where they cross.
These perceptual reversals, though, each occur as a package deal. When the perceived
depth relations change, so do the experiences of supplementation and
occlusion. Or one might equally hold, when occlusion and supplementation
relations change, the depth relations perforce change with them.6 As Gibson
says, “The visual superposition or overlapping of surfaces . . . is an important
type of depth perception, not a cue for depth perception” (p. 228).
Figure 7.2
Figure 7.3
Conclusion
Much remains to be said about occlusion, supplementation, and depth and
the way they should be incorporated within a comprehensive theory of spatial
perception. Any attempt to do so would take us far beyond the goal of this
paper. I believe that the analysis offered above, though, does make a case for
the following:
1. As Gibson argues, it is circular to consider occlusion a cue to relative depth.
2. The optical occlusion of A by O affects a variety of stimulus features that
interact and can alter perceived depth. The resulting depth order effects will
thus vary from one layout to another.
3. Models of cue integration or cue weighting must not assume the effects
will always have the same directional valence.
4. Visual supplementation takes many forms. Sometimes it adds an occluded
surface, and sometimes fills in with an occluding surface. Either way, depth
relations are settled in the process.
5. Work on size perception (see chapter 2 of this book and VVBT) suggests
that perception of size and distance comes as a package and that it is a mistake
to claim that size perception depends on a prior or independent evaluation
of distance. It would seem that a similar approach may be called for in analyzing
depth perception, completion phenomena, and the mechanisms of
supplementation.
6. As in the case of size perception, the prevalent practice in experiments on
spatial perception is to use illustrations of simple two-dimensional figures,
appearing on fronto-parallel planes. The experiments are not run in real
environments, where geometrical and optical considerations are more
complicated. This practice, I think, obscures the conception and approach
to the problem and is one reason why depth relations between occlusion
borders and the visible parts of occluded objects have not been thoroughly
investigated.
Nowhere have I argued that edges, contour boundaries, and other information
resulting from optical occlusion have no role to play in depth perception.
The point is that their effects are complex and not unidirectional. As
Boring correctly remarked, the intuition that occlusion is a strong cue to
depth relations traces its history back to antiquity. Nevertheless, its empirical
and theoretical significance remain to be seen.
Notes
* I wish to thank James Cutting, Heiko Hecht, Larry Mahoney, and Tim Shipley for
comments.
1. The analysis in this paper is limited to occlusion in static scene perception. Related
issues concerning accretion or deletion phenomena that occur with movement are not
discussed. I believe the analysis does have implications for these dynamic cases, but it
would unduly complicate matters to deal with them here. Note, too, that motion-based
accretion and deletion, per se, have no part to play in picture perception (see readings in
section III).
2. For the use of ordinal information to derive more metric information see Shep-
ard (1980).
3. See also Ratoosh (1949) for an earlier indication of similar misgivings and Landy
et al. (1995) for more recent qualms.
4. Gibson sees his analysis of occlusion as part and parcel of his overall project of show-
ing that perception is direct. Gregory (1990), on the other hand, claims that occlusion
and related phenomena show that perception is indirect. My own view (see chapters 6
and 8) is that there is nothing much to be gained by entering into this controversy.
5. The problems here are quite similar to those explored in my account of size percep-
tion (Schwartz 1994) once slant is factored in.
6. Kellman and Shipley (1991) provide comparable examples in their demonstrations
of the interrelations of a-modal completion, subjective contours, and occlusion.
Nakayama et al.’s (1995) experiments on stereoscopic depth reversal and occlusion do
so as well.
References
Boring, E. G. (1942). Sensation and Perception in the History of Experimental Psychology.
New York: Appleton-Century-Crofts.
Cutting, J. and P. Vishton (1995). “Perceiving Layout and Knowing Distances: The Integration,
Relative Potency, and Contextual Use of Different Information about Depth.”
In Perception of Space and Motion, W. Epstein and S. Rogers (eds.). San Diego: Academic
Press, pp. 69–117.
Gibson, J. J. (1950). The Perception of the Visual World. Boston: Houghton Mifflin.
Gregory, R. (1990). “How Do We Interpret Images.” In Images and Understanding, H. Barlow,
C. Blakemore, and M. Weston-Smith (eds.). Cambridge: Cambridge University
Press, pp. 310–330.
Kaufman, L. (1974). Sight and Mind. New York: Oxford University Press.
Kellman, P. and T. Shipley (1991). “A Theory of Visual Interpolation in Object Perception.”
Cognitive Psychology 23, pp. 141–221.
Landy, M., L. Maloney, E. Johnston, and M. Young (1995). “Measurement and Modeling
of Depth Cue Combination: In Defense of Weak Fusion.” Vision Research 35,
pp. 389–412.
Levine, M. and J. Shefner (1991). Fundamentals of Sensation and Perception (second edition).
Pacific Grove: Brooks/Cole.
Nakayama, K., Z. J. He, and S. Shimojo (1995). “Visual Surface Representation.” In
S. M. Kosslyn and D. N. Osherson (eds.) Visual Cognition (second edition). Cambridge:
MIT Press, pp. 1–70.
Ratoosh, P. (1949). “On Interposition as a Cue for the Perception of Distance.” Proceedings
of the National Academy of Sciences 35, pp. 257–259.
Shepard, R. (1980). “Multidimensional Scaling, Tree-fitting, and Clustering.” Science
210, pp. 390–98.
Prescript 8
J. J. Gibson’s theory of direct perception sets the stage for most current dis-
cussions of perceptual inference. Gibsonians deny the need to appeal to in-
ferential processes in each of the guises spelled out in chapter 6. (For their
particular conception of the processes of learning, see J. J. Gibson and E. J.
Gibson, “Perceptual Learning: Differentiation or Enrichment,” Psychological
Review 62 (1955), pp. 32–41.)
James Cutting is especially sensitive to the ambiguities and unclarities with
the notion of “inference” encountered in the writings of both direct and in-
direct theorists. In a series of papers, Cutting tries to sharpen the terms of the
debate, in order to give it more empirical content. He proposes as well his own
model, one that he labels “directed perception.” Chapter 8 examines Cut-
ting’s analysis of the problem of inference and the contribution his directed
model can make to settle it. In spite of the interesting empirical and theoreti-
cal features of Cutting’s account, doubt remains that his proposal can give
substance to most ongoing disputes over perceptual inference.
8 Directed Perception
Background
Perhaps the most debated topic in the theory of vision has been and continues to
be the question whether perception is direct or indirect. Although the issue
has a long history in both the philosophical and psychological literature,
it took on new dimensions and significance with the pioneering work of
James J. Gibson. Beginning with his book, The Perception of the Visual World
(1950), Gibson argued that progress in the theory of vision had been and
was being hampered by an impoverished, atomistic conception of the
stimulus. The central problem of perception was taken to be that of ex-
plaining how we come to see the world on the basis of the limited infor-
mation contained in the point values of light striking the retina. Gibson
demonstrated that if this elementaristic view of the stimulus is abandoned
and attention paid to higher-order properties of the retinal image, espe-
cially ratios and invariants in the light array resulting from movement, the
information available for perception is greatly expanded. In turn, Gibson
maintained that this richness of information made it possible to see the envi-
ronment directly. Contrary to received opinion, there is no need for a subjec-
tive mental contribution by the perceiver to mediate and hence stand in the
way of our access to reality. We can simply see the objects and properties in
the environment.
Nowadays, Gibson’s ideas concerning the importance of higher-order
properties of the stimulus to the study of vision are not in doubt. What has re-
mained most controversial and most contentious is Gibson’s further claim
that an expansion and reconception of the available information shows that
perception is direct.
A third alternative
Recently another contender has entered into this debate. In an influential
book and a series of articles James Cutting has argued for a position he calls
“directed perception” (Cutting, 1986, 1991a, 1991b, 1993; Bruno & Cutting,
1988). Although inspired by Gibson’s theory of direct perception, Cutting dis-
tinguishes his own account both from Gibsonian and neo-Gibsonian models
and from those of their opponents who maintain that perception is indirect.
This paper focuses on an examination of Cutting’s theory of directed percep-
tion. Its ultimate goal, however, is more far-reaching. It is to challenge the very
status of the time-honored controversy Cutting’s model is meant to resolve.
Before proceeding, a word of caution about the propriety of labelling any
particular model of perception “Gibson’s” or “Gibsonian.” Gibson’s theory of
perception evolved considerably, and his characterizations of direct percep-
tion changed along with it. In addition, Gibson’s numerous discussions of
the direct/indirect distinction are not always clear, precise or consistent. It is
not surprising to find staunch proponents as well as critics of Gibson’s ideas
at odds over just what Gibson’s thesis amounts to. Cutting is well aware that
his own characterization of Gibsonian positions is not the only one possible.
His explication, however, is one that is widely cited and employed by neo-
Gibsonians and other parties to the dispute.
The mathematical/empirical findings Cutting relies on to support his
doctrine of directed perception are of two sorts. First, it is argued that geo-
metrical analyses show that for a range of visual phenomena the information
available in the stimulus not only adequately specifies what is perceived but
overspecifies it. Natural environments provide multiple sources of informa-
tion, each of which completely specifies the object, event or aspect of the
layout perceived. Second, it is maintained that empirical studies reveal that
perception often depends on a selection from or combination of these re-
dundant sources of information.
Suppose it is granted that both of these points are correct. What then fol-
lows about Cutting’s thesis that perception is directed? The answer, I am afraid,
is substantively little. For it can be shown that Cutting’s geometrical analyses
and experimental work are largely independent of the claims that: (i) such
results refute a doctrine of direct perception, and yet (ii) do not entail that
perception is indirect. Moreover, the fact that these claims are in this way in-
dependent of the empirical findings is quite consequential. It raises serious
questions not only about the interpretations of (i) and (ii) but about the actual
relevance of such theses to the study of perception.
Inference, premises and learning
In describing his own position Cutting allies himself with the Gibsonians,
rejecting the idea that perception involves a mental contribution and thus is
indirect. Cutting’s grounds for this initially seem stronger than Gibson’s. Gib-
son argued that there is no need for the perceiver to “go beyond the given” be-
cause there is sufficient information in the stimulus to specify the layout.
Cutting adds that, in many situations, there is not only sufficient informa-
tion, there is an overabundance of it.
As the continuing controversy indicates, these Gibsonian-inspired claims
that the stimulus is adequate for specifying the layout and that this adequacy
means that perception is not indirect have not proven compelling. Elsewhere
I have argued that such failures to settle the dispute are only to be expected,
since, as commonly conceived, the very distinction between direct and indi-
rect perception has no clear content or empirical import (Schwartz 1994).
Attempts to give the distinction real bite depend and founder on vague
intuitions about the nature of the mental or intentional, dubious assump-
tions about consciousness, and inadequately-motivated characterizations
of notions such as “the given,” “stimulus impoverishment,” “transducers,”
and the like.
Cutting is sensitive to many of these issues. He appreciates the need to for-
mulate the idea of “stimulus adequacy” in more precise terms (see next sec-
tion). And in sharp contrast to most writers, he recognizes that an appeal to
the notion of “inference” cannot by itself serve to separate direct from indi-
rect approaches. With little or no alteration, all of the competing theories,
his own included, can be (re)described as inference models (Cutting 1991a).1
Nonetheless, Cutting believes there is a significant difference between direct
or directed theories and indirect theories. A theory is indirect, he says, if it
holds that cognition plays a role in perception. For Cutting, though, the char-
acterization of perceptual tasks and accomplishments in inferential terms
does not show that they are cognitive. More is required. Cognition is impli-
cated only if the premises involved in the inference are “in the mind.”
But what does it mean for a premise to be “in the mind”? Traditionally, the
idea of something’s being “in the mind” was understood to mean accessible to
or in consciousness. With the breakdown of the identification of psychological
or mental states with conscious states in cognitive science such a reading
is no longer very useful for separating direct from indirect theories.2 Cutting,
in fact, describes the premises of indirect perception as being “hidden” in the
mind, indicating that he too does not require that cognitive states be intro-
spectively accessible. He stipulates, instead, that premises are to be consid-
ered in the mind if they are learned or established inductively. Premises are
not located in the mind, but in the brain or visual system, if they are hard-
wired as part of our biological endowment. Cutting provides yet another lo-
cation for premises that mathematically characterize relationships between
the layout and stimulus (such as that the cross ratio of a rigid configuration of
moving parallel lines is a constant). They are said to “hide” in the object or the
stimulus. These computational premises serve to underwrite an explanation
of the success of an inference—why a given sort of perceptual inference re-
sults in veridical perception—but they play no role in the actual processing.
(See Marr 1982.)
Debates over innateness have a long, convoluted past, and there is good
historical justification in Cutting’s linking claims of indirect perception with
theories that stress learning (Schwartz 1994). Helmholtz, for example, the
person usually cited by psychologists as the father of modern indirect theories
of perception, surely had this in mind. Helmholtz repeatedly characterized
his disagreement with his opponents as a dispute over the role of learning
in vision. And in response to criticisms of his use of the term “unconscious
inference” to describe his own model of perception, Helmholtz later pro-
posed “inductive conclusion” as a more perspicuous description (Helm-
holtz 1968, pp. 255ff).
Still, in the context of present work on vision, Cutting’s alliance of indirect
and learned is much less warranted. First, proponents of what are standardly
cited as mental processing models of perception often hold that the “cogni-
tive” structures they propose are biologically endowed. Indeed, strong na-
tivist assumptions have become one of the hallmarks of current cognitivist
theories. Second, Gibson and Gibsonians allow and in many cases actually
maintain that direct perception depends on learning. It may require experi-
ence in order to come to appreciate the rich information that is available in
the stimulus. Finally, and most significantly, Cutting’s claim that perception
makes use of multiple sources of adequate information is an entirely separate
issue from whether the premises are inductively established or innate. The
use of insufficient types of information may be hard-wired, while the ability
to take advantage of adequate or redundantly adequate information may be
the result of learning. Therefore, if the direct(ed)/indirect distinction is to
have an eminent place in today’s marketplace of ideas, it cannot be
drawn in terms of innateness.
Stimulus adequacy
Cutting offers another account of the differences among alternative models,
an account that depends more immediately on views about the nature of
the stimulus. He proposes to specify the adequacy or inadequacy of the avail-
able information on the basis of whether the relationship between a given
environmental property type and the corresponding stimulus type is many-
one, one-one, or one-many. According to Cutting, perception is to be under-
stood as being indirect if the relation is many-one. This would be the case if
the information available to the perceiver is compatible with more than one
possible real world situation. Then, the stimulus underdetermines the lay-
out, and there is need for cognitive work to supplement the less than ade-
quate evidence.
As Cutting explicates it, the (neo)Gibsonian position is that the stimulus
is not insufficient or impoverished. There is a one-one correspondence be-
tween the layout and the available information. Once the perceiver is able to
make use of this information—either due to innate endowment or as the re-
sult of learning—there is no need for mental supplementation. Perception of
the environment is direct because we are provided with information that
adequately reflects how things are.
According to the directed model the relationship between layout and stim-
ulus information is one-many. There are multiple sources of information that
each correctly specify the layout. Again, there is no need to go beyond the
given as indirect theorists maintain. In opposition to more orthodox Gib-
sonian views, however, there is work to be done selecting from or combining
the overly rich information. It is work, though, that Cutting does not wish to
have labelled “mental” or “cognitive.” But it is the need for this additional
work that makes perception directed and not direct.
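The three-way scheme just described can be made concrete with a small sketch. It is only an illustration under stated assumptions: the function name `classify_relation`, the toy layout and stimulus labels, and the rule that any ambiguity counts as many-one are all hypothetical choices made here, not part of Cutting's own formulation.

```python
from collections import defaultdict


def classify_relation(pairs):
    """Classify a layout/stimulus relation as many-one, one-one, or one-many.

    `pairs` is a hypothetical toy list of (layout, stimulus) tuples saying
    which stimulus information is available for which layout.
    """
    stimuli_for_layout = defaultdict(set)
    layouts_for_stimulus = defaultdict(set)
    for layout, stimulus in pairs:
        stimuli_for_layout[layout].add(stimulus)
        layouts_for_stimulus[stimulus].add(layout)

    # Some stimulus is compatible with more than one layout:
    # the stimulus underdetermines the layout (the "indirect" case).
    if any(len(layouts) > 1 for layouts in layouts_for_stimulus.values()):
        return "many-one"
    # No ambiguity, but some layout is specified by several sources
    # of information (the "directed" case).
    if any(len(stimuli) > 1 for stimuli in stimuli_for_layout.values()):
        return "one-many"
    # Each layout has exactly one source and each source one layout
    # (the "direct" case).
    return "one-one"


indirect = classify_relation(
    [("near small disc", "image A"), ("far large disc", "image A")]
)
direct = classify_relation(
    [("layout 1", "invariant 1"), ("layout 2", "invariant 2")]
)
directed = classify_relation(
    [("layout 1", "invariant 1a"), ("layout 1", "invariant 1b"),
     ("layout 2", "invariant 2")]
)
```

On these toy inputs the three calls are meant to fall under the indirect, direct, and directed headings respectively. Nothing in the classification itself says whether the work of selecting among redundant sources is "mental."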
Although this attempt to distinguish among theory types is more straight-
forward than that in terms of hidden premises, it too is problematic. Pro-
ponents of indirect models do frequently claim that perception involves
inferential processes and that these steps are necessary in order to go beyond
what is given. It is a mistake, nevertheless, to assume that this claim is equiv-
alent to or entails that the stimulus is insufficient to specify the layout. To see
this, consider the situation with so-called “taking-account” models of size,
shape, or brightness perception. These models are usually considered para-
digm cases of indirect perception. (See Epstein 1973; Rock 1983.) Yet in these
cases the information relied on can be sufficient for veridical perception.
For example, the taking-account-of-distance model of size perception de-
pends on the fact that the size of the retinal image varies with the distance of
the object from the observer. This relationship is specified by the formula: im-
age size = object size/object distance.3 Proponents of the model maintain that
size perception results from a calculation (or inference) according to the re-
ciprocal psychological formula: perceived size = image size × perceived dis-
tance. Information about image size and distance is assumed to be available in
the retinal image and from other cues, such as the convergence of our eyes
in fixating the object. On this model, perception of size is not “direct” in the
sense that it depends on the prior registration and taking-account of non-
size information. *[See chapter 2.]
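The two formulas of the taking-account-of-distance model can be sketched in a few lines. This is a minimal illustration, not a model of the visual system: the function names and the numerical values are assumptions introduced here, units are arbitrary, and the simplification flagged in note 3 (object on a plane perpendicular to the line of sight) is taken for granted.

```python
import math


def image_size(object_size, object_distance):
    """Optical relation from the text: image size = object size / object distance."""
    return object_size / object_distance


def perceived_size(image_size_value, perceived_distance):
    """Psychological relation from the text: perceived size = image size x perceived distance."""
    return image_size_value * perceived_distance


# Two hypothetical objects that project the same image size.
near_image = image_size(object_size=1.0, object_distance=10.0)
far_image = image_size(object_size=2.0, object_distance=20.0)
same_image = math.isclose(near_image, far_image)

# Image size alone does not separate them; taking distance into
# account does, and recovers each object's size when the perceived
# distance matches the actual distance.
near_size = perceived_size(near_image, perceived_distance=10.0)
far_size = perceived_size(far_image, perceived_distance=20.0)
```

The sketch shows the sense in which size is computed from non-size information, while the pair of image size and distance still fixes a single value for size.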
At the same time, given image size and distance information the psycho-
logical equation provides for a unique veridical evaluation of size. So the re-
lationship between layout and information is one-one. What is more, this
information, like the higher-order ratios and invariant properties cited by di-
rect theorists, can be characterized in terms of causal or lawlike connections.
The relationships among convergence angles or distance and object size,
angle of regard, and image size are subsumable under optical laws. The situa-
tion is much the same with the taking-account models of shape and bright-
ness. For that matter, similar points about causal or lawlike connections could
be made with respect to various of the pictorial and kinesthetic cues ordinar-
ily associated with theories of indirect perception.4
“The” form of information
The difference between traditional indirect theories and Gibsonian-inspired
theories would seem to lie then not in the adequacy or lawlikeness of the in-
formation available but in the form the information takes. Gibsonian theo-
rists demand/assume that fully adequate information be encapsulated in
or identifiable with a single higher-order property of the stimulus array. Cut-
ting’s more expansive Gibsonian theory allows that there may be several such
invariants each of which completely specifies the very same environmental
property. The existence of one or multiple invariants would, of course, be cru-
cial to the debate over indirect perception if reliance on higher-order features
of the array, as opposed to lower-order features, implied that no processing
took place or that the processing that did occur was purely “non-mental” (e.g.
Runeson 1977). But neither of these claims follows.
Determining density gradients, cross ratios, horizon ratios, etc. (i.e. higher-
order stimulus properties) may require, and theorists like Cutting permit,
complex computations. (See also Sedgwick 1980.) Furthermore, the claim
that the stimulus information is sufficient or over-sufficient for determining
the layout does not show that perceivers need not “process” these richer
sources of data. Finally, Cutting and his co-workers are willing to describe this
processing in terms of inference, computation, and selection. But these are
just the sorts of notions many theorists claim mark out the domain of “cog-
nitive” processing. (See Ullman 1980.)
In contrast to indirect theorists, Cutting’s more orthodox Gibsonian critics
reject the directed model primarily on the grounds that the stimulus features
Cutting cites should not be thought of as “information” in Gibson’s sense.
The real information in the stimulus is a still higher-order property shared
by all of the features Cutting isolates (Burton and Turvey 1990; Stoffregen
1990; Pittenger 1990; Cutting replies in 1991b). By identifying the available
information with this single property, and not individually with Cutting’s
assortment of invariants, they are able to hold onto their claim of one-one
correspondence between the information and the layout.
This conception of “information” is supposed to be applicable even when
more than one perceptual system or modality is involved. For example, per-
ceiving time-to-contact of a projectile may depend on acoustical as well as op-
tical invariants, but the information for such perception is to be understood
as a single higher-order pattern of them both. In other cases it is held that the
“informational” invariant is not to be identified with any external stimulus
but with a single invariant stimulus to tissue or neural structures that lie be-
yond the initial receptors.
Now there are some serious difficulties involved in finding plausible singu-
lar stimulus properties of the kind required to accomplish such reductive
analyses. But this is not the central reason for questioning Cutting’s critics’
mandate for a unitary specification of the stimulus information. The major
problem with their proposal is the lack of solid argument or experimental
data showing why their particular conceptualization of matters is theoreti-
cally important, empirically significant, or otherwise better than Cutting’s.
Instead, this neo-Gibsonian position seems to be that if, as Cutting main-
tains, there is more than one invariant associated with seeing a particular as-
pect of the layout, perception will be elementaristic, combinatorial, and hence
indirect. To avoid this unacceptable result, they are intent on reducing or
redefining Cutting’s set of invariants to a single feature of the stimulus array.
Then Cutting’s mathematical and empirical findings will not conflict with
their claim of a one-one correspondence, and they can continue to claim that
perception is really direct—the direct result of such unique correspondences.
Put in these terms, though, the debate between directed and direct ap-
proaches to perception is more terminological than substantive. Both sides
could accept the mathematical and empirical evidence cited. They would just
characterize the evidence differently. But merely shifting the application of
the term “information” from a set of invariants to a single stimulus feature
should not alter anything about the way the visual system is thought to func-
tion. Nor, by itself, can such a purely verbal shift serve as a basis for establish-
ing that the directed model, as opposed to the direct model, involves cognitive
rather than only non-cognitive doings.
Some more empirical considerations
A quite different kind of challenge to the directed model denies what until
now has been allowed, namely that the experimental findings actually show
that perception depends on combining redundant information in the way
the model proposes. The recent theories and work of Gilden (1991), Gilden
and Proffitt (1989), Massaro (1987, 1988), Massaro and Cohen (1993), Runeson
(Runeson and Vedeler 1993) and Cutting (Cutting et al., 1992) all speak to as-
pects of this issue. Gilden claims that although the sort of lawlike kinematic
information Cutting and other Gibsonians isolate is available, perceivers do
not use this data. They employ instead heuristics that rely on less systematic,
less dependable cues to the layout. Vision is more of a hit or miss operation
with the visual system taking advantage of whatever features of the situation
it assumes salient to the problem at hand. Gilden likens his view to Ramachandran’s
(1990) anti-Gibson, anti-Marr, “bag of tricks” approach to perception.
Massaro’s dispute with Cutting is different. He does not object to Cutting’s
claim that perceivers make use of overly rich geometrical or kinematic infor-
mation. Massaro mainly questions the details of Cutting’s additive model of
combining information, and especially its consequence that the contribu-
tion of one source of information is independent of the ambiguity of other
sources of information. Massaro favors instead a fuzzy logic model of percep-
tion. This model, he maintains, makes use of general decision theoretic algo-
rithms that are employed across modalities and domains and are not modular.
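The point about independence can be illustrated with a toy comparison. The sketch rests on assumptions not found in the text: `additive` is a plain weighted average standing in for an additive combination rule, and `flmp` uses a commonly cited two-alternative form of the fuzzy logical model (multiply the supports, then normalize against the competing alternative). Neither is offered as the exact model fitted by Cutting or by Massaro, and the support values are invented.

```python
def additive(supports, weights=None):
    """Weighted average of per-source support values in [0, 1].

    Hypothetical stand-in for an additive combination rule.
    """
    if weights is None:
        weights = [1.0 / len(supports)] * len(supports)
    return sum(w * s for w, s in zip(weights, supports))


def flmp(supports):
    """Two-alternative multiplicative integration with normalization.

    Assumed form: product of supports divided by that product plus the
    product of the complements. Undefined when both products are zero.
    """
    pro = 1.0
    con = 1.0
    for s in supports:
        pro *= s
        con *= 1.0 - s
    return pro / (pro + con)


# Raise source A from 0.6 to 0.8 while source B is either ambiguous
# (0.5) or nearly unambiguous (0.95), and record how much the
# combined judgment moves.
additive_gain_ambiguous = additive([0.8, 0.5]) - additive([0.6, 0.5])
additive_gain_clear = additive([0.8, 0.95]) - additive([0.6, 0.95])

flmp_gain_ambiguous = flmp([0.8, 0.5]) - flmp([0.6, 0.5])
flmp_gain_clear = flmp([0.8, 0.95]) - flmp([0.6, 0.95])
```

Under the additive rule the change in source A moves the judgment by the same amount whatever the state of source B. Under the multiplicative rule it moves the judgment more when source B is ambiguous, which is the kind of dependence at issue between the two models.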
Runeson hews to a more strict Gibsonian line.5 He argues against “elemen-
taristic” theories, specifically Gilden’s, that rely on pieces of information that
do not uniquely specify the perceived event or property. He seeks, rather, to
explain “perceptual functioning and skill on a particular task in terms of a
single proximal informative property” (Runeson and Vedeler 1993, p. 624).
Runeson maintains that his opponents’ contrary claims result from faulty
conceptions of “stimulus information” and from experimental designs
that are not ecologically valid and so do not tap the complete invariant that
specifies the distal property. For Runeson, even when “performance differs
systematically from perfection, this is, . . . because the perceiver is relying
on an incomplete invariance” (p. 624). An incomplete, or local, invariant is
one that is a valid indicator of the distal property only under a restricted set
of conditions. In any case, Runeson concludes that since perception of an
event or property is best explained in terms of the pickup of a higher-order
invariant, inferential processes play no role and perception is, as Gibson
says, “direct.”
Evaluating the details of these conflicting models of perception is well be-
yond the scope of this paper. In the literature cited, the debates tend to be less
over the experimental data than over the implications of their proper analy-
ses. Criticisms and arguments revolve around such issues as: the kind of sta-
tistics employed, whether a given model is amenable to empirical testing, the
relevance of individual versus group data, the ecological validity or supposed
biases in the experimental set-up, and, most crucially, differing views about
the appropriate definition of “information.” This, of course, does not mean
that there are no empirically interesting issues separating these alternative
accounts of the mechanisms of perception—issues that further experimental
study should help to resolve. On the other hand, the combatants themselves
admit that often it is not easy to distinguish among the empirical predictions
the models make (see Hecht 1996).
There does seem, nonetheless, to be some consensus that were either Gilden
or Massaro correct, it would indicate that perception importantly involves
mental processing and is indirect. Gilden’s characterization of his model in
terms of decision theory and Massaro’s characterization of his in terms of
fuzzy logic, notions developed to describe intellectual activities, especially
human reasoning, make this seem apparent. But then Cutting’s talk of com-
putations and inference also finds its initial home in accounts of mental ac-
complishments. Moreover, as discussed above, from the standpoint of his
Gibsonian critics, the combinatorial features of even Cutting’s model are too
elementaristic and “cognitive” for their tastes. Making use of terminology
made prominent by Rosen (1978), the directed model is said to “fractionate”
perception along non-natural lines.
Verbal analogies and favored intuitions of the “mental” aside, Gilden’s and
Massaro’s theories no more than Cutting’s or Runeson’s assume that percep-
tion involves: conscious states of processing, introspective accessibility, ex-
plicit deliberation, penetration by propositional knowledge, or a host of
other properties associated with more cognitive, conceptual undertakings.
And none of these theorists suppose that perception involves seeing and in-
terpreting an inner picture or picture-like representation of the world. Gilden,
it is true, indicates that the heuristics perceivers employ probably result from
learning that certain features of the environment are salient. It is also true
that Cutting in places identifies “being in the mind” with being learned. But,
once again, the question of innate versus learned can and should be sepa-
rated from the question of which stimulus features actually play a role in
perception—Gilden’s, those cited by strict Gibsonians, Cutting’s, or other
theorists’ favored features. Nor does it seem appropriate at present to draw a
cognitive/non-cognitive distinction in terms of innateness.
Metaphysics
What remains especially puzzling in all this controversy, however, is the
added ontological significance everyone attaches to the various versions of
“information” and “stimulus adequacy” that go with these models. One of
the dominant motives underlying this debate over informational sufficiency
seems primarily metaphysical, and problematically so. Gibson thought that
his theory of direct perception exploded idealistic philosophical myths about
the inability of the senses to reveal the world as it truly is. The scientific study
of perception, as opposed to armchair speculation, provided conclusive rea-
sons for realism. On this there is widespread agreement among Gibson’s fol-
lowers. Many of these Gibsonians, though, think that in order to be a proper
realist it is necessary that there be a one-one correspondence between some
feature of the stimulus array and the layout. Cutting and his associates hold
that realism of an appropriate sort only requires that there be sufficient in-
formation in the array; more does not hurt. Both sides agree that if there were
not sufficient information to uniquely specify the layout, perception would
be indirect, and indirect perception is incompatible with their preferred
understanding of perceptual realism.
Yet it is hard to discern what these claims about realism ultimately amount
to and harder still to see how the cited mathematical and empirical findings
might resolve the issue. Whether the relationship between the layout and in-
formation is many-one, one-one, or one-many, as long as the resulting per-
ception is veridical it makes perfectly good sense to say that perceivers see
what is real or how things really are. The issue of realism can insert itself only
when theorists have in mind some special epistemological or metaphysical
notion of “see.” Then talk shifts to the nature and status of the intermediary
states or stages that occur in the causal chain from environment to percep-
tion. No visual theorist, though, seems willing simply to equate features of
the stimulus array, even higher-order ones, with the actual objects in the en-
vironment or their physical properties. Proposed realist theories of percep-
tion, like non-realist theories, do not assert a strict identity of any state or
stage in the causal chain with the actual physical object, property, or event
that initiates the chain. They assert a correlation. Certain properties of the
stimulus or light array correspond with certain features of the world.
Gibsonians, nevertheless, feel that unless there is a one-one, lawlike, corre-
lation between the environment and some single aspect of the stimulus array,
perception would be indirect. We could then only properly be said to see signs
or representations of the world, not the world itself. Thus realism would be in
trouble. Cutting refuses to go along fully here. He holds that as long as there
is an invariant source of information in the stimulus that specifies the object,
property, or event we can be said to direct(ed)ly see the environment. The ex-
istence and use of multiple sources of such information does not drive a
wedge between the perceiver and reality. Indeed, it provides better access to
the way things are.
For Cutting too the assumption remains that if the stimulus underdeter-
mined the layout—if, that is, there were not at least some invariant stimulus
property in one-one correspondence with the object, property, or event—
there would be a problem. Then the perceiver is deprived of direct contact
with reality. This, though, is an assumption, and an assumption with little
more to latch onto for support than its proponents’ stipulations about what
sorts of intermediate states in the causal chain count as standing in the way of
perceivers seeing or being in touch with reality. There is, after all, no reason
why indirect theorists must be burdened with the claim that perceivers first
see, and only really see, cues and then in some less than adequate way come
to see, or infer, distal properties. At least, this is no more required of them
than Gibsonians are required to admit that we first really see the invariants
and only derivatively see/infer distal properties. Might then this particular is-
sue of visual realism dissolve if, for example, indirect theorists were simply
willing to say that perceivers pick up cues or invariant stimulus features but
see distal properties? *[See chapter 15.]
Conclusion
In a well-known introduction to a volume dedicated to exploring the impor-
tance of Gibson’s work to psychology, Shaw and Bransford (1977) review in
detail competing claims about direct and indirect perception. They point out
that it is difficult to say anything definitive because the meanings of these
theses are not very clear. In spite of this, they go on to argue that all extant ver-
sions of the position that perception is indirect are either implausible or have
little content. So they conclude that perception is best seen as direct. Fodor
and Pylyshyn (1981) in their much-discussed critique of Gibson take the op-
posite tack. They claim their analyses show the idea that perception is direct
is either implausible or without content. They conclude that perception must
be indirect.
I think that both sides are onto something in their criticism of the opposi-
tion. The conclusions I draw, however, differ from them both. As I see it, the
supposedly important issue at stake is largely verbal, isolated from serious ex-
perimental and theoretical work on the functioning of the visual system. Un-
constrained in this way, the distinction the controversy feeds off has outlived
its usefulness and the debate it persists in provoking has become unproduc-
tive. Cutting’s proposal for a third alternative position does not improve the
status of things. This is not to fault Cutting’s mathematical/empirical find-
ings, which do make a real contribution to the study of vision. It is Cutting’s
further thesis that perception is directed, and hence neither indirect nor di-
rect, that only adds fuel to a fire better allowed to burn itself out.
Acknowledgments
I wish to thank James Cutting for discussing these issues with me. I also wish
to thank the journal referees, John Heil and Edward Reed, for their comments.
Notes
1. See Fodor and Pylyshyn (1981) for a widely-cited version of the more standard op-
posing view.
2. There are, of course, those, like Searle (1992), who insist on identifying the mental
with actual or potential conscious awareness. One of Searle’s major complaints with
current cognitive science is its failure to adopt this criterion of the mental.
3. Technically this equation holds strictly only for cases where the object lies on a
plane perpendicular to the perceiver’s line of sight. There is no need to go into these and
other complications here.
4. Veridical perception of metric, or what are sometimes called “absolute,” spatial prop-
erties would depend on assumptions of a scaling factor. But this is also true when the
information relied on consists of higher-order Gibsonian stimulus properties.
5. Runeson’s views are even more closely associated with those of G. Johansson. The
differences between Gibson and Johansson need not concern us here. See Runeson
(1977) and Gibson (1977).
References
Bruno, N. and J. Cutting, (1988). Minimodularity and the perception of layout. Journal
of Experimental Psychology: General 117, 161–170.
Burton, G. and M. T. Turvey, (1990). Perceiving the length of rods that are held but not
wielded. Ecological Psychology 2, 295–324.
Cutting, J. E. (1986). Perception with an eye for motion. Cambridge, MA: MIT Press.
———. (1991a). Why our stimuli look as they do. In G. R. Lockhead and J. R. Pomerantz
(eds), The perception of structure (pp. 41–52). Washington, DC: American Psychological
Association.
———. (1991b). Four ways to reject directed perception. Ecological Psychology 3, 25–34.
———. (1993). Perceptual artifacts and phenomena: Gibson’s role in the 20th century.
In S. C. Masin (ed.), Foundations of perceptual theory (pp. 231–260). New York: Elsevier
Science.
Cutting, J. E., N. Bruno, N. P. Brady and C. Moore, (1992). Selectivity, scope and sim-
plicity of models: a lesson from fitting judgements of perceived depth. Journal of Exper-
imental Psychology: General 121, 364–381.
Epstein, W. (1973). The process of ‘taking-into-account’ in visual perception. Perception
2, 267–285.
Fodor, J. and Z. Pylyshyn, (1981). How direct is perception? Some reflections on Gib-
son’s ‘Ecological Approach.’ Cognition 9, 139–96.
Gibson, J. J. (1950). The perception of the visual world. Boston: Houghton Mifflin.
———. (1977). On the analysis of change in the optic array. Scandinavian Journal of Psy-
chology 18, 161–163.
Gilden, D. L. (1991). On origins of dynamical awareness. Psychological Review 98,
554–568.
Gilden, D. L. and D. R. Proffitt, (1989). Understanding collision dynamics. Journal of
Experimental Psychology: Human Perception and Performance 15, 372–383.
Hecht, H. (1996). Heuristics and invariants in dynamic event perception. Immunized
Concepts or non-statements? Psychonomic Bulletin and Review 3, 61–70.
Helmholtz, H. (1968). The origin of the correct interpretation of our sensory impres-
sions. In R. Warren and R. Warren (eds.), Helmholtz on perception: its physiology and devel-
opment (pp. 249–260). New York: Wiley.
Marr, D. (1982). Vision. San Francisco: W.H. Freeman.
Massaro, D. W. (1987). Speech perception by ear and eye: a paradigm for psychological in-
quiry. Hillsdale, NJ: Erlbaum.
———. (1988). Ambiguity in perception and experimentation. Journal of Experimental
Psychology: General 117, 417–421.
Massaro, D. W. and M. M. Cohen, (1993). The paradigm and the fuzzy logical model of
perception are alive and well. Journal of Experimental Psychology: General 122, 115–124.
Pittenger, J. B. (1990). The demise of the good old days: Consequences of Stoffregen’s
concept of information. ISEP Newsletter 4, 8–10.
Ramachandran, V. S. (1990). Visual perception in people and machines. In A. Blake and
T. Troscianko (eds), AI and the eye (pp. 21–77). New York: Wiley.
Rock, I. (1983). The logic of perception. Cambridge, MA: MIT Press.
Rosen, R. (1978). Fundamentals of measurement and representation of natural systems.
New York: Elsevier North-Holland.
Runeson, S. (1977). On the possibility of “smart” perceptual mechanisms. Scandinavian
Journal of Psychology 18, 172–179.
Runeson, S. and Vedeler, D. (1993). The indispensability of precollision kinematics in
the visual perception of relative mass. Perception & Psychophysics 53, 617–632.
Sedgwick, H. A. (1980). The geometry of spatial layout in pictorial representation. In
M. Hagen (ed.), The perception of pictures. New York: Academic Press, 33–90.
Schwartz, R. (1994). Vision: variations on some Berkeleian themes. Oxford: Blackwell.
Searle, J. (1992). The rediscovery of the mind. Cambridge, MA: MIT Press.
Shaw, R. and Bransford, J. (1977). Introduction: psychological approaches to the problem
of knowledge. In R. Shaw and J. Bransford (eds.), Perceiving, acting, and knowing.
Hillsdale, NJ: Erlbaum, 1–39.
Stoffregen, T. (1990). Multiple sources of information: for what? ISEP Newsletter 4, 5–8.
Ullman, S. (1980). Against direct perception. Behavioral and Brain Sciences 3, 373–415.
III Picture Perception
Prescript 9
Chapter 9 brings together a variety of criticisms of resemblance analyses of
pictorial representation. It reaffirms the claim that resemblance is neither
necessary nor sufficient to ground the semantic/referential functions of pic-
tures and explains why a symbolic analysis can deal better with these matters.
Moreover, given that any two things can legitimately be said to resemble each
other in some respect, a flat-footed appeal to resemblance offers little theo-
retical insight.
Still, people remain convinced that considerations concerning the acquisi-
tion and function of pictorial skills show resemblance theories must be on
the right track. The symbolic approach, it is maintained, ignores or fails to
explain these facts. Chapter 9 responds to such qualms and convictions. It at-
tempts to soften the impact of the criticism as well as fit underlying resem-
blance intuitions within the symbolic paradigm.
9 Representation and Resemblance*
An old and ingrained tradition has it that what makes a picture a representa-
tion is resemblance between the picture and what it represents. A picture of
Nelson Rockefeller represents Rockefeller and not John Lindsay because it
resembles the former and not the latter. The trouble with this traditional
view is that it is difficult to interpret it in a way that makes it both true and
informative.
Obviously, resemblance is not a sufficient condition for representation. Two
pictures of Rockefeller may resemble each other more than they resemble
Rockefeller, yet it’s the man they represent. Similarly, one of Rockefeller’s
brothers may look more like him than any portrait does, but his brother doesn’t
represent him. Representation requires that one object refer to (stand for,
be about, be a symbol for) the other, and this “semantic” relationship is not
guaranteed by resemblance.1
If resemblance is not a sufficient condition for representation, still the idea
lingers that it must be necessary. For isn’t it resemblance that distinguishes
pictorial reference from mere denotative reference? Isn’t what distinguishes a
picture of Rockefeller from the name, “Rockefeller,” or the description, “the
governor of New York in 1972,” the fact that only the first symbol resembles
him? The view that resemblance, while not sufficient, is a necessary condi-
tion for a picture to represent does have its appeal, but it also has its short-
comings. The problem is that in any of its more interesting applications the
resemblance relation marks no simple or fixed relationship among objects.
X may resemble Y with respect to property P1 and not property P2 and Z with
respect to P2 and not P1. And no advance is made in claiming that two things
resemble each other if, or to the degree that, they share properties, since any
two things have as many properties in common as any other two. Attempts to give
independent criteria for resemblance in terms of geometrical or topological
properties, or in terms of such notions as “imitation,” “true copy,” “likeness,”
etc., have also met with little success. There would seem to be no general way
to specify the relation without appeal to people’s actual resemblance judg-
ments. However, these judgments are almost always relative to context and
background knowledge, as well as to the purposes and categories of compar-
ison that gain prominence. Further, the skills, interests, and needs of the per-
son making the comparison all have an effect on his resemblance judgments.
The experienced eye sees similarities and dissimilarities where the novice
sees none. When in particular the question is resemblance between picture
and object, familiarity with the style of representation and knowledge of
other modes of picturing seem to be additional variables. Indeed, it is difficult
to see how to separate judgments of resemblance between picture and object
from judgments that the picture adequately represents the object. And if this
distinction cannot be drawn, we have come full circle.
While it would be impossible here to detail all the many other pitfalls in-
volved in making representation depend on resemblance, suffice it to say that
in order for resemblance to play a significant role in distinguishing pictures
from other symbols, we must be able to give independent empirical content
to the claim that pictures resemble what they represent. But it’s just at this
point that the traditional view usually bogs down. Resemblance simpliciter
would seem to make little sense empirically, except possibly in cases where
the picture and object cannot be readily told apart, and this surely is not the
case between most pictures and what they represent. M. Black sums up mat-
ters nicely when he writes “My chief objection to the resemblance view, then,
is that when pursued it turns out to be uninformative, offering a trivial verbal
substitution in place of insight. . . . The objection to saying that some paint-
ings resemble their subjects is not that they don’t, but rather that so little is
said when only this has been said.”2
Now making the theoretical distinction between representational and non-
representational symbol systems is no direct part of the purpose of this paper.3
My concerns are different. What I want to do is examine some of the reasons
psychologists and others concerned with accounting for the acquisition and
use of symbolic skills remain so reluctant to abandon a resemblance ap-
proach to pictorial representation. Then I hope to show why these reasons are
not sufficient to justify clinging to the traditional view.
Much of the resistance to giving up the traditional approach can be traced,
I think, to the following dilemma: If pictures do not resemble what they represent,
it is thought that the relationship between pictures and their referents
must be arbitrary, like that between words and their denotata. That “cat” de-
notes cats is an arbitrary decision, and the language would not in any way be
seriously altered if “cat” were used to denote tables and “table” to denote cats.
Since what each word denotes is a matter of convention, we must learn each
individually. Presented with some new word, we will not know what it de-
notes unless we are taught its use. But surely, it is felt, such arbitrariness is not
a feature of pictorial systems. We couldn’t just as well decide to let a picture of
Rockefeller denote Lindsay without seriously altering the kind of symbol
system at hand. Furthermore, we needn’t be taught what each new picture
represents as we must have explained to us what each new word means.
Therefore, the relationship between pictures and their referents could not be
conventional, like that between words and their denotata. The referential or
descriptive significance of pictures must after all be due to resemblance.
But then theorizing about pictorial representation is stalemated. The psy-
chologist feels that unless he appeals to resemblance, certain psychologically
important distinctions between pictures and words are obscured. Yet the notion
of resemblance is itself so problematic that it cannot serve to get an adequate
explanation off the ground. The situation calls for a re-examination.
What is needed is a way to relieve the pressure of the dilemma that does not
itself require an uncritical appeal to resemblance.
As I have sketched it, the dilemma is based on two assumptions. The first is
that if pictures do not resemble their referents, then the connection between
the two must be arbitrary, in the way the connection between a word and its
denotation is. The second assumption is that the attribution of arbitrariness
conflicts with the fact that we can understand new pictures and not new
words. But it takes little examination of other types of symbol systems, and
how we might go about mastering them, to see that the assumptions under-
lying the dilemma are unfounded. For consider a system like standard West-
ern music notation. Given only a suitable sampling of written notes (symbols)
and taught to correlate them with sounds (referents), we might very well
learn how the system works, how to go on. Getting the idea of how the sys-
tem works enables us to handle new symbols in the system not included
among the teaching samples. I am not talking here about new combinations
of previously learned notes, but of understanding new, hitherto unheard in-
dividual notes. And such learning can occur, it would seem, without our ever
receiving explicit instruction concerning the structure of music notation.
Yet, there is no reason to suppose that the written notes look like or resemble
the sounds they denote. Or, similarly, consider a gauge that correlates bright-
ness of display light (symbol) with temperature of object (referent). Presented
with enough instances of these correlations, we may learn how the system
works. And once we know how the system works, we can interpret an un-
bounded set of new symbols. Again, resemblance between symbol and refer-
ent would seem to play little role. *[Inductive learnability is comparable to
the notion of “systematicity” as it is discussed in theories of language and
thought. The claim that inductive learnability implies compositionality is
much less plausible in the case of pictorial representation.]
Indeed, we can find examples of this sort of inductive semantic learning in
natural language, too. Indicator terms, metaphor, and number vocabulary
provide three different areas where a relationship exists among the symbols
so that learning the reference of some words enables us to project the seman-
tics of the others correctly. Although tokens of the indicator word “here” dif-
fer vastly in their denotata, we learn to understand new tokens on the basis of
our experience with the old. The same is true of our ability to understand
brand new metaphors. Our habits associated with the literal use of the word
put sufficient constraints on metaphorical use, so that we can frequently in-
tuit the semantic import of the metaphor the first time around, without being
taught it specifically. Finally, it would seem that ordinary number vocabularies
also have this learnability feature. We might learn to use the cardinal
numbers properly by being given enough examples until we get the idea of
how numerals are concatenated so as to measure the cardinality of a set.
In none of these cases does systematic correlation of the set of symbols with
their referents depend in any obvious way on resemblance. Nor does it de-
pend on being able to define or specify the semantics for the new symbols
within the resources of that part of the system already mastered, or, for that
matter, within the resources of the entire system. And if lack of resemblance
entails that the system is conventional, then all these systems are conven-
tional. Still, the symbols within a given system may not be arbitrary with
respect to the other symbols in the system, for there may be sufficient regularity
among the symbols, regularity in how they denote or describe, so that
learning to use some provides adequate evidence for interpreting other mem-
bers of the set. The difference between the set of words “cat,” “table,” “ink,”
etc., and “1,” “2,” “3,” . . . “10,” “11,” “12,” etc., or music notation is not that
members of the first set fail to resemble their referents, while members of
the latter two sets resemble theirs. Nor is the difference that the first set is
conventional and the other two are not. Rather the difference is that “cat,”
“table,” “ink,” etc., are arbitrary relative to each other, while with the number
vocabulary or music notation there is some systematic regularity among the
symbols affecting the way their interpretations are assigned. This regularity,
of course, is not a priori or non-conventional. “21” could have been used to
denote 99-membered sets rather than 21-membered sets and “ ” might
have been chosen to denote C# rather than G. The point is that, given the way
the system does work, with the correlations that have been established and
do exist, we can learn the semantic force of some members of the system from
learning the semantics for others. Arbitrariness is not a question of conven-
tionality, but more a question of induction and learning. We see the assign-
ment of symbol to referent as arbitrary when we can discover no pattern that
enables us to project the semantic import of the symbol from knowledge of
other symbols in the system.
A symbol can be arbitrary then in the sense that it is a matter of convention
or choice or not a priori that it denotes what it does, but this differs from say-
ing that its interpretation is arbitrary with respect to the other symbols in the
system. It does not in any way follow that if symbols do not resemble their ref-
erents, the symbols need be arbitrary with respect to each other in the way
“cat,” “shoe,” and “ink” are. That we can understand what a new picture rep-
resents, therefore, does not entail that the picture bears some absolute or fixed
resemblance relationship to what it represents. All that is required is that there
be a discernable pattern of usage within the pictorial system, so that learning
what some pictures in the system represent provides the appropriate experi-
ence for learning what new pictures in the system represent. If this is so,
much of the pressure forcing us back to the traditional view is relieved.4
Another obstacle remains, however, to thwart attempts at overturning the
traditional view. Our account of the ability to understand new symbols sug-
gests that we learn directly some correlations of symbol to object, and this
enables us to know how to deal with other symbols whose semantics have not
been directly given. But many theorists maintain that the ability to compre-
hend pictures requires no learning, at least not any that can be viewed as in-
struction or practice in interpreting pictures. So, it is thought an important
psychological difference remains between these “learnable” systems and pic-
torial systems. Pictorial systems require no learning, and the only way to ex-
plain this is to allow that pictorial systems are based on resemblance. This
push toward the traditional view has force, however, only if we grant both
that we do not have to learn how to understand pictures, and that resem-
blance could provide an account of this fact.
But theoretical considerations cast doubt on the initial no-learning claim,
as well as on the idea that resemblance, reasonably construed, could explain
it. For earlier, we noted that resemblance is not a sufficient condition for
representation. So, even if resemblances were not relative to skills, interests,
theory, perceptual abilities, etc. and discerning resemblances required no
learning, some instruction would be needed to determine when and how
things function as representations—for example, that Rockefeller’s picture
under normal circumstances represents him, and that his brother does not
represent him. Even if we discount this problem of how we acquire the ability
to attach symbolic significance at all to pictures, other features of the situation
make it very unlikely that we can completely rule out some form of symbol
learning. Perhaps the simplest feature we could point to is that while pictures
in standard Western pictorial systems are by and large two-dimensional, we
interpret their referents most usually as three-dimensional objects. So al-
though a picture of Rockefeller will resemble his frontal surface at least as much
as it resembles him in entirety, it is a representation of a three-dimensional
man and not a picture of a cross-sectioned man. Similarly, a profile picture of
Rockefeller will show but one eye, yet it does not represent him as half-headed
or one-eyed.5 However, if untutored resemblance were all we had to go on, it
would seem that the profile will resemble a half-headed being seen from the
side just as much as it does the full-blown Rockefeller seen from the same po-
sition. And it is difficult to see how our adjusting to these features of standard
pictorial representation could be accomplished without some sort of learning.
Examination of the empirical evidence available does not force the no-
learning claim upon us either. Indeed, most of the data concerning this issue
is anecdotal and highly equivocal. On the one hand, there is some anthropo-
logical evidence that people belonging to tribes unfamiliar with Western rep-
resentations do not understand photographs when first presented with them,
and experiments by Hudson and more recently Deregowski seem to indicate
that people inexperienced with Western art are initially confused about depth
relationships characterized by drawings in standard perspective.6 On the
other hand, there are some reports of immediate recognition of photographs,
and there is at least one experiment indicating that an untutored child can un-
derstand pictures the first time around.7 In this latter case, the experimenters
did not allow their child to be given instruction in pictorial interpretation. At
the age of 19 months, they showed him line drawings and photographs and
it is reported that he was able to classify the pictures as car-pictures, shoe-
pictures, etc. with a significant degree of accuracy. While this experiment and
the other anecdotal reports of immediate comprehension are of considerable
interest, it is most difficult to tell just what the evidence proves with respect
to the no-learning claim. For, although the child, or adult, for that matter,
may not have been given any specific instruction concerning pictorial inter-
pretation, nothing is done to prevent him from transferring significant por-
tions of his interpretive skill from his experience with other non-linguistic
symbol systems, such as gestures, imitation, imagery, sensorimotor or enac-
tive schemes, etc. that many psychologists tell us play an important role in
normal cognitive development.8 So although no explicit instruction in inter-
preting pictures may have been given, this would not show that no relevant
learning processes were involved. *[What is more, innate correlations imply
neither similarity nor resemblance. See chapters 5 and 10.]
The no-learning claim becomes even more problematic if we consider the
possible facilitating effects of experience gained in perceiving images in mir-
rors, distorted or otherwise, in viewing shadows, in noticing outlines im-
pressed in sand, etc. It seems plausible, at least, that such encounters too may
play a role in developing our skill at pictorial interpretation. In these cases,
the two-dimensional displays are not usually seen as parts of man-made sym-
bol systems, rather they are likely to be experienced as signs of or cues to the
object. However, if we are willing to allow that learning to perceive the signifi-
cance of signs and cues might play a facilitating role in acquiring pictorial
skill, then the opportunities for transfer learning are greatly extended. In-
deed, it would be hard to separate these learning experiences from those in-
volved in the overall acquisition of perceptual skill. For one need not adopt
the Berkeleian view that all perception is a matter of sign interpretation to
admit that much of what we call perceptual learning is the acquisition of skill
in interpreting the significance of clues, symptoms, traces, signals, and cues.
Undoubtedly, it will be argued that in suggesting an account of pictorial
skill that depends on the notion of transfer, resemblance has been smuggled
in the back door; for there could only be transfer where there is resemblance.
This claim, however, seems neither very helpful nor insightful. Surely there
is no one fixed specification of resemblance that runs through and could ac-
count for all cases of perceptual transfer. Nor is it obvious that there is any
clear sense in which it can be said that all the clues and cues themselves re-
semble their objects. Many of the arguments outlined at the beginning of this
paper would seem to apply equally well to claims that shadows resemble their
objects or that the foot under the cover resembles the pattern of blanket folds
that indicate its presence.
Of course, to suggest the importance of transfer learning is not to pro-
vide argument or evidence for it as an account of pictorial skill. However, my
point is that if effects of transfer are considered, the significance of evidence
brought forth to support the no-learning claim is further obscured. And if the
no-learning claim is weakened, one more pull toward the traditional resem-
blance account of pictorial skill is also weakened.9
In challenging the fruitfulness of resemblance theories, I have not attempted
to offer an alternative account of pictorial competence. Nonetheless, if the ar-
guments presented above are correct, a somewhat different emphasis in ap-
proach would seem indicated. Instead of our concentrating exclusively on
the relationship between picture and object, more attention should be paid
to the relationship among symbols within the given system, to see how and if
learning some of the symbols plays a role in enabling us to comprehend the
significance of other new symbols in the system. Similarly, we might explore
how competence in one style of pictorial representation influences or pro-
vides the basis for understanding another style. For example, in what way,
if any, does understanding caricature depend on mastering normal pictorial
systems? More stress too should be placed on discovering the possible facilitating
effects of skills and principles developed in our use of other non-linguistic
symbol systems such as gestures, imitations, imagery, sensorimotor
or enactive schemes, etc. Would damage to or inability to master these systems
be reflected in difficulty with pictorial systems? And, perhaps most impor-
tantly, we should look for ways in which particular pictorial systems may take
advantage of our ordinary habits of perception, cue detection, pattern recog-
nition, etc. How, for example, may our normal skills at distinguishing figure
and ground be used to parcel out portions of a picture into figure and ground?10
Perhaps deeper understanding of these issues will, in turn, shed light on
the perennial puzzle of realism in art. What is it that makes a picture realistic?
One argument has been that realism is to be accounted for in terms of the
identity of the bundle of light rays reflected from a realistic picture and those
rays reflected from the object it represents. Now, no one need deny the optics
of the situation—that some pictures viewed under certain very stringent
conditions will reflect the same bundle of light rays as their objects viewed
under specified conditions. However, as Goodman, Pirenne, and others have
noted, the identity of light rays thesis can have little to do with ordinary pic-
ture perception. For the identity position requires that we view the picture
and object one-eyed, through a peephole, with the eye stationary, and these
surely are not the usual conditions under which we look at pictures and make
judgments about their realism.
An alternative account, put forth by Goodman in Languages of Art, is that
once we give up the idea that resemblance is a necessary or sufficient condi-
tion for representation, we can come to see that realism is more a matter of
habituation and familiarity. “Realism is relative, determined by the system
of representation standard for a given culture or person at a given time.”11 On
this account, realism is a matter of ease of interpretation. What makes a Rem-
brandt portrait more realistic than a Picasso Cubist painting is that the Rem-
brandt is in a system whose principles of interpretation are ingrained, the
principles are second nature. But in order to interpret the Picasso, “we have to
discover rules of interpretation and apply them deliberately.”12 It is most fre-
quently felt, however, that this analysis of realism distorts certain important
features of perception. For it is claimed that no matter how familiar we are
with the particular Picasso painting, or how second nature interpreting cubist
pictures becomes, such pictures will not seem realistic (or at least nowhere
near as realistic as a Rembrandt). Our judgments of realism are just not as flex-
ible as the familiarity view would appear to require.
Now I believe that there is something to this criticism of the familiarity ap-
proach to the problem of realism, but that a consideration of some of our
points about learning may supplement the position and make it more palat-
able. This supplementation, however, is not intended to provide a definition
of realism. Nor is it meant to provide criteria for making fine distinctions
among pictorial styles or for constructing a precise ordering of degrees of re-
alism. The rough principles to be offered are perhaps necessary conditions for
realism but are clearly not sufficient. They are suggested only as a way to over-
come the “anything goes” conclusion—the claim that with familiarity any
picture could be as realistic a picture of X as any other—that is seen to follow
from a pure familiarity account.
I would suggest that one characteristic of systems of representation usually
taken as standards of realism is that they are inductively learnable or more
easily so than other systems. Having been taught to interpret several cubist
pictures, we are less able to project to the correct interpretation of new cubist
pictures than we are if given examples of impressionist paintings, and then
required to interpret a new impressionist picture. With very abstract styles
such projection would be even harder than the cubist case, whereas the tran-
sition from one photo-realist painting to another might be even easier than
in the impressionist case. So, among pictorial systems, degree or ease of learn-
ability may correlate with our intuitions of realism. While related, learnabil-
ity, in our sense, may be separated from ease of interpretation. For example,
the set of numerals 1–1000 may be more learnable than a set of one thousand
arbitrary words like “cat,” “ink,” “table,” etc., although, once having mastered
both sets, it is as easy to understand or interpret “ink” as it is the number “97.”13
It seems plausible that another characteristic of realistic systems of picto-
rial representation is that they make better use of habits and processes of per-
ception that we have developed for dealing with ordinary objects. Thus, as
indicated above, Hochberg has been examining the possible relationships be-
tween the processes involved in scanning edges and those involved in per-
ceiving realistic line drawings. Similarly, certain means of rendering distance
on a two-dimensional surface may readily tap perceptual processes under-
lying ordinary distance perception. For example, it is known that superposi-
tion or overlapping serves as a cue to distance; when one object hides another
the object hidden is judged to be further away. A system of representation that
likewise hides or blocks out the more distant object might thus be able to make
use of one of our well-ingrained habits of three-dimensional distance percep-
tion. *[But see chapter 7.] Modes of representing brightness are another case
in point. It is well known that it is impossible to have the absolute brightness
of a picture viewed under gallery conditions equal that of, say, the sunny field
of which it is a study. But it has also been established that brightness percep-
tion is affected by other stimuli and cues than absolute brightness. In partic-
ular, the ratio of the object’s brightness to that of other nearby objects seems
to have an overwhelming effect. Representational systems that take into ac-
count the importance of relative brightness to brightness constancy might
thus be better able to exploit our existing visual habits and skills than just any
old system of correlating pigment with brightness. And while there is noth-
ing in principle to preclude a system of representation in which a color repre-
sents its complementary or in which a color is correlated with size, such
systems need not significantly tap the processes of cue detection, scanning,
constancies, etc., that we employ in determining the color or size of the ob-
jects we observe around us.
Notice, however, that to argue for such transfer of skills is not to return to
the identity of light rays thesis; nor is it to claim that there are no differences
between the processes involved in perceiving objects and those needed to in-
terpret pictures. All that is required is that certain two-dimensional systems
of cues and ways of rendering space, shape, color, size, and light take better
advantage of our ordinary perceptual skills than other systems. If this is so,
then given the processes by which we do see objects in the world, systems that
can tap these existing skills and habits will be considered relatively realistic.
Those systems that require new and separate skills of interpretation, where
there is little transfer from ordinary perception, or where there is interference
with these habits, will be considered less realistic.
These suggestions are not meant so much as a challenge to the familiarity
account of realism as they are a supplementation. The learnability and trans-
fer features could be offered as partial explanations why interpreting some
systems seems second nature, and why in dealing with other systems we have
to apply rules of interpretation more deliberately. Also, this supplementation
would provide some basis for explaining why our judgments of realism are
not as flexible as a pure familiarity, “anything goes,” view might require. For
no matter how familiar or at ease we are with a particular picture or system,
its principles of interpretation may be at odds with our normal processes of
object perception. To the degree that this is so, we will not find pictures in the
system realistic. It should be noted, however, that we do not really know
how physiologically fixed or flexible all these perceptual processes are them-
selves.14 Nor do we know if or to what extent experience looking at pictures
may influence our more usual processes of object perception. *[See R. Schwartz,
“The Power of Pictures,” Journal of Philosophy, LXXXII (1985), 711–20.] And, of
course, the more relative and flexible our visual system is, the more relative
and flexible will be our standards of realism.
Perhaps the essential difference between the pure familiarity view and my
supplementation is best seen as one of emphasis. The familiarity advocate, in
his account of realism, stresses the importance of our experience with the
most common or prevalent kinds of representations around us. The habits of
perception acquired in learning to comprehend these systems set the standard
for realism. The more a system requires new skills of perception and interpre-
tation that differ from or interfere with the processes underlying our ability
to comprehend familiar systems of representation, the less realistic it will be
judged. On my account, the emphasis is shifted. Throughout the day most
of us spend our time viewing not pictures, but a world of three-dimensional
objects. My suggestion is that the habits, processes, and skills underlying our
perception of these more ordinary objects serve as a touchstone for assessing
realism in pictures. The deliberateness, lack of second-nature, etc. associated
with non-realistic systems may be traced, in part, to the fact that they require
skills of interpretation differing from those involved in the use of our visual
system to perceive our everyday environment.
Finally, the tentativeness of all these suggestions about learning, transfer,
interference, etc. must be stressed again. Just how ordinary perceptual expe-
rience might facilitate pictorial understanding, which sorts of systems might
be aided and which hindered, why some tribes unfamiliar with Western rep-
resentation seem to have initial difficulty with photographs and drawings in
standard perspective are only some of the open questions requiring system-
atic study and experimentation of the sort not presently available.
Notes
* A version of this paper was read at the University of Pennsylvania; Annette Barnes
commented on my talk, and I benefited much from her remarks. I should also like to
thank Margaret Atherton, Joan Ganz, and Nelson Goodman for their comments.
1. For more on this issue see N. Goodman, Languages of Art (Indianapolis, Bobbs Merrill
Co., 1968), Chap. 1 and M. Black “How Do Pictures Represent?,” in Art, Perception, and
Reality, ed. M. Mandelbaum (Baltimore, Johns Hopkins University Press, 1972). An ad-
equate account of pictorial reference, however, is not at hand, and any such treatment
would be much more complicated than this paper might seem to indicate. While I rec-
ognize that some of my remarks (e.g. about the reference of portraits) need patching up
to avoid error, I believe my main psychological points can be made without a more
subtle and refined treatment of these matters. *[The issues parallel those in the philos-
ophy of language concerning the relationship of names to descriptions.]
2. “How Do Pictures Represent?,” p. 122. Thorough and to my mind convincing
argument concerning these problems can be found in E. H. Gombrich’s Art and Illusion (New
York, Pantheon Books, 1960) and in various of his other writings. Also see S. Hampshire,
Thought and Action (New York, Viking Press, 1960), Chap. 1, N. Goodman, Languages of
Art, Chap. 1, and M. Black, “How Do Pictures Represent?.”
3. For a discussion of this issue see N. Goodman, Languages of Art, pp. 225–232.
4. The distinction between systems having patterns in their interpretive schemes that
allow for inductive learning and those that do not may itself be a relative matter de-
pending on what other skills, discriminative powers, categories of classification, and
symbolic competencies are available. So learnability too may be more a matter of de-
gree than a fixed property of systems. In any case, it should be obvious that the distinc-
tion between “learnable” and “arbitrary” systems I have been proposing is not meant
to distinguish pictorial from non-pictorial symbol systems. Music notation and
number vocabularies, I have suggested, both have this learnability feature, and, I take it,
neither is a representational system.
5. For further consideration of this issue see P. Ziff, “On What a Painting Represents,”
Journal of Philosophy, 1960, Vol. 57, pp. 647–654.
6. See J. B. Deregowski, “Pictorial Perception and Culture,” Scientific American, Nov.
1972, pp. 82–88.
7. J. Hochberg and V. Brooks, “Pictorial Recognition as an Unlearned Ability: A
Study of One Child’s Performance,” American Journal of Psychology, 1962, Vol. 75,
pp. 624–628.
8. See, for example, J. Piaget, The Origins of Intelligence in Children (New York,
International Universities Press, 1952) and numerous other of his publications; J. Bruner et al.,
Studies in Cognitive Growth (New York, John Wiley and Sons, 1966).
9. Hochberg and Brooks themselves adopt a similarly cautious view toward their data.
For example, they suggest that part of pictorial competence may develop as a result of
the more general process of learning to perceive space.
10. See pp. 69–73 of J. Hochberg’s recent paper, “The Representation of Things and
People,” in Art, Perception, and Reality, where he speculates about how experience with
the world of objects, particularly the scanning of edges, might provide occasion for
developing skills appropriate for dealing with line drawings.
11. Languages of Art, p. 37.
12. Languages of Art, p. 36.
13. Again, I am claiming that a comparatively high degree of learnability may be
necessary for the realism of systems; I am not maintaining that it is sufficient or that other
characteristics may not weigh more heavily.
14. For example, the extent to which various constancies are physiologically deter-
mined as opposed to being learned or the extent to which they might be changeable
once an initial learning period has taken place are not settled matters.
Prescript 10
Challenged to say something substantive about the resemblance relation
presumed to underpin pictorial representation, many theorists seek an answer in
optics. They note that a realistic picture, when suitably positioned and ob-
served, will project the same bundle of light rays to the eye as the scene de-
picted. This identity of projected light is thought to explain the perceived
resemblance between pictures and the scenes depicted. Such optical likeness
also serves to link picture perception to pictorial representation. Adoption of the
projection paradigm thus structures the problems of picture perception re-
search and the way they are studied empirically. By contrast, it is argued, the
symbolic paradigm takes the picture/depicted relationship to be arbitrary
(chapter 9) and has no visual footing on which to rest its account.
These supposed advantages of the projection paradigm are bought at a
steep cost. The identity of light rays thesis, at its best, can only accommodate
a very small range of pictorial styles. And it can only be applied to these good
cases, under viewing conditions that are hardly met in everyday visual en-
counters with pictures. Chapter 10 argues that adoption of the symbolic par-
adigm allows escaping the otherwise narrow scope and limited domain the
projection paradigm imposes. At the same time, the symbolic paradigm does
not generate many of the puzzles the projectionist approach must face. It sug-
gests ways to look at these issues in a more fruitful manner.
10 Pictures, Puzzles, and Paradigms
Introduction*
When psychologists who study vision turn their attention to picture percep-
tion, they find themselves entangled in a web of puzzles. There is, moreover,
no consensus and much confusion on how to resolve these matters experi-
mentally. As a result, research on picture perception is in an uneasy state.
When these same vision theorists turn their attention to Nelson Goodman’s
(1968) work on pictorial representation, they are highly critical. They are con-
vinced his ideas are at odds with well-established facts. I think there is a con-
nection between these two phenomena.
In brief, I believe Goodman and the vision theorists adopt strikingly differ-
ent paradigms concerning the nature of pictorial understanding. Their dis-
agreements, in the end, are less over the empirical data and more over the
appropriate interpretation of the facts. At the same time, I believe the para-
digm vision theorists do adopt is responsible for many of the puzzles they
encounter. In what follows, I will use “symbolic paradigm” to refer to the ap-
proach of Goodman and his followers, and “projective paradigm” will serve
to label the dominant paradigm of perceptual psychologists.
Grouping vision theorists in this way all under one rubric is, of course, a
simplification. There are dissenters in the field who favor the symbolic model
and other researchers who find neither model acceptable. In addition, there
are significant differences among projectivists in the accounts of picture per-
ception they champion. I think, however, these latter differences are mainly
due to differences in their models of perception in general. The differences do
not indicate rejection of the projective paradigm’s core conception of the na-
ture of picture perception.
The Projective Paradigm
The basic idea of the projective paradigm is that seeing pictures involves the
same psychological processes and mechanisms as seeing anything else in the
world. In a sense this claim is trivial, since pictures are themselves physical
objects in the world. The central projectivist claim goes further. Projectivists
maintain that in an important psychological sense, seeing a representation of
an object is like seeing the object itself.
Now in the case of seeing objects in the environment, the problem of
perception may be, and often is, conceived as one of “inverse optics.” Optics
determines the projection of light rays from objects to the retina. In order to
perceive the layout correctly, the perceiver must reverse the process. The per-
ceiver somehow projects back from the retinal image, or the information con-
tained therein, to the object from whence it came.
Vision theorists differ widely on how to explain this process. There is no
agreement on the proper description of the stimuli, on the information avail-
able in the retinal image, on whether or what calculations are involved in re-
covering the scene from the image, and on much else. These are the sorts of
differences, alluded to above, separating theorists who, nonetheless, adhere
to the projective paradigm of picture perception. Where the paradigm’s pro-
ponents agree is in assuming the propriety of adopting their favorite model of
inverse optics to picture perception itself.
The guiding principle of the paradigm can be presented with the aid of Al-
berti’s Window, a method for constructing realistic pictures. As illustrated in
numerous treatises on art and perception, the method requires placing a
window between the artist and the scene to be depicted. The artist’s task is to
produce a picture that will duplicate the light rays at the point where they in-
tersect the window on their way to the artist’s eye. If a picture so constructed
is then substituted for the window, it will project the same bundle of light rays
to an observer’s eye as the original object—as long, that is, as the observer re-
mains at the artist’s original location, the so-called “station point.” All this is
simply a matter of optics. *[See chapter 11, figure 11.3.]
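The geometry behind this claim can be put as a small worked sketch. What follows is only an illustration under simplifying assumptions of my own choosing: a pinhole eye, a flat window on the plane z = 1, and a single scene point. The function names (project_to_window, rays_coincide) and the coordinates are hypothetical and are not drawn from Alberti or from the perception literature. The sketch marks the window where the ray from scene point to eye crosses it, then checks that the mark and the scene point lie along the same ray from the station point but not from a displaced eye.

```python
import math

def project_to_window(eye, point, window_z):
    """Return where the ray from eye to point crosses the plane z = window_z."""
    ex, ey, ez = eye
    px, py, pz = point
    t = (window_z - ez) / (pz - ez)  # fraction of the way from eye to point
    return (ex + t * (px - ex), ey + t * (py - ey), window_z)

def direction(eye, target):
    """Unit vector pointing from eye toward target."""
    d = [t - e for e, t in zip(eye, target)]
    norm = math.sqrt(sum(c * c for c in d))
    return tuple(c / norm for c in d)

def rays_coincide(eye, a, b, tol=1e-9):
    """True if a and b are seen along the same direction from eye."""
    da, db = direction(eye, a), direction(eye, b)
    return all(abs(x - y) <= tol for x, y in zip(da, db))

# Illustrative values only (assumptions, not data from the text).
station_point = (0.0, 0.0, 0.0)   # the artist's eye
scene_point = (2.0, 1.0, 10.0)    # a point on the depicted object
window_z = 1.0                    # the window lies on the plane z = 1

# The mark the artist places on the window for this scene point.
mark = project_to_window(station_point, scene_point, window_z)

# From the station point, picture mark and scene point share one ray.
print(rays_coincide(station_point, mark, scene_point))

# From an eye moved half a unit sideways, they no longer do.
moved_eye = (0.5, 0.0, 0.0)
print(rays_coincide(moved_eye, mark, scene_point))
```

The first check holds by construction, since the mark was placed on the very ray that joins the station point to the scene point. The second fails because the displaced eye sees the mark and the scene point in different directions, which is the purely optical fact behind the restriction to the station point.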
According to the projective model, as the artist sees through Alberti’s win-
dow to the object, so the viewer of pictures “sees through” the picture surface
and locates the represented scene in space. There is a continuity, so to speak,
of the virtual space depicted and the environmental space perceived. “Seeing
through” is like “seeing” the real scene except the source of the stimulus is
not direct.
Implications
Once this projective paradigm is in place much else is taken to follow:
1. If perceiving pictures involves essentially the same processes and
mechanisms as perceiving objects, then pictures can be used as substitutes for real
objects in psychological experiments on vision. And such is common prac-
tice in visual research.
2. But, of course, in this context, the domain of countenanced pictures is
highly restricted. It does not include many of the things we ordinarily call
pictures. No one thinks of using caricatures, ancient Egyptian, or Cubist pic-
tures as substitute stimuli in experiments on, say, distance perception or
shape perception.
3. More significantly for our concerns, the study of picture perception itself
tends to be limited to this circumscribed domain. Only realistic pictures, pic-
tures constructed according to the rules of linear perspective, are assumed
to fall within the scope of visual theory. Accounts of the understanding and
cognitive role of other sorts of pictures are considered tangential to
perceptual theory. Why? Because it is hard to account for perceiving what they
represent in terms of inverse optics.
4. As a first approximation, then, once the domain of pictures is so delimited,
picture perception can be conceived along the lines of our everyday
perception of the environment. In turn, the approach visual theorists take in
explaining the perception of pictures depends mainly on the model of ordinary
perception they adopt.
Puzzles
If, as projectivists assume, picture perception is of a piece with ordinary
perception, how and why should there be any special puzzles about picture
perception? Well, all theorists recognize one problem peculiar to pictures. Al-
though most pictures represent three-dimensional scenes, there is normally
much information available indicating the picture itself is a flat surface. So
it is claimed, a conflict exists in the visual stimuli pictures afford. There is a
conflict between the two-dimensional cues of the picture’s own surface and
the three-dimensional pictorial cues. In some way the visual system must re-
solve such cue conflicts in order to perceive pictures. But how is this done?
On this matter there is little agreement. Various theorists propose models
in which the perceiver suppresses or ignores the two-dimensional informa-
tion. Others favor models which combine the two- and three-dimensional
cues forming a compromise perception of the represented space. Another
approach is to assume PURE picture perception is exhibited when or to the
extent the two-dimensional cues are eliminated or not available. As with
the physicist’s “frictionless surfaces” or “isolated systems,” only in appropri-
ately idealized set-ups is it possible to get at the real processes underlying the
mechanisms at work. I think the enormous experimental literature on pic-
ture perception involving monocular vision and other reduced viewing con-
ditions, or in trompe l’oeil situations where the two-dimensional cues are
ineffective, attests to the influence of these ideas.
Of course, things get much worse once more realistic viewing conditions
are considered. For it is not simply the presence of two-dimensional cues that
raises a problem. In most everyday situations, people are not located at the
station point when viewing pictures. Unfortunately inverse optics applied to
the retinal images a picture makes available from these other viewpoints does
not project to the same scene or layout it does from the station point. Off the
unique station point the stimulus array a picture affords is said to be distorted.
This, though, raises deep questions about how perception can work when the
stimuli are abnormal and hence misleading.
Such distortions would pose less of a problem if perception were itself dis-
torted in the way inverse optics predicts. And as Gombrich (1972) has pointed
out, many theorists have adopted this “curious myth.” A myth, Gombrich
notes, because it flies in the face of ordinary experience. Pictures do not look
terribly distorted when we move off the station point.
These days, few theorists maintain a very strong distortion thesis. It is gen-
erally admitted, for example, that a picture of the Cologne Cathedral is per-
ceived, by and large, as representing the same view and shape of the building
whether the picture is looked at from the station point or from a side. This
fact, the resistance of perception to distortion, is attributed to, and referred to as,
the “robustness” of perspective.
Robustness, while perhaps welcomed by the painter or photographer, is
quite bothersome to the projectivist. For how can perception be robust when
the stimuli are distorted? Examination of the picture perception literature
would show this issue is a, if not the, primary focus of current research. Here too,
there is no agreement on its solution.
Some theorists deny the significance of robustness. They maintain picture
perception is not robust if the observer is deprived of inappropriate informa-
tion, in particular, cues indicating the presence of the flat picture surface.
PURE picture perception, again, is just inverse optics. Others hold the visual
system takes into account the observer’s location, recalibrates to the station
point, and then solves the projection problem along usual lines. Gibsonians,
eschewing “taking account” models of perception in general, search for rele-
vant higher-order stimuli, stimuli that remain invariant from one observa-
tion point to another. *[Picture perception is a sore point for Gibsonians. On
the one hand, they wish to treat picture perception as akin to ordinary percep-
tion, and explain the phenomena in terms of a theory of direct perception.
On the other hand, pictures, like visual images, are the sorts of perceptual in-
termediaries Gibsonians deny play any role in everyday perception. Fitting
these two assumptions together is no easy task.]
Finally, and most distressing to numerous visual theorists, the distortion/
robustness issue leads them to think it difficult, if not impossible, to make
evolutionary sense of our ability to perceive pictures. Our visual system, after
all, evolved to solve the projection problem in the everyday physical envi-
ronment. Yet we readily perceive pictures under conditions in which the
straightforward application of favored models of inverse optics breaks down.
Since the ability to perceive pictures could not have had independent sur-
vival value, how, they wonder, could this capacity have ever evolved?
Thus, once theorists adopt the projective paradigm, puzzles abound. Among
them are: (a) cue conflict, (b) cue distortion, (c) robustness, and (d) evolutionary
coherence. There are, in fact, two other problems with the projective
paradigm that usually go unrecognized or are ignored.
As mentioned, the inverse optics approach only seems plausible for a very
small subset of what we ordinarily call “pictures.” Caricatures, ancient Egyptian
pictures, Cubist pictures, and many more are not considered. Nevertheless,
we readily perceive and understand these depictions. Their status remains
most unclear. And the rationale for splitting them off and treating them sep-
arately from perspective pictures remains in need of adequate defense.
The projective paradigm provides, too, no ready means for dealing with
various referential aspects of pictures. I have in mind here the sorts of issues
Goodman presses at the beginning of Languages of Art in criticizing
resemblance theories of representation. He shows the resemblance model cannot
account for aspects of fictive representation, misrepresentation, or the mun-
dane fact that identical twin brothers or the several prints of a lithograph are
not ordinarily understood as representing one another. These features of pic-
torial representation do not seem to be explainable in projectivist terms.
Symbolic Paradigm
I assume everyone in this audience is familiar with Goodman’s symbolic
paradigm of representation, and I will not review it. I wish only to call attention
to a few salient features of this approach. In contrast to the projective model,
the symbolic model assumes referential aspects of pictures are basic to their
function. Thus, pictures are treated on analogy with languages as a form of
symbolization. This idea was foreshadowed in Goodman’s (1960) article “The
Way the World Is.” There he argued both that the picture theory of language
is misguided and that adopting a language theory of pictures gives a better
account of pictorial representation.
In Languages of Art Goodman extends the thesis. Pictures along with lan-
guages are just two of a very wide range of symbolic forms. Maps, gauges,
music notation, graphs, diagrams, and the full range of what we ordinarily
call pictures (caricatures, ancient Egyptian and Cubist pictures, etc.) are given
a place.
Once the symbolic paradigm is in place much else follows. Switching the
focus of the analysis in this way provides an alternative perspective on many
of the puzzles plaguing the projectivist. It may, indeed, help resolve them. To
begin, the symbolic paradigm provides a framework for handling issues of
reference and misrepresentation, issues hard to handle while confined to
the resources of the projective model. The symbolic paradigm, moreover, does
not require the seemingly unmotivated constriction of the domain of pic-
tures and pictorial perception. It offers, instead, a motivated basis for classi-
fying symbolic systems, pictorial and non-pictorial, in terms of syntactic and
semantic properties.
The symbolic paradigm also offers a different slant on the visual problems
confronting and confounding the projectivist. Consider first the matter of
cue conflict. The symbolic model sees no need to think of the cues caused
by the flatness of the picture surface as in conflict with the three-dimensional
pictorial cues. The point is obvious in the context of other forms of
symbolization. The sentence “Cologne is on the Rhine” makes a claim about the
environment, and in this sense has three-dimensional significance. We do
not, however, think the cues informing us of the sentence’s status as a two-
dimensional written symbol in any way conflict with the three-dimensional
interpretation of its content. The symbolic paradigm suggests a similar ac-
count may be offered for perceiving pictures. We perceive a two-dimensional
pictorial symbol as having three-dimensional significance.
Along similar lines, the symbolic approach may offer help with the distor-
tion/robustness problem. Consider a sign bearing the sentence, “The Cologne
Cathedral is just ahead.” The sentence is about the Cathedral and offers
information about its location. There is nothing perplexing, though, about how this
sign can be taken to represent these spatial relations when the sign is viewed
from the side instead of straight-on. The stimuli and visual experiences of the
written sentence may change somewhat as we move about, but within limits
we perceive the shapes of the letters correctly. Veridical perception of the writ-
ten sentence, the representation, is all that is required to assess its content or
meaning properly.
The symbolic paradigm suggests a similar approach to picture perception.
A picture of the Cologne Cathedral may depict it as at a particular distance
and having a particular size and shape. It makes no difference to this repre-
sentational content whether the picture itself is viewed straight-on or from
off its station point. True, the stimuli the picture affords change as we move
about, and the perceptual experiences of the picture may differ to an extent.
Yet, within limits, it is possible to perceive the shapes and relationships of the
picture pretty much as they are. And that is what it takes to comprehend the
picture’s representational content.
The evolutionary dilemma projectivists confront is also given a new twist
on the symbolic model. The locus of the problem is shifted, along with pos-
sible approaches to its solution. The paradigm suggests treating the issue not
in isolation but in the context of other forms of symbolization. There is, for
example, much controversy about the correct evolutionary account of the
human language capacity. Yet no one supposes our ability to understand the
meaning of written sentences is a deep problem for an evolutionary account
of vision. Language comprehension depends on mastering the interpretive
principles of the system. The failure of written words to replicate projectively
what they represent does not stand in the way. Our ability to understand
pictures may be best understood accordingly. Appreciation of the
representational content of pictures requires having the requisite skills of interpretation.
And disparities between the depiction and the depicted are no bar to this.
Humans do have an amazing, perhaps species-defining, capacity to use
many kinds of symbolic systems. Among the systems humans master are lan-
guages, graphs, and diagrams, systems whose representational schemes are
relatively unconstrained. Other systems of representation, including mime,
Greek sculpture, realistic pictures, and for that matter ancient Egyptian pic-
tures, are more systematic and in this way more constrained. Mastering the
interpretative principles of these systems would appear the easier task. If this
is so, their acquisition or development should pose less, not more, of an evo-
lutionary quandary.
Reasons for Resistance
Given all the help the symbolic paradigm seems to offer the perceptual psy-
chologist, why the reluctance to accept it?
I think this is primarily due both to a misreading of what the symbolic par-
adigm claims and to a prevalent assumption about the nature of vision. I will
look at these each in turn.
Projectivists believe because the symbolic paradigm claims pictures func-
tion like languages, the model must and does claim pictures are languages.
Projectivists, however, are convinced empirical evidence shows the mecha-
nisms involved in reading pictures, and the routes leading to the develop-
ment of this skill, are not the same as those underlying the ability to read
linguistic texts. Thus they find the symbolic paradigm untenable. (Such com-
plaints are repeated over and over in criticism of Languages of Art.) These
complaints, though, rest on a misconception. The symbolist admits, indeed
insists, depictional and linguistic systems differ in syntactic and semantic
principles. Reading pictures, therefore, is not identical with reading words.
But symbolists find here no basis for abandoning their paradigm. After all, as
the above discussion makes clear, perceiving pictures typically is “not exactly
the same” as perceiving the real three-dimensional environment. What’s
more, the simple dichotomy of symbol systems into pictures and languages
is much too blunt. It leaves no obvious place for a range of other symbolic
forms, maps, models, diagrams, music notation, and a whole lot more. The
dichotomy serves to misdirect and obscure the study of the psychological
mechanisms underpinning mastery and competence of these systems.
Projectivists tend to ignore such forms of representation and the issues
they raise for a theory of perception. Instead, projectivists merely assume the
major break among kinds of symbolic systems is between their chosen do-
main of realistic pictures and all the other types of description and depiction.
This narrow class of depictions is thought to constitute a “natural kind,” the
proper subject of investigation in the study of picture perception. But what is
the rationale and motivation for this claim besides steadfast commitment to
the paradigm? This leads to the second reason projectivists have for rejecting
the symbolist’s aid.
I think a formative intuition is the idea that understanding pictures is
something our visual system does, without cognitive intrusion.
Comprehension of other kinds of depictions and descriptions involves more than the
visual faculty. Extracting the representational content of caricatures, or ancient
Egyptian and Cubist pictures, like comprehending sentences in English, in-
volves cognition. By contrast, it is not necessary to interpret realistic pictures.
They are simply seen. Picture perception is something the visual system does
without the intrusion of “mental” interpretation.
The pervasiveness of this central intuition should not be mistaken for clar-
ity of formulation. There is no agreement among vision theorists as to what it
means for a process to be mental and no consensus at all where vision leaves
off and cognition begins. I have discussed this issue in detail elsewhere [1994]
and can no more than allude to some of the problems most germane to our
present concerns. *[See chapters 6, 8, and 11.]
Often in discussions of the boundaries of vision, “cognition” is equated
with conscious deliberation, and picture perception is said to be free of such
intrusion, hence, non-cognitive. This conception of cognition, however,
cannot serve to support the projectivist’s intuition. For comprehending sentences
is in this sense as non-deliberative or thoughtless a process as understanding
photographs. Yet language comprehension is supposed to be cognitive, going
beyond what is given in perception.
Another prominent account of cognitive intrusion appeals to learning. In
order to comprehend a sentence, we must learn the syntactic and semantic
features of the language. Skill at extracting the representational content of real
pictures is supposed to be different. It does not require experience or practice.
The sway of this idea is reflected in the importance attached to claims that
young children, or adults from distant cultures, comprehend perspective
pictures without instruction.
This attempt to underwrite the core intuition also runs into difficulties.
First, there is much dispute over the proper interpretation of the data on un-
tutored picture perception. Second, evidence for untutored comprehension
of perspective pictures must be understood in light of evidence showing
comprehension of cartoons, caricatures, and other kinds of non-realistic de-
piction may likewise not require explicit training. Third, in contemporary
theories of vision the learned/innate distinction does not pair up with the
cognitive/non-cognitive dichotomy supposedly underlying the core intu-
ition (Schwartz 1994).
Finally, contrary to prevalent assumptions, I do not think the focus on
learning truly gets at the heart of the projectivist’s intuition. For suppose
Latin were innate and required no learning to understand. The projectivist
would still want to maintain Latin should be grouped with languages and not
pictures. And the rationale would remain as before. Language comprehension
is a two-stage process, seeing the words and then mentally interpreting them.
Perceiving pictures is supposedly different. It is a one-stage process not requir-
ing interpretation. We simply “see through” pictures to the worlds they repre-
sent. There is no need for a second stage of interpretation.
Visual theory may explain seeing words, but surely it is no part of visual the-
ory to account for how we determine what words represent. In contrast, it is the
job of vision, not mind, to perceive what pictures represent. Which pictures?
Well, only perspective pictures; the rest are to be lumped with languages.
The State of Research
The above account of the competing paradigms, I believe, sheds light on the
uneasy state of research in picture perception. Usually in work on vision the
symbolic framework is disregarded, for the problems it raises are thought to
lie outside the scope of perception. If understanding a picture is like under-
standing a sentence, it is not a job for the visual scientist to investigate. At the
same time, the highly circumscribed set of issues and domain the projectivist
countenances make for a dubious research program. The projectivist studies
only perspective pictures and only up to the point where vision ends and
cognition begins. This puts the visual theorist in a bind.
If by severely restricting viewing conditions, the stimuli from picture and
object can be made identical, as they are in various experimental set-ups, then
there is nothing really left to explain about picture perception. Once outside
these non-standard confines, however, the stimuli afforded by pictures and
their represented objects diverge, the more so as motion is allowed. Then
there do seem to be distinctively pictorial phenomena for the visual
scientist to investigate. But the greater the discrepancy between depiction and
depicted, the less sense can be made of the projectivist’s thesis. With each step
beyond the limited domain of perspective pictures, the paradigm loses appli-
cation. Thus the paradigm has nothing to say about the vast range of repre-
sentations ordinarily classified as pictures.
A related tension lies in the formative intuition supporting the paradigm’s
delimitation of subject matter. The basis for claiming perspective pictures con-
stitute a “natural kind” for visual science gets its life from the assumption there
is a significant demarcation between the products of vision and the products
of mind. The comprehension of written language and non-realistic depictions
is regarded as a two-stage process. Vision stops after generating an uninter-
preted sentence or depictional display. Higher level cognitive mechanisms
take over from there and extract the representational content. In the case of
realistic pictures, the story is supposedly different. The representational con-
tent is extracted by the visual system. There is no need for a second stage.
Although this one-stage/two-stage distinction is easy to avow, it is not very
easy to give it empirical content (Schwartz 1994). In earlier times, matters
were more straightforward. The sensory domain was identified pretty closely
with features thought to correspond to the retinal image. And not much pro-
cessing was assumed to take place until central, cognitive centers of the brain
were reached. Today we know there is selection, supplementation, and dele-
tion beginning at the periphery and continuing to the end. The “innocent
eye” loses its innocence at the retina. So where is the projectivist to draw a
well-motivated line?
On the one hand, the more inclusively the scope of the visual is conceived,
the harder it is to exclude the perception of caricatures, Cubist pictures, and
perhaps even sentences from its domain. This is not acceptable to the projec-
tivist. On the other hand, a minimalist understanding of the visual raises op-
posite problems. A natural minimalist position might be to draw the boundary
of the strictly visual at the extraction of basic spatial information about the
environment. This, however, threatens to collapse the projectivist’s enterprise.
To treat a flat painted surface as a picture requires more than seeing it as a col-
ored object of a particular size, at a certain distance and direction. It must be
perceived not simply as an object in the world but as a representation. Here
commitment to the projective paradigm gets in the way. Inverse optics does
not readily accommodate many of the important aspects of picture percep-
tion highlighted by the symbolic paradigm. And this I believe is a major rea-
son for the uneasy state of research in picture perception. For stripped of
“interpretation,” of “reading,” of the accretions of experience and all else that
constitutes or contributes to referential and representational significance, a
picture cannot function to guide behavior, inform cognition, or enhance aes-
thetic experience. Or in Goodman’s terms, the projective paradigm has trouble
accounting for the role pictures play in making and remaking our worlds.
Note
* This paper is based on ideas further explored in “Two paradigms of picture perception:
The uneasy state of research on picture perception,” Report der Forschungsgruppe:
Perception and the role of internal regularities of the physical world am Zentrum fuer
interdisziplinaere Forschung der Universitaet Bielefeld, 1997.
References
Gombrich, E. H. (1972). “The ‘What’ and the ‘How’: Perspective Representation and
the Phenomenal World,” in R. Rudner and I. Scheffler (eds.), Logic and Art, Indianapo-
lis: Bobbs-Merrill, 129–149.
Goodman, Nelson. (1960). “The Way the World Is,” Review of Metaphysics 14, 48–56.
Goodman, Nelson. (1968). Languages of Art, Indianapolis: Bobbs-Merrill.
Schwartz, Robert. (1994). Vision: Variations on Some Berkeleian Themes, Oxford: Blackwell
Publishers.
Prescript 11
One problem haunting the symbolic paradigm of picture perception is often
thought to be insurmountable. Even if it can accommodate some of the intu-
itions underlying resemblance accounts, it is widely believed that it cannot
in the end capture a defining feature of pictorial representation, its “visual-
ity.” Although we use our eyes to see written text, there is a big difference be-
tween what goes on in such cases and what takes place in perceiving pictures.
Understanding written text is a cognitive act. It is not enough to see the
words; they must be read. Comprehending pictures is thought to differ. Pic-
ture perception is in some sense “direct” (of course, not in quite the sense dis-
cussed in section II on perceptual inference). It does not require an additional
cognitive act of “reading.” Pictorial information is accessed by strictly visual
processes. Herein lies the distinctive visuality of pictures. Furthermore, it is
only by failing to address the visuality issue that the symbolic paradigm can
dodge the troubling perceptual puzzles discussed in chapter 10.
Chapter 11 explores how the symbolic paradigm may capture what is sal-
vageable of the visuality intuition without abandoning its own principles.
This response to the challenge, however, does require rethinking claims about
the dividing line between vision and cognition. (See chapters 6 and 8.) I have
contrasted the symbolic paradigm with resemblance theories in general and
with projectivist versions of them. In this essay, I call such alternatives “surrogate”
models. For my purposes the labels are pretty much interchangeable,
although each suggests a different emphasis.
11 Vision and Cognition in Picture Perception*
In recent papers (1997, 2002) I have explored how two seemingly conflicting
paradigms inform the conception and study of picture perception. The dom-
inant paradigm, one especially favored by vision theorists, claims that seeing
a pictorial representation of an object is, with qualifications, like seeing the
object itself. The picture, being a geometrically sanctioned projection of its ob-
ject, resembles it, or otherwise serves as a mimetic surrogate, “re-presenting”
what it depicts (Danto 1982). Accordingly, pictorial representation is at its
best when, as in trompe l’oeil paintings, viewers cannot tell the picture, the
stand in or substitute, from the real thing.1 An alternative paradigm, the
symbolic model, championed most forcefully by Nelson Goodman (1968),
focuses attention on syntactic and semantic features of pictures. On this
account, pictures are importantly allied with other forms of representation,
including languages, maps, and music notation, and picture perception is to
be understood in this context.
In my earlier work, I attempted to show how adopting the symbolic ap-
proach could provide a framework for explaining several persistent problems
in the study of picture perception—a topic I will return to later. I also main-
tained that vision theorists’ reluctance to embrace this approach often rests
on a misunderstanding. Although the symbolic paradigm does stress that
pictures, as representations, function like languages, it does not claim they
are linguistic symbols. The model, in fact, insists there are significant syntac-
tic and semantic distinctions between linguistic and pictorial systems.
Critics of the symbolic paradigm, nevertheless, remain skeptical, tending
to resist efforts at a rapprochement. The symbolic model, they say, fails to cap-
ture a core intuition about pictorial representation, its “visuality.” Picture
perception is a matter of vision, whereas comprehending languages and other
symbol systems depends on cognition. Pictures being stand ins or re-presentations,
we are able simply to “see” them. By contrast, we must read or interpret
sentences in order to comprehend them. The former concerns the doings of
sense; the latter requires involvement of the mind or intellect.2
This conception of the difference between pictorial and other modes of
representation, however, runs into serious obstacles. First, pictures are obvi-
ously not unique in tapping the resources of vision. We see written words,
graphs, and music notation as well. Less noticed, some pictures, along with
some maps and sculpture, can be explored by touch, and so-called “haptic
pictures” are designed with this in mind (Kennedy 1993). So being a “visual
display” or being dependent on visual processing cannot be what distin-
guishes pictures from other modes of representation. Second, the distinction
between the visual and the cognitive is not only unclear and empirically suspect
(Schwartz 1994, 1996); it does not seem up to the job. Standard proposals
for drawing a visual/cognitive boundary fail to support the surrogate theo-
rist’s core intuition.
For example, it is often said picture perception is non-cognitive, since it
does not involve active deliberation or contemplation. But the same holds for
everyday cases of language comprehension. Without delay, pondering, or
conscious inference we read and understand immediately most sentences we
encounter. Identifying the cognitive with the learned and the visual with the
innate will not work either. Many animal signal systems are instinctive, but
they are not pictorial. And were humans equipped at birth with Latin, or as
some claim “Mentalese,” it would not mean these representational systems
were pictorial.3
Still, I think, proponents of the symbolic model can do more to accommo-
date the intuition of there being something more “visual” about pictures
than natural languages, music notation, and various other forms of repre-
sentation. And it is to this task I wish to turn my attention. The account I shall
sketch makes use of ideas Goodman employs in his taxonomy of symbolic
systems. Goodman argues that pictures, in contrast, say, to sentences in
language or scores in music notation, belong to dense schemes of represen-
tation. Pictorial symbols are analog, while English sentences constitute a
digital system.
Goodman further claims pictorial schemes are comparatively replete. Many
more of a symbol’s own properties function in determining its representational
content. A given line may be understood as a graph, plotting the height
of a mountain range, or as a picture of the same mountain (figure 11.1). Read
as a graph, the thickness of the line, its color, and background have no significance.
Interpreted as a picture, all these properties go to constitute the display’s
representational force. Notice something phenomenal, akin to a Gestalt
switch or aspect change, occurs when shifting between the two readings. And
experience of the line takes on another much different character when read
in the context of figure 11.2.
Figure 11.1
Figure 11.2
Simply making a graph more replete, however, will not turn it into a pic-
ture. Nor will assigning representational significance to the background do
the trick. For if these additional features of the graph stand for measures of
temperature, mass, and electrical charge, the display will still lack the “visu-
ality” associated with pictures. What more is needed?
Perhaps if a display is to function as a representational picture, the entire
surface of the display must have spatial significance. Each point, whether
marked or blank, is to be understood as mapping onto a spatial place.4 On this
account, the more replete graph just described will not count as a picture. The
thickness of the line now has significance, but it represents degrees of heat
not spatial locations. When, instead, the line is read as a picture of a moun-
tain range, the dimensions of the line, as well as the surface points above and
below, even the blank ones, take on spatial meaning. A list of numerical triplets
denoting spatial coordinates, though, is not a picture. It has the right kind of
significance for picturing, but lacks the appropriate analogue density and
repleteness to function pictorially.5
Requiring each picture point to have spatial significance does not mean
that spatial layout is the main or most important information pictures convey.
Pictures can and do represent much more, including non-spatial properties.
My claim is only that the seeming visuality, or in Wollheim’s (1974) terms
“seeing in,” aspect of picture perception lies in the extent to which a picture
surface is given a spatial reading.6 It is in this way that seeing pictures is like or
resembles the everyday perception of real objects and scenes. Normally when
we look about the environment, the points comprising our visual field are
each given spatial location. Although, here too, assignment of spatial loca-
tion does not exhaust the information vision provides. And as with picture
perception, the placements in space may be imprecise, relative not absolute,
and even indeterminate.
Does this added requirement of a spatial reading collapse the distinction
between the surrogate and symbolic paradigms? I think not. Important dif-
ferences remain. In order to function as a representational picture, the sym-
bolic model now does require assigning or “projecting” spatial significance to
the display. But it does not require that a picture be a projection from any ob-
ject to the picture surface. This offers technical advantages in accounting for
the referential or denotative features of pictures, problems Goodman and many
philosophers find of considerable concern. For example, it enables the sym-
bolic model to sidestep difficulties in dealing with fictional representation
(e.g. unicorn pictures) and general representations (e.g. pictures accompany-
ing dictionary entries), cases where there are no actual objects from which
the pictures are projected.
The symbolic paradigm can also handle cases of misrepresentation in a
more natural manner than the surrogate model. Just as a sentence describing
Bill Gates may be inaccurate, so a picture of Gates may incorrectly character-
ize him. The picture, faulty as it may be, refers to and depicts Gates. It repre-
sents Gates; it does not represent some other person the picture might better
copy or resemble. Alternatively, multiple prints of a woodcut of Gates repre-
sent the man, not the other strikings. This in spite of the simple identity pro-
jection from one print to the next.
Multiple representations and misrepresentations, along with fictive and
general representations, make up a large part of the pictures we encounter.
And as Goodman has argued, satisfactory treatment of these cases is impor-
tant if we are to understand the role pictures play in informing the mind and
guiding behavior. This would seem to require attention to the referential fea-
tures of picturing, features anti-interpretivist approaches tend to ignore.
More significantly for present concerns, the surrogate paradigm faces vari-
ous perceptual problems the symbolic approach can more readily avoid. The
symbolic paradigm has no need to maintain that only one or a small circum-
scribed group of optically sanctioned projection schemes is required for pic-
torial representation. Nor need the model presume there is a singular, visually
correct way to depict space. Surrogatists, however, seem committed to the
idea that what distinguishes pictures from other forms of representation is
that pictures resemble, copy, or otherwise serve as visual stand ins for what
they represent. At the same time, surrogatists hold that only certain kinds of
projective displays, primarily those constructed according to the rules of lin-
ear perspective, render space mimetically. Only these renderings depict the
world as it is seen, non-conventionally describing what they re-present.
But then, Egyptian, Haitian, and Cubist renderings are problematic, as are
cartoons and caricatures. They, along with much else found in museums and
magazines, are in some sense not “genuine” pictorial representations, since
they do not look like, re-present, or provide the same cues or stimulus infor-
mation as the objects they depict. Such renderings are not full-fledged stand
ins. They, perhaps, are better understood along the lines of languages, maps,
and graphs as arbitrary, conventional representations. For they do require in-
terpretation to be understood, and in so doing likely cross the visual/cognitive
border. Accordingly, such pictures are not appropriate to take the place of real
objects or layouts in psychological experiments. Photographs and depictions
done in perspective pretty much make up the domain (Schwartz 2002). Perspective
pictures form a “natural kind” among representations and thereby
constitute a natural kind for vision science.
The symbolic paradigm is under no similar pressure to relegate Egyptian,
Haitian, Cubist, cartoon, and caricature representations to linguistic or quasi-
pictorial status. The syntactic and semantic properties of these representa-
tions serve to group them with other sorts of pictures, as well as distinguish
them from linguistic systems of representation. And the added requirement
of a spatial reading attempts to account for the particular “visuality” repre-
sentational pictures possess.
In contrast, surrogatist theorists’ grounds for distinguishing pictures from
non-pictorial representations and for placing Egyptian, Haitian, Cubist,
cartoon, caricature, and other “non-resembling” renderings among the pictorial
remain unclear. Likewise, their basis for limiting the natural kind of
full-fledged pictures to perspective renderings remains in need of explication
and justification. Often surrogatists are willing to sidestep these issues, sup-
posing an intuitively obvious visual/cognitive dichotomy can support the
distinctions they draw. But earlier we saw that reliance on this dichotomy has
its own difficulties.
Another response to these matters takes its cue from the fact that light rays
coming from pictures drawn in perspective can, under appropriate viewing
conditions, duplicate those from the scenes depicted. This is thought to ex-
plain what is special or “natural” about perspective depictions. Some variant
of Alberti’s Window (figure 11.3) is often used to make the point. A correct
surrogate is a transcription onto an opaque surface of the light rays that would
strike a window through which the represented scene might be viewed. If
a picture is so constructed, then the light rays reaching the retina from the
depiction and the depicted will be identical. Each projects the same two-
dimensional image onto the surface of an eye located at the station point.
Viewing the picture substitute, the stand in, is comparable to viewing the ac-
tual layout through a window. It is, therefore, obvious why such pictures
resemble what they represent, and equally obvious why comprehending
these sorts of pictures is “purely perceptual” or not dependent on “cogni-
tive” functions. The ideal mimetic substitutes, trompe l’oeil paintings, clinch
the case.
Figure 11.3 [labeled: station point]
Figure 11.4
Figure 11.5 [panels (a) and (b)]
Although the geometry of the Alberti’s Window analysis is not to be denied,
adopting this account of the special status of “natural kind” pictures
raises as many questions as it resolves. For as this geometry also makes apparent
(figure 11.4), every perspective picture is theoretically ambiguous. An infinite
number of different objects, at different distances and orientations, will
project the same bundle of light rays to the eye. But typically a picture is taken
to represent only one of these physical layouts, not all of the scenes its pro-
jected light rays mimic with total accuracy. The conditions employed in the
classic Ames chair demonstration can serve to underscore the issue. From a
certain vantage point, both a real chair and a set of optically aligned but
non-contiguous pieces of wood will project matching images on the eye (figure
11.5). In turn, Alberti Window renderings of these two setups will be identical.
Nonetheless, in ordering furniture from a catalog, we knowingly or unknow-
ingly discount the possibility we will be shipped six odd pieces of lumber.7
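The geometric point about ambiguity can be made concrete with a minimal sketch. It assumes an idealized pinhole eye at the station point (taken as the origin) and a picture plane at distance d in front of it; the function name and the sample coordinates are illustrative assumptions, not anything drawn from the text. Under those assumptions, any two points lying on the same line of sight land on the same spot of the picture plane, however far apart in depth they are.

```python
from fractions import Fraction

def project(point, d=1):
    """Central projection of a 3-D point (x, y, z), z > 0, onto the plane z = d,
    with the eye (station point) at the origin. Idealized pinhole model."""
    x, y, z = point
    return (Fraction(x) * d / z, Fraction(y) * d / z)

near_point = (1, 2, 4)
# Hypothetical second point: same line of sight, three times farther away.
far_point = tuple(3 * c for c in near_point)

# Both points project to the same location on the picture plane.
print(project(near_point) == project(far_point))  # True
```

Since the scaling factor is arbitrary, the same projected mark is compatible with indefinitely many depicted layouts, which is the ambiguity the Ames demonstration exploits.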
Conversely, any set of marks on a picture surface is an Alberti’s Window
copy of an unbounded number of possible real world scenes. Every picture,
from Photo-realist to Cubist to the most abstract drawing, is a correct linear
perspective rendering of some, indeed many, possible spatial arrays. Hence a
picture can be deemed to meet or fail to meet the canons of linear perspective
only with respect to assumptions about what it is taken to represent. Distin-
guishing “genuine” mimetic pictures from other representations presup-
poses some account of their referential or denotative underpinnings.
These days a related set of geometric problems attracts more attention
among vision theorists than the puzzles of pictorial ambiguity. Light rays
striking the retina from the environment and from a picture drawn in per-
spective are identical only when the picture is viewed with one eye, and the
eye is located at the station point of the projection. Look at the picture with
two eyes, or with one eye anywhere off the station point, and the light rays
from the picture and those from the real scene no longer match. The discrep-
ancy is more pronounced with many of the motion-dependent sources of
spatial information Gibsonians and other contemporary vision theorists
champion. For example, in moving toward real objects the expansion rate of
the retinal image is greater for near things than far. This is not the case with
pictorial objects located at different depicted distances; the rates of expan-
sion are the same.
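A rough numerical sketch may help fix the contrast in expansion rates. It assumes, purely for illustration, an idealized pinhole eye and fronto-parallel objects whose projected size is inversely proportional to their distance; the function name and the distances chosen are hypothetical. Real objects at different depths then magnify by different factors as the viewer steps forward, whereas everything depicted on a single picture surface magnifies by one common factor.

```python
from fractions import Fraction

def magnification(distance, step):
    """Factor by which projected size grows when an idealized pinhole eye
    moves `step` closer to a surface at `distance` (size taken as
    inversely proportional to distance)."""
    return Fraction(distance, distance - step)

step = 1
# Real scene: a near object at 2 units and a far object at 10 units.
real_near, real_far = magnification(2, step), magnification(10, step)
# Picture: both depicted objects lie on one surface, here 5 units away.
pictured_near = pictured_far = magnification(5, step)

print(real_near, real_far)          # 2 10/9
print(pictured_near, pictured_far)  # 5/4 5/4
```

The near real object doubles in projected size while the far one grows by only a ninth; the two pictured objects, whatever their depicted distances, both grow by a quarter.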
Concern with stimulus differences resulting from variations in viewer
location sets the stage for much current work in picture perception (Rogers
1995). The station point is taken as the correct or normal place to view pic-
tures. And the geometry and optics associated with this observation point
are canonical. Once off the station point, the light rays a picture projects do
not match and are often said to distort what they represent. This, though, is
found puzzling for vision theory, since people tend not to perceive pictures as
distorted when viewed from these alternative locations. They usually under-
stand the representational significance of a picture much as they do when ob-
serving it from its station point. An accurate perspective picture of the White
House, for example, provides much the same information and experience of
the building’s facade whether the picture is viewed straight on or from off to
a side. People, in fact, move back, forth, left, right, even up and down, to im-
prove their appreciation of pictures. Thus, a major challenge facing the visual
theorist is to explain this so-called “robustness” of perspective. Notice, ap-
peal here to the notion of “resemblance” is not so much an answer to the
problem of robustness as it is a restatement of the difficulty. For the issue then
is to explain how and why a picture can resemble its referent, when the stim-
uli projected from both are quite different.
These days a converse station point phenomenon has also been receiving a
lot of attention. Although perspective renderings are by and large robust, the
perception of certain features of some perspective pictures does not remain constant
when a viewer moves laterally with respect to the picture surface. It has
long been known that the eye gaze of a depicted person will often appear to
follow a viewer as he or she moves left and right. The Mona Lisa is a classic ex-
ample of this phenomenon. And the famous World War I “I Want You” poster,
in which Uncle Sam’s finger appears to point directly at viewers no matter
where they are standing, is another prime example. Perceptual experiences of
real faces and fingers, however, do not alter in these ways in response to ob-
server movements. Real things do not follow you about. Instead, different
portions of the object or scene come into view. So again there is a discrepancy
between picture perception and ordinary perception that needs to be squared.
Surrogate theorists also must acknowledge and explain why some pictures
drawn in correct perspective, nevertheless, do not look right to viewers. For
example, representations of spheres toward the periphery of a scene are found
more acceptable, appear less distorted, if they are drawn as circles not ellipses.
Yet, real spheres, so located with respect to an observer, project elliptical not
round images on the retina (Pirenne 1970). In addition, people frequently fail
to notice anything amiss with pictures that violate the canons of linear per-
spective. Most viewers sense nothing strange or distorted, nor find it difficult
to understand engineering drawings done according to a scheme of isometric
projection, a system in which parallels perpendicular to the picture plane do
not converge. And it takes time and often instruction for many viewers to
appreciate the “distortions” in Cézanne’s or Van Gogh’s renderings of space.
These assorted phenomena of picture perception not only pose empirical
challenges to the surrogate paradigm, they go some way in undermining its
rationale. They make it harder to sustain a very strong claim that perception
of pictures does not tap resources beyond those employed in seeing the ordi-
nary physical environment. Some additional help must be recruited. Fur-
thermore, to remain within the spirit of the surrogate paradigm, this help
should be “visual” not “cognitive.”
While these station point and related phenomena do require explanations,
they do not pose the same, or as pressing, a problem for the symbolic approach.
If pictures are understood as allied with other forms of symbolization, their ro-
bustness in response to alterations in viewpoint might be expected. No one is
surprised that words maintain their significance when looked at from varying
angles. There is no difficulty, as long as the letters are not thereby distorted
and incorrectly perceived. And talk of a canonical, non-distorting, or correct
point to view words seems strained. Similarly, it may be held that a suitable
reading of a picture depends on seeing the picture itself, the representation,
correctly. If, as with letter recognition, this can be done with reasonable accuracy
from different locations, there is no reason why the relevant assignment
of spatial meaning to points on the picture surface should be compromised.
An account of the effects of motion might follow the same line. The expan-
sion rate of the retinal image associated with moving toward a printed word
does not alter how the word is understood. The resulting changes in the
stimuli or look of the word are accorded no representational significance. The
story might be the same with pictures. The fact that near and far depicted
items expand at the same rate does not importantly alter perception of the
picture surface. So it does not alter or distort our appreciation of the sizes and
distances represented.
This approach to station point problems gains support from and fits in
nicely with the more inclusive conception of pictorial representation the sym-
bolic paradigm promotes. In focusing on the robustness of perspective ren-
derings, there is a tendency in vision studies to overlook the fact that viewing
angle and distance also have little effect on perceiving pictures that violate
surrogatist criteria. The perception and understanding of cartoons and cari-
catures, as well as Egyptian, Haitian, and Cubist paintings are robust. Their
representational significance, like that of pictures done in linear perspective,
remains constant with changes in viewing angle and motion. Yet the very
idea of a station point may be as otiose with many types of non-mimetic pictures
as it is with words. Then again, I think the phenomenon of perspective robustness
would not itself appear so puzzling, if less significance were accorded
the view and geometry associated with this singular point.
Admittedly, perception of the picture surface is not always constant. The
experienced shifts in direction of depicted eye gaze, fingers, and other objects
that accompany movement are quite noticeable. Comparable movement-produced
changes in perceived orientation do not usually accompany the appearance
of written text.8 But allowing for these sorts of perceptual differences
between pictures and language need not undercut the symbolic model.
Remember, the model does not claim that pictures are just like words. It insists
that the syntactic and semantic properties of these systems are quite different,
and there is no reason why such differences should not have perceptual repercussions.
Indeed, better understanding of these orientation phenomena may
require appeal to the resources the symbolic paradigm makes available.
Consider a picture of the star-crossed lover Juliet looking straight ahead,
out of the canvas, at Romeo. Romeo is in the foreground, back toward the
viewer, with Juliet’s face appearing over his shoulder. Clearly a mobile ob-
server, flattered that Juliet’s gaze is not fixated on her true love but follows the
viewer about, misunderstands the painting. Correct interpretation of the rep-
resentational significance of the picture requires overcoming the temptation
and reading Juliet’s gaze as firmly and solely directed at her beloved. By con-
trast, what makes the Uncle Sam poster so attention getting and intriguing is
that it purposely forces us to confront this very sort of visual problem.
Unfortunately, the twofold nature of picture perception these orientation
cases highlight is obscured by the surrogate paradigm. The Alberti Window
metaphor used to elaborate the model makes it more difficult to separate the
issues properly. If looking at pictures is like looking at objects and scenes
through a window, it is easy to run together the pictorial space represented
with the ambient space of the viewer. This running together of spaces is, after
all, what should be and is done when looking at the real world through a win-
dow. And such a blending of real and depicted space is just what happens
with trompe l’oeil pictures. Implicit or explicit acceptance of the window
metaphor, I think, also contributes to some confusions found in discussions of
experiments aimed at determining where subjects say pictured objects are ori-
ented with regard to their own location. A road running down the center of
a painting is perceived to point at the viewer when seen straight on. Move to
the left, and the road continues to come at the viewer, although it now seems
to run right to left. Move to the right and comparable experiences occur. And
it is possible to measure these rotation effects fairly precisely. All this is quite
fascinating and calls out for an explanation. Nonetheless, I would claim the
significance of such phenomena must be understood in proper context.
Treating a display as a representational picture does require giving a spatial
reading to its surface. It does not require the spatial assignment be to places
in or located with respect to the viewer’s current environment or with respect
to the surface of the physical picture. Correct interpretation of the road scene
depends on situating the road with respect to the other items depicted. The
picture informs us about the represented space, not our own. As with the pic-
ture of Juliet and Romeo, failure to pay heed to this distinction can lead us
astray. Undoubtedly, the picture does not represent the road as wandering back
and forth, or as pointing in all directions at once, or as having any particular
spatial relationship to the actual picture surface.
Earlier I mentioned that for the surrogatist, trompe l’oeil pictures are the
paragon of success, the best to employ as stand ins, as re-presentations. For the
symbolist these pictures are something of an anomaly. Their deceptive suc-
cess rests on abnegating the viewer’s appreciation of the picture surface. They
are “mistakes of the eye,” because they fool the perceiver. The picture is not
taken or understood as a representation, a symbol, but is mistaken for the real
thing. The situation is comparable to that of the much studied experience of
subjects made to look one-eyed through an aperture at a picture done in per-
spective. Under these conditions, the light rays from the picture match those
that would be projected from the represented layout. Not surprisingly, subjects
frequently are unaware they are looking at a picture, and believe they are per-
ceiving a real object or scene. No additional mechanisms are brought in to in-
terpret the stimuli as a representation. The picture is simply “seen,” the result of
sense perception, a matter of “vision” not “cognition.”
Now I have indicated before that I am skeptical of drawing a sharp, principled
boundary between the visual and the cognitive, and thus I have some diffi-
culty knowing what this last sentence actually claims. Perhaps it is enough to
agree in the case of such deceptive representational set-ups no additional re-
sources are recruited over and above those employed in normal seeing. We can
leave unresolved whether or the extent to which normal seeing is itself non-
cognitive. Still, deceptive representations and set-ups aside, more everyday in-
stances of picture perception do involve the viewer’s appreciation of the picture
itself as an object in the environment, appreciation of the picture surface as a
bearer of a symbol, and appreciation of a represented pictorial space and objects
contained therein as distinct from the space surrounding the physical picture.
Proponents of the symbolic paradigm insist any adequate account of pictorial
representation must pay proper attention to these factors and their semantic
and syntactic consequences. For the viewer’s appreciation of these features of
the perceptual situation goes to determine the reading and interpretation given.
At this point, the surrogate theorist will rise with an old objection. “Isn’t it
clear that pictures, being mimetic stand ins, are different from languages, mu-
sic notation, and the like? Pictures are simply seen, they are not read or inter-
preted.” If this means there is a “visuality” to pictorial representations other
systems of representation lack, proponents of the symbolic paradigm, I ar-
gue, can agree. If the surrogate theorist insists the process of assigning spatial
significance to pictorial representations should not be called “reading” or
“interpretation,” but just plain “seeing,” the dispute may be only a war of words.
I am not convinced anything more does hinge on making the distinction. In
light of the help the symbolic paradigm may offer in developing a more com-
prehensive theory of picture perception, it would be a shame if the two sides
are kept apart over what may largely be a verbal dispute.
Notes
* Thanks to Carl Zuckerman for detailed criticism. Versions of this paper were pre-
sented at a memorial conference for Nelson Goodman at Harvard University and at the
Center for Interdisciplinary Research, Bielefeld, Germany.
1. I call this the “surrogate” paradigm. “Anti-interpretivist” might serve my purposes as
well. In any case, as I am using the label, what links surrogatists is their rejection of the
symbolic paradigm, not their sharing some single model of picture perception. Tellingly,
differences among vision theorists’ accounts of picture perception largely reflect dif-
ferences in their approaches to non-pictorial vision.
2. Variations of this criticism are leveled against the model not only by vision theorists,
but by art historians and philosophers who balk at what they take to be the symbolic
paradigm’s conventionalist implications.
3. Much is often made of evidence suggesting that infants (Hochberg and Brooks 1962)
and people from non-Western cultures (Deregowski 1989) can understand pictures
without training. My point here is not to challenge these empirical findings but to call
attention to the need to separate issues of learning and innateness from claims about
the form and conventionality of symbols.
4. For ease of exposition I talk of full spatial readings. It would be more accurate to say
that the visuality of pictorial representation is a function of the degree to which a dis-
play is so interpreted. Mappings typically are not to locations in the ambient space of
the physical picture, nor necessarily to any real world locations.
5. Similar considerations may help distinguish haptic pictures from braille linguistic
symbols.
6. The spatial reading requirement is meant only to capture various intuitions about
the visuality of representational pictures. It is surely not sufficient for distinguishing
“realistic” from “non-realistic” pictures (see Schwartz 1974).
7. I use the Ames chair to highlight problems about the “appropriate” rendering of
scene (b) and the station point assumptions on which it depends. I am not here ques-
tioning the “generic viewpoint” constraint, thought to resolve some cases of ambigu-
ity in ordinary perception (Hoffman, 1998).
8. It might, however, if a pointing finger like Uncle Sam’s were a letter or word in some
language.
Bibliography
Danto, A. (1982). “Depiction and Description.” Philosophy and Phenomenological
Research 43, 1–18.
Deregowski, J. (1989). “Real Space and Represented Space: Cross-Cultural
Perspectives.” Behavioral and Brain Sciences 12:1, 51–74.
Goodman, N. (1968). Languages of Art. Indianapolis: Bobbs-Merrill.
Hochberg, J. and V. Brooks. (1962). “Pictorial Recognition as an Unlearned Ability.”
American Journal of Psychology 75, 624–28.
Hoffman, D. (1998). Visual Intelligence. New York: W.W. Norton.
Kennedy, J. (1993). Drawing and the Blind. New Haven: Yale University Press.
Pirenne, M. (1970). Optics, Painting and Photography. Cambridge: Cambridge
University Press.
Rogers, S. (1995). “Perceiving Pictorial Space.” In W. Epstein and S. Rogers (eds.)
Perception of Space and Motion. Boston: Academic Press, 119–63.
Schwartz, R. (1974). “Representation and Resemblance.” Philosophical Forum 4,
499–512.
———. (1994). Vision: Variations on Some Berkeleian Themes. Oxford: Blackwell Publishers.
———. (1996). “Directed Perception.” Philosophical Psychology 9, 81–91.
———. (1997). “Pictures, Puzzles, and Paradigms.” Philosophia Scientiae 2, 231–42.
———. (2002). “Two Paradigms of Picture Perception.” In R. Mausfeld and D. Heyer (eds.)
Perception Theory: Conceptual Issues. New York: Wiley, 257–70.
Wollheim, R. (1974). On Art and the Mind. Cambridge: Harvard University Press.
IV Missing the Real Point
Prescript 12
Studies of object perception, its origin, and its onset are the focus of much at-
tention in perception and developmental psychology. Impressive experi-
ments have been run and interesting evidence amassed that is thought to
speak to these concerns. Much of the work, however, makes little effort to be
precise or justify the “object” concept employed. Chapter 12 explores per-
plexities that arise from this laxity.
W. V. Quine has a widely cited, formally clear criterion for determining the
ontological commitment of discourse. His notion of an ontological object,
though, is not what perception theorists and developmental psychologists
mean by the term “object.” Quine’s criterion is purposefully unrestricted and
indiscriminate; it countenances everything that is or exists. This “object” notion
is far too inclusive for the perceptual issues being studied. For such purposes
it seems necessary to distinguish real objects from the ontologically
possible, but psychologically spurious ones. Absent this distinction there is
no well-defined subject matter for research to confront. Narrowing down the
domain of objects to the “real” has its difficulties. The options examined in
this essay face two serious problems: (1) they do not exclude as real objects
things that are thought to be spurious and (2) central claims said to be im-
portant and peculiar to object perception also hold in perceiving spurious
items. When these problems are confronted, recent criticisms of Piaget’s
views on object perception are no longer as telling as they are frequently
thought to be.
Those working in the field of object perception will undoubtedly feel I have
missed the “real” point. The editors of the volume who commissioned the
article responded so to the first draft of this paper. I rewrote it in an effort to
allay their fears and concerns. The quote below, from the editors’ introduc-
tion, is an indication of where matters were left.
This chapter generated by far the most debate among the author and editors. . . . Who
wants to be told their focus of study may not be coherent? Like Justice Stewart’s crite-
rion for recognizing obscenity, we all think we know what an object is when we see one.
Yet, as Schwartz suggests, the core notions are not at all simple or settled.
P. Kellman and T. Shipley
12 The Concept of an “Object” in Perception and Cognition
Object recognition . . . is often taken as the primary goal of a visual system. Surpris-
ingly, a significant obstacle in the path of understanding object recognition is that we
lack a precise definition of what constitutes an object. Without such a definition, how
can we possibly know where we are headed? Furthermore, any computational theory
of object recognition becomes impossible, for what is to be computed?1
Whitman Richards
In the theory of vision, object recognition has long been a topic of interest.
For today’s computational theorists it is a core area of study. As Richards indi-
cates, this computational work has brought with it considerable pressure to
find a precise definition of the notion of an “object.” The last number of years
has also witnessed an explosive growth of research in developmental psy-
chology concerning the perception and conception of objects. Beautiful
experiments have been conducted on ever younger infants attempting to de-
termine their earliest awareness or appreciation of objects. The current trend
has been to set the date closer and closer to birth. Many developmentalists
assume, in fact, that the only way to account for the phenomena they have
discovered is to assume the concept of an “object” is innate. But these devel-
opmental claims, like those of vision theorists, would seem to presuppose
some acceptable characterization of objecthood.
What is it then to be an object? In turn, what is it for an organism to perceive
or conceive of something to be one? Finding a precise, computationally sat-
isfactory specification of objecthood has proven to be an elusive task for vi-
sion theorists. For example, even if it is agreed that a car is an object, is its
radio also an object or is it only a part of the car? And what is the status of the
radio’s volume control knob or the left half of this knob? Do they each fall
under the concept “object?” Similarly, consider the car’s fender. Is it a part of
an object when attached to the car but an object in its own right when on
the warehouse shelf? And is the dent in the fender itself an object or merely a
feature of one? To raise these questions is enough to see the extent of the
difficulties, and this without pressing for answers to questions about the ob-
jecthood of non-solids (e.g. the gasoline in the tank), or two-dimensional
items (e.g. decal emblems and the car’s shadow), or conglomerations of non-
continuous bits of stuff (e.g. the collection of the car’s tires), or extended con-
tinuous surfaces (e.g. garage walls, the driveway on which the car now rests,
and roads traversed).2
An Answer
In light of the remarks above, it might be thought we lack any plausible ac-
count of what it is for something to be an object. Not so! There is a perfectly
reasonable characterization of objecthood that is as simple and clear as it is
unhelpful. From an ontological point of view, everything that is, is an object.
And as W. V. Quine (1953, 1960) has forcefully argued, all it may mean to
treat something as an object ontologically is to be willing to quantify over it in
discourse. “To be is to be the value of a bound variable” is his motto.
Now there are those who rebel at the idea of granting existence and hence
object status to non-spatial items, assuming that everything that is or exists
must be material and observable. Herein lie the seeds of classic metaphysical
debates over the ontological status of abstract entities, such as numbers and
properties, or mental items, such as dreams or qualia. For our concerns these
controversies can best be ignored. Little will be lost by stipulating that the ob-
jects of perception and conception are all material spatial things.3
Still, this narrowed domain is not what theorists have in mind when they
talk of the visual segmentation of the world into objects or attribute to new-
borns an awareness of objects. For the domain of the spatial includes gerry-
mandered parts and sums as well as temporal segments of the material world.
Ontologically speaking, not only may a chair count as an object, but so can all
of its pieces, from the atomic to the large. In addition, the spatially separated
bits of carpet on which a chair stands, a chair plus a dachshund, or the com-
pound of the tip of a dachshund’s nose for two minutes and a chair for a
moment before, may all be treated as objects of quantification. But if any
assemblage of spatial material, or instantaneous temporal slices thereof, can
be understood to be an object, the computational task of vision remains
under-specified and the developmental importance of studying the onset of
object awareness is obscure.
The Problem
As generally understood, the problem for psychology is not to explain whether,
when, and how organisms perceive and conceive the full range of objects that
may be said to have “ontological” status. Instead, the intuition underlying
visual and developmental studies seems to be that there are real objects (e.g.
individual chairs and particular dachshunds) and there are parts, sums, and
temporal slices of what spatially exists (e.g. a chair plus dachshund, the dented
portion of a fender, or the sum of scattered carpet pieces) that are not objects,
or are spurious ones at best. The objects of perception and conception are the
real ones, and the goal of psychology is to explain when and how we perceive
and conceive of them.
This way of putting matters, however, is not entirely satisfactory. For we not
only see individual chairs and dachshunds, we do see chairs together with
dachshunds, we do see gerrymandered spatial sums and parts, we do see the
separated bits of carpet where the chair’s legs have left their impression, and
we do see temporal slices of them all.4 What’s more, if asked, we can identify
and label these things as such. So the idea cannot be that we are only able to
see and categorize “real” objects. It must be that although we can see the di-
mensions, respond appropriately, recognize, and even talk about spurious
things, we do not perceive or conceive of them as objects. But what does it
mean for something to be a real object and to be so taken?5 And are volumes
of gasoline, shadows, “natural” collections of real objects, and extended sur-
face areas objects in this sense? Or are these items, like the gerrymandered,
not to be seen as objects?
Quine’s criterion, by itself, is too wide to provide answers to these ques-
tions, since anything, including the spurious, can be an ontological object.
The criterion also seems too narrow in that to quantify over an item in dis-
course (or partake in discourse best interpretable as so doing), presupposes
having a reasonable degree of linguistic competence. Both vision theorists
and developmentalists find this requirement unsuitable for their needs. For
vision theorists, perception in general does not depend on having a lan-
guage. Non-linguistic animals, human and non-human, are thought to per-
ceive objects. Developmentalists, too, reject a language prerequisite. Their
experiments with newborns would be pointless if the appreciation of objects
required infants to have relatively sophisticated linguistic skills. Thus, even if
Quine is right about what it is to speak of objects, his criterion of ontological
commitment does not seem to offer an acceptable analysis of the psycholog-
ically relevant notions of “perceiving” or “conceiving” objects. Still, I think
more careful and detailed attention to Quine’s concerns about reference and
ontology can help illuminate the issues.
In fact, in his account of ontology, Quine finds it useful to introduce the no-
tion of a “body,” and this notion matches reasonably well with the intuitive
concept of an “object” found in many psychological studies. According to
Quine (1973), bodies are segregated, bounded areas of the spatial-temporal
environment that display “continuity of displacement, continuity of visual
distortion, (and) continuity of discoloration” (p.55). Thus cars, chairs, and
dachshunds are bodies, while gerrymandered parts and sums of them are not.
Quine claims that our basic ontology is one of bodies, and that our initial and
firmest grip on the whole idea of “ontology” is in connection with our com-
mitment to physical bodies. Also, Quine readily admits that innate disposi-
tions and Gestalt principles of perceptual organization probably make bodies
salient and a major locus of early language acquisition.
Quine goes on to argue that an appreciation of the distinction between
count and mass terms is most important in the context of these ontological
matters. Mass terms, such as “red” or “water,” function differently from the
count terms “car,” “chair,” and “dachshund.” Although “red” and “water” do
apply to bounded areas of material stuff, the terms do not individuate or set
the boundaries of the stuff to which they apply. The terms do not themselves
divide into units the areas of the environment they refer to or denote. “Red”
applies to parts and sums of parts of a red car, as well as to the car’s exterior sur-
face as a whole. In contrast, “car” denotes only individual cars, not their parts
or sums. This distinction between count and mass terms, however, depends
on use. Quine notes, for example, that the word “lamb” in “Mary had a little
lamb,” functions as a count term on the usual reading, but as a mass term
when reporting Mary’s dinner.
Similarly, “body” may have both count and mass uses, exhibited perhaps in
the distinction between “more bodies” and “more body.” As a count term
“body” individuates; it sets boundaries as to where one body leaves off and
another begins. As a mass term it only attributes a property of “body-ness” to
regions otherwise picked out or delineated.
Not surprisingly, count concepts are the ones needed for counting. All counting
presupposes a unit to be counted, and for this it is necessary to divide
reference. Mass terms do not provide units, since they themselves do not in-
dividuate among the parts of space-time they describe. Count terms, though,
may denote “spurious” objects as well as “real” ones. “Left half of a radio vol-
ume control knob,” “fender dent,” and terms denoting gerrymandered spatial
or temporal parts of chairs and dachshunds also divide reference, yet the items
they pick out do not meet intuitive criteria for being a “body.” Nevertheless,
there is no problem in principle counting the spurious as well as the real.6
Quine realizes that his own rough and ready characterization of “body” is
vague and its employment context sensitive. His characterization leaves con-
siderable room for differences in interpretation and application, and provides
no theoretical grounds for settling many of the problem cases earlier canvassed.
But for his own purposes, Quine sees little reason to formulate a very precise
definition. It is enough for him that cars, chairs, and dogs are representative
examples of our untutored, everyday notion of an object. They serve as un-
contested touchstones for what in the end is Quine’s challenge to the very idea
of there being any such thing as a referentially fixed, determinate ontology.
Perceiving Bodies Per se
Suppose, for ease of exposition, we identify the psychological notion of an
“object” with Quine’s idea of a “body.” Nothing much hinges on adopting his
characterization versus most of the others found in the vision or develop-
mental literature. Suppose, too, we suspend worries about the linguistic focus
of Quine’s project, and consider what it might mean or entail for a linguisti-
cally competent subject to perceive bodies. More does hinge on this assump-
tion, but in order not to get bogged down, I wish to bracket my discussion
here from the usual debates over the relationship between language and con-
cepts. For me the linguistic focus serves to concretize the issues in more man-
ageable terms. Those who cannot abide the approach may substitute their
favored view of concepts or categories or internal representations, where I
talk of words and what they denote. (Further comfort may also be found in
the next section, “Object Perception Redux.”)
With language-speaking organisms, applying or perhaps being disposed to
apply correctly a “body”-denoting term might seem the simplest and most
obvious test of body perception. So formulated, however, the criterion is both
ambiguous and problematic. First, the term “body” has both mass and count
uses; the latter individuates, the former does not. It does not divide the world
into entities. Second, words like “car,” “chair,” and “dachshund” apply to bod-
ies as much as the word “body” itself does. Moreover, we may know or know
how to use these words without either having learned the word “body” or
having any other term available meant to acknowledge that a given item falls
into a category whose membership includes all and only bodies per se.
These considerations, in turn, raise questions about the role any explicit
representation of something’s being a “body” might play in object percep-
tion. After all, the count concept “body” is just one way to label or describe re-
gions of space-time, and the need for and specific function of it remains
unclear. Seldom, for example, does the task at hand require determining if or
how many bodies per se are in the offing. More usually the task at hand is to
perceive the kinds and properties of the bodies present. We need to know if
what is in front of us is edible, sit-able, lift-able, weight bearing, alive, prey,
predator, car, chair, or dachshund, not if it is a body per se.
Perhaps if the notion of a “body” actually plays a significant role in object
perception, it is because such a concept is implicitly, rather than explicitly, in-
volved in determining the kinds and properties of things in our environment.
Perceiving cars, predators, edibles, and sit-ables must somehow require or
reflect an appreciation of them as bodies. But how is this claim to be understood
empirically? What does it mean for the visual system to implicitly take
something as a body?
Various of the problems explored above repeat themselves. Cars, predators,
some edibles and some sit-ables are instances of bodies. It follows, then, that
in perceiving them as such the visual system “marks out the boundaries” of
whatever spatial regions are so described. This is all pretty tautologous. It says
little more than that the processes of the visual system enable perceivers to
discriminate those regions of the material world that contain cars, predators,
edibles, and sit-ables from those that do not. What does not seem to follow is
that in order for the visual system to make these discriminations it must first
determine, represent, or otherwise render the information that there is an in-
stance of the (count) property “body” present. And surely it does not follow
that responding differentially to cars, predators, edibles and sit-ables entails
an appreciation of the fact that the regions so discerned share membership in
the class of bounded items that exhibit continuity of displacement, continu-
ity of visual distortion, and continuity of discoloration.
Object Perception Redux
To some it may seem that I have changed, misunderstood, or avoided the issue
of object perception as they conceive it. For them, the task of object perception
concerns the visual system’s encoding space-time regions (STR’s) as
bodies and creating/assigning various descriptions or descriptors to them.
Now granted that the level of analysis may be different, I think the issues
raised above more or less carry over to this task specification as well. To ap-
preciate this, make the following terminological replacements:
“encodes STR x as a body” for “perceives that the STR x is a body”
“assigns to STR x the description # or a # descriptor” for “perceives that the
STR x is or is a #”.
Nothing here assumes that descriptions or descriptors are previously es-
tablished categories or that they are restricted to basic level shape categories
or that segmentation cues only operate for familiar shapes. Nor does it pre-
clude that the encodings and assignings are the work of relatively autonomous
perceptual mechanisms. Also note that “functional” property descriptors such
as weight-bearing, sit-able, and (in)edible and various “non-functional” prop-
erty descriptors such as size, shape, texture, color, and composition are ap-
plicable to both bodies and non-bodies (e.g. shadows, fender dents, and
driveways). And, as mentioned earlier, whatever analysis of the layout the vi-
sual system makes available does enable us to describe verbally and respond
appropriately to “non” and “spurious” objects along with the “real.”
That the visual system provides or affords information that guides the way
we navigate, act, and react to the environment goes without saying. The is-
sue, rather, is to understand better the sense in which the visual system must
encode regions as bodies in order to do so. If in the end all the claim amounts
to is that descriptors and descriptions are applied to regions of material space,
there is little to debate. There does, of course, remain much to find out and de-
bate about how the visual system actually accomplishes these tasks.
Developmental Considerations
From a developmental perspective, it may seem there is more reason to suppose
that some body-representation per se may be employed in perception
and/or conception. After all, newborns do not divide the world into cars,
chairs, and dachshunds, and their appreciation of which things are edible,
sit-able, lift-able, and alive may not amount to much either. So it would not
be surprising if infants start out lumping all these different types of things
into one big diffuse category, that of “bodies” (Shipley & Shepperson 1990).
But again, there are problems understanding the exact content of such a
claim. For example, it is generally agreed that quite early on in life infants can
separate figure from ground and can distinguish spatially continuous bits of
matter from disconnected pieces of stuff. They also respond differently to
portions of their environment that move together and those that do not. And
their appreciation of the layout, such as it is, can guide their activities. Does
this mean, though, that they have and make use of a label (concept or repre-
sentation) that serves to connote or denote all the things that are bodies or
have the property “body?” If not, does it at least mean that their visual system
makes implicit use of such a representation in the course of providing the in-
fant information about the environment? Again, and for reasons similar to
those canvassed in the previous section, I am not convinced a positive answer
to either of these questions is logically or empirically required.
Now some theorists seem content to let the evidence speak for itself. They
are willing simply to call instances of figure/ground discrimination, Gestalt
grouping phenomena, or perceptual tracking activities, whether by infants or
adults, instances of or proof of object perception. And I have no qualms with
this practice, as long as the nature and extent of the claims are kept in mind.
For other theorists, perceptual discrimination, grouping, and tracking are
not taken to be sufficient for the attribution of object perception. The infant or
adult must in some sense be cognizant of or represent the space-time regions
isolated, grouped, or tracked as bodies. For them finding a satisfactory un-
derstanding or characterization of this richer demand remains an issue.
Past and Future Contingencies
Bodies have both spatial and temporal dimensions. Cars, dachshunds, edibles,
and sit-ables not only occupy areas of space, they also have settled pasts and
futures rife with threats and promises. Discriminating among regions of space
that are so described, however, neither requires nor presupposes having
knowledge of such life histories and prospects. It is one thing to be able to per-
ceive correctly a wide variety of cars, dachshunds, edibles, and sit-ables under
ideal and less than ideal conditions. It is another to have perceptual con-
stancy, to appreciate the sameness of particular shapes, sizes, textures, and
colors when viewed under variable lighting and from different angles and
distances. It is another to be able to perceive these shapes, sizes, textures, and
colors when parts of the regions are occluded from view. And it is another still
to have a firm grip or conviction about how things will be and look at much
later dates.
Cars, dachshunds, edibles, sit-ables, indeed material substances in general,
change in shape, size, color, coherence, and consistency as they age and in-
teract with the world. Some of the changes can be reliably predicted, many
cannot, and the best scientifically sanctioned predictions will not always
turn out as expected. Given the vagaries of life histories, we are thus much
more likely to be accurate about how a currently observable temporal slice of
our environment might look under certain different viewing conditions than
about how future temporal slices will appear. Nonetheless, perhaps the most
basic, general, and reliable prediction we make about our environment is that
things do not change or go out of existence without cause. In addition, we as-
sume that neither mere spatial displacement nor our observing and failing to
observe the world are causes of physical change or annihilation.
Appreciation of persistence over time, independent of displacement and
observation, is at times referred to as “object permanence.” The term can be
somewhat misleading in that regions to which mass terms apply (e.g. red or
water containing places), “non” objects (e.g. shadows, fender dents, or drive-
ways) and “spurious” objects, likewise, do not change or go out of existence
without cause. And they, too, are presumed by us to carry on their lives inde-
pendent of our observation. So there is nothing special about the domain
of bodies or “real” objects on this score. Undeniably, temporal slices of real,
non, and spurious objects do go out of existence when their time is up and
they are no longer observed, but this is by definition, not by cause. Be that as
it may, an appreciation of such persistence over time is what many theorists
mean by perceiving or conceiving of the world as composed of objects.
The Object Concept
Piaget argued that an infant’s conception of reality is much different from our
own. The newborn does not distinguish the world of experience from experi-
ence of the world. Conception of a world with enduring material objects
existing independently of oneself comes later and requires construction.
In addition, Piaget claimed that an infant’s concept of reality is initially
constructed in terms of his or her own actions and the immediate environ-
mental effects or reactions they precipitate.
To support these contentions Piaget devised a variety of ingenious experi-
ments intended to show that infants’ responses to spatial and temporal trans-
formations are not at all like those of older children and adults. Initially, Piaget
contends, babies do not expect objects to persist over time and place. Hence,
they do not, as we do, search persistently for hidden objects, nor do they have
the same expectations about what happens when things move behind and
emerge from occlusions. For newborns, out of sight is not only out of mind,
it is out of existence. Or put more accurately, newborns do not have a sub-
stantial conception of existence in and for which these distinctions make
good sense. Eventually infants do begin to search intentionally for missing
objects, but the searches are guided more by past patterns of interaction than
by the available evidence. Infants expect to find an object at the place they
found it before, rather than where they have just observed it being placed.
Piaget’s pioneering work and theories set the stage for much contemporary
discussion of the development of object perception and conception. A spate
of recent experiments claim to demonstrate that Piaget may have under-
estimated young babies’ prowess. Infants, it is maintained, do seem surprised
when things hidden behind a screen are not there when the obstructing
screen is removed. They seem to share with us some biases about the paths
moving things will take, and they have some expectations about the full con-
tours of simple shapes whose parts are occluded from view. In addition, their
searches are not guided solely by past success but may take into account new
conflicting evidence.
Now I have no desire to criticize these experiments, although issues of de-
sign and data have been raised (Haith and Benson 1998). My concern is how
best to understand their implications and import. Earlier I raised questions
about the proper interpretation and role a notion of a body per se might play
in conceptual activities or in the encoding activities of the visual system.
These questions and qualms, of course, do not preclude our having expecta-
tions. And I am willing to accept that the recent experimental evidence sug-
gests infants may have richer sets of expectations at an earlier age than many,
including Piaget, may have thought. Less clear is what these findings say
about the perception and conception of objects.
In The Child’s Conception of Reality, for example, Piaget allows that infants
may have crude expectations of constancy, occlusion completion, and per-
sistence that they use to accommodate their experiences. They may briefly
search for the hidden, be surprised when something disappears without
cause, and have wired-in visual pursuit schema for tracking movement. What
Piaget denies is that these expectations and perceptual strategies extend
much beyond the here and now. According to Piaget, infants do not have cog-
nitively useful representations of the structures and patterns of events in the
environment that enable them to place items in our ordinary “scheme of
things”—a stable world with its own independent past and future. But Piaget
argues, an appreciation of permanence and persistence restricted largely to
the here and now is not sufficient for the attribution of the object concept.
A more enduring spatial/temporal framework is required (See Sugarman
1987). Is Piaget correct, though, about the real nature of objecthood, and are
his more demanding criteria for attributing object perception and concep-
tion warranted?
Knowing how things can or will behave in and out of our presence makes
up a large part of what we each know about the world. Some of this knowl-
edge may be genetically inherited, some is readily attainable and common-
place (e.g. dropped objects tend to fall), most comes only with a good deal of
experience or learning, and much remains exclusively within the purview of
scientists or other experts (e.g. an accurate theoretical conception of space
and time). Moreover, there is no plausible bound on what there is to know
(what correct expectations we may have) about the possible or actual behav-
ior of the animate or inanimate world and the events that can or will take
place. It goes without saying that an infant’s conception and understanding
of the world is different from and impoverished compared to our own.
When, though, in the course of this development do infants first appreciate
a world of objects? At what age or stage does a child first perceive or conceive
of things as bodies? The analysis found in these pages suggests that this ques-
tion may not be clear enough to answer or answer univocally. For neither
everyday practice nor current psychological theory seems in a position to
sanction a single privileged way to understand the claim. Furthermore, I am
not sure what is at stake in settling on one. Is there, for example, a substantive
difference between the claim that at a particular age infants do not perceive
and conceive of objects and the alternative claim that infants do have an ap-
preciation of objecthood, only their expectations and biases about the course
of events are quite different from our own? But surely if infants’ expectations
and biases (or lack thereof) are radically different from our own, they cannot
be said to have our concept of an object. But what specifically is “our” concept
of an object, and what role does it play in perception and conception?
Identity of Kind and Strict Identity
Some of the controversy over object perception and conception is, I think,
the result of conflating issues of constancy, permanence, and expectations
with claims of identity. To determine that various space-time regions are
or are segments of the “constantly” same/identical car, dachshund, edible, or
sit-able is distinct from being able to appreciate the constancy of their sizes,
shapes, textures, and even material compositions. Nor does it amount to hav-
ing expectations about how such regions will look from other vantage points
or when occlusions are removed. Judgments of identity require a determina-
tion of where a particular car, dachshund, edible or sit-able starts up and
where it ends off. Identity involves linking segments, not merely describing
them. It is a judgment that a space-time region here and there, before and
now, go together in ways appropriate to sustain a claim that it is the very same
car, dachshund, edible, or sit-able with which we are dealing.
Identity judgments of this sort often do assume sameness of body or bodily
stuff in that in most contexts spatial-temporal regions are usually not said to
constitute segments of the very same car, dachshund, edible, or sit-able unless
their material makeup traces a more or less continuous path. But obviously
the reverse does not hold. A set of space-time regions may continue as
the same body or bodily stuff but lose its kind-identity. The same body is no
longer identifiable as a car when compressed into a lump of metal at the junk
yard. And even if this lump is then reconstituted as a car, the resulting vehicle
is unlikely to be considered the identical car as its pre-crushed embodiment.
Similarly, being shown the pre and post crushed cars, but unaware of their history
of transition, one may readily declare each such space-time region to
be a body, i.e. segregated, bounded matter, perhaps of a particular size, shape,
texture and composition. Yet one may have no idea that these different look-
ing manifestations are actually segments of a single continuous lump of metal.
And confuted expectations of persistence of size, shape, texture and color
may be a main reason for the mistake.
Thus judgments of identity run deep. Appreciation of change, whether expected
or unexpected, entails neither a claim of identity nor one of non-identity.
A cake cut into slices can for some purposes be considered the same
cake, although the transformation into segments may not have been observed
and the resulting spatial array unanticipated. In contrast, an identical-looking
intact substitute confection is not the same cake, although there may be
no visually apparent differences to be discerned. Surprise at finding many
pieces of cake when a screen is lifted, is compatible with judging the now
non-contiguous pieces to constitute a stage in the life of the one cake hidden.
The space-time regions before and after hiding count as segments of the same
cake, relative to one way the term “cake” may be wielded to divide reference.
The situation is similar with the concept “body.” Surprise at finding a dis-
tribution of matter not of the shape, color, or cohesion expected is compat-
ible with a judgment of the identity of the constituting materials. Likewise,
failure to notice any difference in bodily appearance between space-time re-
gions is compatible with a denial that the regions so observed are parts of one
and the same body. Body-identity is to be understood in terms of an evalua-
tion of identity over space and time with respect to some particular individu-
ating notion of a “body.”
Identity, then, is a more abstract notion than phenomenal or physical in-
distinguishability. And for the most part, we get along on vague if plausible
intuitions of sameness or difference of identity adjusted to context, salience,
and need. If pressed for something firmer or fixed, we usually soon find out
we have great difficulty coming up with criteria of identity in anything that
approaches necessary and sufficient conditions. For example, is the car at
hand, the same old car totally refurbished, or is what exists a new car, given
that all the original parts have been changed? And would or should it matter
if the (re)building took place in a day, not over a decade? Alternatively, might
it even make sense to think that the pre and post crushed cars previously
alluded to are really temporal slices of one “transformed” car, since all the
materials are the same? Centuries of philosophical puzzles about personal
identity, the identity of a ship completely rebuilt one plank at a time, the
metamorphosis of butterflies, and a mind-boggling array of cases of object
fission or fusion serve as further warning of the problems to be faced.
Another source of confusion in discussions of object perception and con-
ception is the failure to keep in mind a distinction between two different
kinds of identity judgments. The claim that a space/time region a and an-
other space/time region b belong to one and the same car, dachshund, edible,
sit-able, or body per se differs from the claim that a = b. The former says that
a and b are parts of the same whole, relative to some way of individuating
which wholes are to count. The latter says that the space/time region picked
out by a and that picked out by b is the very same one. Thus, consider the
much cited identity: the Evening Star = the Morning Star = Venus. This iden-
tity is not to be understood as a claim that certain evening spatial/temporal
slices of the heavens and certain morning spatial/temporal slices are parts of
the planet Venus. Instead, the claim is that the entity picked out by each of
the three expressions is the exact same totality. Our use of “star” and “planet”
to individuate and divide reference may play a role in fixing the reference of
these labels, but the identity itself is not relative to either concept. Numerical
identity is not identity with respect to an individuating category. In general,
x = y if and only if the objects referred to by the names, variables, or other singular
terms are identical.
Neither part/whole nor numerical identity, however, simply inheres in Nature
and the course of events. Quine, indeed, questions wherein the empirical
content of identity claims is to be found other than in our use of general
terms to divide reference and singular terms to name. For Quine, reification
or commitment to a world of objects amounts to no more. It also demands no
less, since what makes an entity the entity it is, is its identity. The linguistic fo-
cus of Quine’s account of ontology and ontological commitment lies in this
understanding of identity and reification.
Quine’s more radical and controversial ideas lie elsewhere. They have to do
with his views about language and about how language hooks up with the
world. Quine maintains that there are incompatible ways to assign meanings
and denotations to the terms of our language and no fact of the matter as to
which among a set of observationally adequate assignments is correct. There-
fore, ontology and attributions of ontology are themselves parochial, relative
to the scheme of translation adopted (Quine 1960, 1969). Now this is no place
to explicate, let alone defend, Quine’s theses of indeterminacy of translation,
the inscrutability of reference, and the implications they both have for his
doctrine of ontological relativity. Suffice it to say Quine’s ontological notion
of perceiving and conceiving of objects is more abstract and linguistically fo-
cused than that of most psychologists, including Piaget.
Conclusion
My goal in this paper has not been polemical. I have attempted to sort out a
number of theoretical issues central to discussions of the perception and conception
of objects. As indicated, I believe many of the difficulties result from
unclarities in the questions asked. Much of the controversy, too, lies in the
fact that, matters of clarity aside, quite different questions are being asked.
Hence quite different theoretical and empirical problems are raised. Not surprisingly,
the answers offered are diverse and often incommensurable. At one
extreme, all some theorists seem to mean by “taking something to be an ob-
ject” is that the organism responds differentially to certain discrete pieces of
the environment. For minimalist claims of this sort, the mere phenomena of
figure/ground differentiation, occlusion completion, or perceptual tracking
may fit the bill.
At the other end of the spectrum are the ontological issues that have been
of philosophical concern since ancient times. Here the questions are more
metaphysical, centering on accounts of identity and reification. And the
answers offered have ranged from pinning entity-hood on some notion of
“substance,” the pure stuff in which the “essence” of individual things in-
here, to accounts, like Quine’s, that abjure the whole substance/essence
framework. For Quine, science and other empirical study, not metaphysics,
informs us about what there is and what is identical with what. Ontological
commitment is reflected in how we group, categorize, and talk about our
world. In the case of talk, reification shows itself primarily in the distinc-
tions language draws between such statements as (i) “Something is red and
something is a car” versus “Something is a red car,” and (ii) “That (a) is a
car and that (b) is a car” versus “That and that are segments of the same car,”
and (iii) “That segment a and that segment b are segments of one car” versus
“a is identical to b.” These distinctions get reflected in symbolic logic notation
as: (i) “(∃x)(∃y)(Rx & Cy)” versus “(∃x)(Rx & Cx),” and (ii) “(Ca & Cb)” versus
“(∃x)(Cx & Sa,b,x),” and (iii) “(∃x)(Cx & Sa,b,x)” versus “a = b.” Thus for
Quine, ontological commitment is keyed to the use of variables, names, and
other singular referring expressions.7
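The three contrasts can be set side by side in standard quantifier notation. What follows is only a sketch of the formulas given in the text. The glosses are assumptions read off the English examples: R for "is red," C for "is a car," and S for the three-place relation "a and b are segments of x."

```latex
\begin{align*}
\text{(i)}\quad   & (\exists x)(\exists y)(Rx \land Cy) && \text{versus} && (\exists x)(Rx \land Cx) \\
\text{(ii)}\quad  & Ca \land Cb                         && \text{versus} && (\exists x)(Cx \land Sabx) \\
\text{(iii)}\quad & (\exists x)(Cx \land Sabx)          && \text{versus} && a = b
\end{align*}
```

On this reading, the right-hand formula in (i) and (ii) is the one that requires a single value of the variable x to satisfy more than one condition at once, while (iii) contrasts sharing a whole with strict identity.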
Although my goal in this paper has been expository not polemical, I think
there are some issues the exposition does serve to highlight. Among these are:
1. Questions of constancy, occlusion completion, permanence, and identity
are not peculiar to the perception of bodies or the properties of bodies per se.
One can raise, I think with some profit, the same questions about non or spu-
rious objects.
2. Mapping the course of development of vision and cognition from birth
thereon has intrinsic interest. Attempting to determine when the concept
of an “object” makes its first appearance, founders on the fact that there is no
unique object concept sanctioned either by ordinary use or present scien-
tific theory.
3. Differential responses and manifestations of expectations met or frus-
trated are important tools for studying perception and conception. Nothing
said in this paper is meant to decry or challenge their usefulness. But they can
only take us so far. When it comes to richer, more abstract notions of “object,”
“identity,” and “reification,” whether those of Piaget, Quine, or those cham-
pioned by other theorists, they may not be able to take us far enough.
Acknowledgments
I wish to thank Sidney Morgenbesser, David Rosenthal, and the editors of this
volume for comments and helpful criticism.
Notes
1. Richards (1988), p. 17.
2. Related problems are involved in attempts to specify formally the notion “object
part.” It is not possible in this paper to discuss explicitly the complications this issue
raises. For a non-technical account of the idea of “object part” in theories of vision, see
Hoffman (1998).
3. Whether holes, perforations, rainbows, clouds, molecules, or atoms are allowed in
will depend on how one understands the agreed on restriction to spatial objects and to
unresolved, if not unresolvable, issues about how and where to draw the line between
the observable and the non-observable.
4. In fact, in one sense of the word “see,” at any given moment we can only see a tem-
poral slice of an object.
5. In some of the literature the supposed real objects are said to be “units” or “things”
or “wholes.” Whatever the difference in terminology, the problems to be considered
remain much the same.
6. The notions “object files” and “object file counters” have gained some prominence
in recent work in perception and cognition (see, for example, Scholl and Leslie 1999).
Space limitations prevent my giving this work the specific, in-depth treatment it deserves.
It is enough to note that this approach does not abnegate the need for schemes
to divide reference; rather, it is to be understood as a proposal about what the scheme
and units may be in some cases. There is a vast and growing body of research purport-
ing to show that very young infants can count. Elsewhere, I have expressed reser-
vations about the claim that these studies demonstrate that the concept of number
is innate (Schwartz 1995). Accumulating evidence also seems to indicate that much of
the experimental data on infant “number” behavior may be explained in terms of in-
fants having an appreciation of amounts (e.g. area or volume) rather than an apprecia-
tion of cardinality (Mix et al., 2002). This is significant for our concerns in that such
judgments of sameness and difference of amounts may presuppose only a rudimentary
mastery of mass terms or concepts rather than a need for count categories to divide reference
(Schwartz 1999). In any case, it should be clear that full-fledged counting,
whether counting cars, chairs, and dachshunds, or simply bodies (i.e. objects) does re-
quire count labels or concepts to provide units.
7. My sympathies lie with Quine in rejecting substance/essence metaphysics in either
its old or newer guises. At the same time, I think it important in the study of perception
and cognition to be less language-oriented and to consider the role other forms of
symbolization may play in informing thought and guiding behavior. In this, along
with unease about Quine’s privileging physics, I am more at home with Nelson Good-
man’s (1968 and 1978) constructivist views (Schwartz 1996, 2000).
References
Goodman, N. (1968). Languages of Art. Indianapolis: Hackett Publishing.
Goodman, N. (1978). Ways of Worldmaking. Indianapolis: Hackett Publishing.
Haith, M. and J. Benson (1998). “Infant cognition.” In W. Damon (ed.), Handbook of
Child Psychology, 5th Edition. New York: John Wiley.
Hoffman, D. (1998). Visual Intelligence. New York: W. W. Norton.
Mix, K., J. Huttenlocher, and S. Levine (2002). Quantitative Development in Infancy and
Early Childhood. Oxford: Oxford University Press.
Piaget, J. (1954). The Construction of Reality in the Child. New York: Ballantine Books.
Quine, W. V. (1953). From a Logical Point of View. Cambridge: Harvard University Press.
———. (1960). Word and Object. New York: John Wiley.
———. (1969). Ontological Relativity and Other Essays. New York: Columbia University
Press.
———. (1973). The Roots of Reference. La Salle, IL: Open Court.
Richards, W. (1988). “Image interpretation: Information at contours.” In W. Richards
(ed.), Natural Computation. Cambridge: MIT Press.
Scholl, B. and A. M. Leslie (1999). “Explaining the infant’s object concept: Beyond the
perception/cognition dichotomy.” In E. Lepore and Z. Pylyshyn (eds.), What is Cognitive
Science? Oxford: Blackwell.
Schwartz, R. (1995). “Is mathematical competence innate?” Philosophy of Science 62,
227–240.
———. (1996). “Symbols and thought.” Synthese 106, 399–407.
———. (1999). “Counts, amounts and quantities.” Paper presented at Society for Research
in Child Development. Albuquerque, New Mexico.
———. (2000). “Starting from scratch: Making worlds.” Erkenntnis, 151–159.
Shipley, E. and B. Shepperson (1990). “Countable entities: Developmental changes.”
Cognition 34, 109–136.
Sugarman, S. (1987). Piaget’s Construction of the Child’s Reality. Cambridge: Cambridge
University Press.
Prescript 13
This essay was published with a preface and with commentaries by Alan
Gilchrist, Paul Whittle, and Richard Brown. They and I were members of a
project on perception organized and underwritten by the Center for Interdis-
ciplinary Research (ZiF) at the University of Bielefeld. The preface describes
the origins of the work and provides context for its particular focus and line
of argument. The underlying issues and debates come up over and over again,
in other articles and commentaries in the volume from which chapter 13
is taken. (See especially R. Mausfeld, “The Dual Coding of Colour” and responses
in R. Mausfeld and D. Heyer (eds.) (2003) Colour Perception: Connecting
the Mind to the World. Oxford: Oxford University Press, 381–486.)
13 Avoiding Errors About Error
Preface
This study began in collaboration with Alan Gilchrist. Alan was working on a
book on lightness perception. He was developing a new model, one based, in
no small part, on a notion of “error.” Alan’s project, however, met resistance
from various visual scientists in the ZiF group. A major reason was their un-
willingness to countenance Alan’s appeal to error. Indeed, many maintained
there could be no such thing as error, at least not when it came to perceiving
color. On the face of it, this criticism was puzzling. No one doubted, for ex-
ample, that on occasion we mistakenly put on socks that do not match. More-
over, often those who recoiled at the notion of error were content to talk
about vision being “veridical.” In an effort to clarify issues, Alan and I decided
to write a joint paper on error. We would spell out a sound psychophysical
concept of error and untangle assorted confusions plaguing the group’s dis-
cussions. Our collaboration began with my proposing alternative ways to
specify a precise notion of error and Alan challenging the suitability of my
formulations. In the end, none of the options I offered met with Alan’s ap-
proval, and our joint enterprise was abandoned. I, then, pursued the topic
on my own.
My aim was neither to put forth nor defend any particular account of er-
ror. Instead, I wished to delineate the space of options available and char-
acterize, in a very general way, the advantages and difficulties facing each.
I came to believe, in fact, that there was room in the study of both achro-
matic and chromatic color for alternative accounts of error, each perhaps
useful in different contexts and for different tasks. I became convinced, how-
ever, that my proposed rapprochement was being thwarted by unexpressed
metaphysical/ontological assumptions that both sides were bringing to the
table. So what started for me as a technical problem in psychophysics led
back to longstanding controversies in philosophy about the nature or essence
of color.
At the heart of many of these older philosophical debates, and most of the
current ones (Byrne and Hilbert 1997), is the goal of finding out what colors
really are. Settling this issue is thought to have important implications. With-
out an idea of what colors really are, we do not know what it means for color
experiences and judgments to hook up to the world, to correspond to reality.
At the same time, only with respect to a standard or norm of correctness does
the idea of error itself make clear sense.
The following study of error in achromatic color perception casts doubt on
the very idea of a unique essence for color. There are different ways to get
things wrong, along with alternative conceptions of what it is to get things
right. I also see no substantive grounds for assuming that any one, or only
one, of these conceptions specifies what colors really are.
Introduction
That we make errors in perception seems all too obvious. Less obvious is that
we are often mistaken about the nature of perceptual error. A major reason
for this latter confusion is failure to pay proper attention to the fact that er-
ror is a relative matter—relative to an understanding or specification of
what it is to get things right. Independent of a standard of correctness,
claims of error are otiose. This chapter focuses on accounts of error in the per-
ception of achromatic colors, that is, the perception of white, black, and the
grays. These “colors” are said to lack hue; they constitute what is known as
the “gray-scale.”1
As investigation will show, the idea of perceptual error is often understood
in different and conflicting ways, and there is no reason to assume that one
account is privileged. Moreover, there is reason to treat various purported
cases of error not as error, but as discordances among competing ways of or-
ganizing and ordering our world. Until near the end of this chapter, such
qualms will be kept in abeyance. If along the way use of the term “error” jars
intuitions, consider it a technical term of service in psychophysics. This may
not be far from the position it is best to adopt, in any case.
Terminology
Not all light striking the surface of an object is reflected. Black surfaces reflect
very little, white surfaces almost all, and gray ones, varying amounts in be-
tween. The ratio of reflected light to the incident light is called “reflectance.”
“Lightness” and “lightness perception” are the terms used to talk about the
experiential correlates of surface reflectance, our experience of the gray scale.
Lightness constancy is the ability to perceive a surface has the same lightness
when viewed under different conditions. (For technical details, see Wyszecki
and Stiles 1982, and the glossary of Gilchrist 1994.)
Anyone perusing an introductory psychology text will probably run into a
demonstration of a popular illusion in achromatic color perception. This
“simultaneous contrast illusion,” as it is called, is easy to duplicate on one’s
own. Take two small squares of paper of the exact same shade of gray, place
one on a black background and the other on a white background. Under these
conditions the squares do not look alike. The square on the black background
appears lighter than the one on the white surround. Thus our perception of
lightness is said to be in error. Lightness constancy fails. Two objects of physi-
cally identical material do not look the same; they do not match perceptually.
Matching tasks are the preferred method for studying errors in lightness con-
stancy. A standard paradigm is to have a subject select or adjust the reflectance
of a surface viewed in good light so as to match a given target surface. The tar-
get may be viewed in shadow, against a special background, or under some
other condition of experimental interest. The subject’s matching judgments
are then compared with the physical reflectance properties of the surfaces
(for details and variations on the paradigm, see Wyszecki and Stiles 1982).
To simplify discussion of the logic of these studies and the ideas of ‘error’
employed, it will be helpful to introduce some notational abbreviations:
(1) x, y, z . . . : are surfaces having uniform, physically defined reflectance
values, x, y, z . . . ;
(2) x = y: if and only if the reflectance values of the surfaces are the same;
(3) Ci . . . Cn: are viewing conditions (i.e. lighting, background, distance, and
angle of regard);
(4) Ci = Cj: if, and only if, the viewing conditions are the same;
(5) Cix: is the perceived lightness of a given surface of reflectance, x, under
viewing condition, Ci;
(6) Cix = Ciy: if, and only if, the subject judges them to be the same or to
match perceptually.2
Reflectance errors
The most straightforward notion of error found in lightness constancy studies
is specified with respect to reflectance. For example, Gilchrist et al. (1995)
offer this “precise definition of a lightness error: any difference between the
physical reflectance of the target surface and the physical reflectance of the
perceptually matching surface.” This sort of error will be called “R-error.”
Thus, S makes an R-error, if x ≠ y and S judges Cix = Cjy, or x = y and S judges
Cix ≠ Cjy. Notice that this definition of R-error is completely general; there are
no restrictions on the viewing conditions (see Gilchrist et al. 1999, p. 809).3
The conditions Ci and Cj may be the same or vastly different, and one or both
may be conditions no one would think reasonable for evaluating or compar-
ing lightness. They could be conditions in which lightness discrimination is
essentially absent. Also, Ci may be daylight with the target on a neutral gray,
while Cj is colored light and the target resting on a glowing self-luminant
surface. In all cases, whether the viewing conditions are ideal or perverse,
alike or very dissimilar, S is mistaken if S judges x and y to match when they
differ in reflectance, or not to match when they are of the same reflectance.
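The definition of R-error just given can be written out as a small truth-functional check. This is only an illustrative sketch, not part of the experimental paradigm: the function name `is_r_error` and the use of plain numbers for reflectance values are assumptions made for the example.

```python
def is_r_error(x: float, y: float, judged_match: bool) -> bool:
    """Return True if S makes an R-error on surfaces of reflectance x and y.

    x, y          -- physical reflectance values (hypothetical numeric encoding)
    judged_match  -- True if S judges the two surfaces to match perceptually
    """
    physically_same = (x == y)
    # R-error: S judges a match where reflectances differ,
    # or judges a difference where reflectances are the same.
    return judged_match != physically_same


# Same gray paper on black and white backgrounds, judged not to match:
print(is_r_error(0.40, 0.40, judged_match=False))
# Different reflectances, judged to match:
print(is_r_error(0.40, 0.55, judged_match=True))
```

The viewing conditions Ci and Cj do not appear among the function's parameters. That omission mirrors the point in the text: whether the conditions are ideal or perverse makes no difference to whether a judgment counts as an R-error.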
It is possible to extend the notion of reflectance error to include aspects
of ordering. S could be asked to judge if x looks lighter than y. Thus, suppose
x > y, and S judges they do not match. S has not made an ordinary R-error. If,
however, S judges y is lighter than x, then S makes an ordering error with
respect to reflectance. One could attempt to push issues further by placing
richer demands on S’s evaluations. S might be asked to judge if x is twice the
lightness of y, or if the difference between x and y is equal to that between y and
z. S might then be claimed to make errors if the judgments do not correspond
to the simple ratios or differences of the physical reflectances. Of course, many
will balk at considering such discrepancies perceptual error, since they are only
to be expected. It is a general feature of sensory systems that as intensities of
stimuli increase, differences in intensities are harder to discern. Achromatic
color perception is no exception. There is a compression of the scale of sub-
jective lightness experience as the intensity of the reflected light increases.
Decisions about the treatment of discriminatory thresholds and scale com-
pression, however, intrude at the very start, with the initial, austere notion
of R-error. Discrimination of reflectance is not perfect. No instrument, let
alone a human perceiver, can detect every physical difference in reflectance.
Still, one could hold firm and maintain any failure to discriminate between
two surfaces of different reflectance is an R-error. Another option is to de-
fine R-error in terms of a spread of reflectance values rather than a unique
point. On this account, failure to perceive a difference between x and y is not
an R-error, if the difference in reflectance is less than a specified threshold.
Again, in spelling out criteria for error there is some leeway. It will simplify
matters to assume for now that a satisfactory decision has been made.4
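The second option, defining R-error over a spread of reflectance values, might be sketched as follows. The function name and the particular threshold value are hypothetical, chosen only for illustration; nothing in the text fixes what the threshold should be.

```python
def is_r_error_with_threshold(x: float, y: float, judged_match: bool,
                              threshold: float = 0.02) -> bool:
    """R-error relative to a discrimination threshold (assumed value).

    A judged match is an error only if the reflectances differ by at
    least `threshold`; a judged difference is an error only if the
    reflectances are in fact the same.
    """
    if judged_match:
        return abs(x - y) >= threshold
    return x == y


# Failing to discriminate a sub-threshold difference is not an R-error:
print(is_r_error_with_threshold(0.40, 0.41, judged_match=True))
# Discriminating that same tiny difference is not an R-error either:
print(is_r_error_with_threshold(0.40, 0.41, judged_match=False))
```

On this sketch a subject who does discriminate a sub-threshold difference is simply correct, since the surfaces really do differ in reflectance.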
For our concerns, too, it will make things easier to limit consideration to
judgments of matching and not to worry about perceptual errors involving
judgments of order. The structure of matching-type errors has quite enough
complexity. For example, where x = y ≠ z, and S judges Cix ≠ Cjy, S makes an
R-error. Nevertheless, if S judges Cix ≠ Ckz and Cjy ≠ Ckz, S is free of R-error.
That Ci and Cj lead to R-errors in some cases is perfectly compatible with these
viewing conditions yielding accurate matching judgments in other compar-
ison tasks involving x and y. Or consider a set-up, Cl, involving colored or ul-
traviolet light. Discrimination between various targets under Cl may be quite
good; so there is no R-error. None the less, in this light the items may not look
very much like they do in normal daylight against a neutral background.
Other results, perhaps more in conflict with ideas about the nature of light-
ness constancy, also follow from the definition of R-error. Suppose x and y
differ in reflectance by a minuscule amount, well below any plausible dis-
crimination threshold. Put x on a black background, y on a white one, and
view them in daylight. We know, from contrast illusion studies, x will appear
lighter than y. Therefore, they will be discriminated, and there is no R-error.
S’s judgment is not only correct, it is more accurate than when the targets are
both viewed against an ideal neutral background. Finally, we have no hesita-
tion claiming S makes an R-error, if x = y and S judges Cix ≠ Cjy. Less appealing
is the result S gets things right, makes no R-error, if S judges these same x and
y match when the illumination is too poor to discriminate most differences
in reflectance.
Errors of look
To claim that S perceives matters correctly, especially in the latter cases,
will strike many as perverse. The fact S does not make an R-error in such
circumstances seems to point to a flaw in this conception of perceptual error.
Surely, S does not see things properly in the contrast illusion situation or
when the viewing conditions are so deficient that almost everything appears
when the viewing conditions are so deficient that almost everything appears
to have the same lightness. Under illusion provoking or impoverished condi-
tions, although certain matching judgments do jibe with the comparative
reflectance values of the surfaces, the perceptual experiences are not right.
The targets do not look the way they “really” are. Such purported failures of
perception will be called “look-errors,” or L-errors for short.
Intuitions related to L-error underlie various discussions of lightness per-
ception. In particular, it is often thought important, and makes good sense, to
determine which of two perceptual experiences is responsible for an R-error.
In a variety of studies, subjects are shown a target, x, under an experimen-
tal condition of interest, Ci. They are then presented a chart of achromatic
chips from a Munsell (1976) book of colours and asked to choose a chip that
matches x.5 The Munsell chips are presented, not under the experimental
condition but under a condition thought particularly conducive to light-
ness discrimination. This condition, call it CM, is spelled out precisely in the
Munsell book. It includes a specific white illuminant, a specific medium gray
background, etc.
If, in such a test situation, S chooses Munsell chip y, and x ≠ y, S makes an
R-error. There is, however, the tendency to think that the source of the error
can be pinned on the perception of x under Ci. Cix, it is claimed, is not the
right or correct look of x. Cix is an instance of an L-error, and this L-error is
used to explain the R-error. The faulty Cix misleads S to choose a chip, from
the Munsell chart, whose reflectance differs from x.
A comparable distinction between L-error perceptions and those free of
L-error shows up elsewhere in lightness constancy discussions. It is common-
place to be told that certain viewing conditions prevent subjects from seeing
things with their true colour. “Failures of lightness constancy that occur in
the presence of different levels of illumination take a fundamental form. Sur-
faces in the brightly illuminated regions tend to appear lighter gray than they
really are and surfaces in shadowed regions tend to appear darker gray than
they really are” (Gilchrist et al. 1995).6 True, if Ci and Cj are alike, except that
the illumination in Ci is higher, Cix will look lighter than Cjx. This, though, is
a fact about comparative appearances and says nothing about the looks of
surfaces being as they really are (see Gilchrist et al. 1999, p. 811). Similarly, in
everyday conversation it is assumed that things do not look the way they re-
ally are when the lighting is very dim.
Although backed in this way by intuitions, the idea that achromatic colors
sometimes appear right, and at other times wrong, needs careful explication.
As with all notions of error, to make sense of L-error we must specify an ap-
propriate standard of correctness. With respect to what is an appearance to be
judged incorrect? How are we to understand the claim that something does or
does not appear with its appropriate lightness? What is it for an object to look
to have its true value, to be perceived as it should be? Until these questions are
answered, common intuitions about errors of look lack firm foundations.
One obvious way to settle such matters is to specify that the correct or
“right” look for a surface is the way it appears when viewed under some ideal
condition, CI. There is L-error, then, whenever a target surface looks different
from how it does in this special set-up. Cix looks right, if x = y and Cix = CIy.
Alternatively, Cix is an L-error, if x = y and Cix ≠ CIy.
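A toy sketch may help fix the structure of this definition. Everything in it is an assumption made for illustration: appearances are modeled as integers, the condition names and gain values are invented, and matching is simplified to equality of the modeled appearances. Nothing here is a claim about how looks are actually generated or measured.

```python
# Invented gains (in percent) for three hypothetical viewing conditions.
GAIN_PERCENT = {"ideal": 100, "bright": 150, "shadow": 50}


def look(condition, reflectance):
    """Modeled appearance to S of a target with the given reflectance
    (an integer percentage) under the named viewing condition."""
    return reflectance * GAIN_PERCENT[condition] // 100


def is_l_error(condition, reflectance, ideal="ideal"):
    """An appearance is an L-error if it differs from the look the same
    reflectance has under the chosen ideal condition."""
    return look(condition, reflectance) != look(ideal, reflectance)


print(is_l_error("ideal", 40))   # False: the standard defines the correct look
print(is_l_error("shadow", 40))  # True: the target does not look as it does under the ideal
# In this toy model a brightly lit target of reflectance 40 matches a
# target of reflectance 60 viewed under the ideal condition.
print(look("bright", 40) == look("ideal", 60))  # True
```

The sketch makes plain that the verdict depends entirely on which condition is passed as `ideal`; choosing a different standard changes which appearances count as errors.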
This account of L-error can be used to support those intuitions and distinc-
tions not handled within the conceptual confines of R-error. Suppose, for
example, the assumed ideal viewing condition is the one specified in the
Munsell book, that is, CM = CI. Perception of a surface under this condition
defines its correct look.7 Previously, when x ≠ y and Cix = CMy, there was no
established basis for assigning blame for the R-error. Now, relative to the
choice of CM as standard, there is a justification for pinning the mistake on
one appearance rather than the other. Cix is an L-error.
Choosing a standard also gives purchase on cases where neither of the
samples is under the ideal condition. If x = y, Cix ≠ Cjy and neither Ci nor Cj
are CI, it still is possible to pin the error on one of the perceptions. L-error lies
with the appearance that fails to match the perception of its target reflectance
under CI. If both Cix and Cjy fail to match the perception of the given re-
flectance value under CI, then there is an L-error in both, and the R-error is
due to each.
Intuitions about the true look of a particular target reflectance are given
similar treatment. A surface in shadow does not appear as it should, since its
appearance does not match the way it looks under CI. A target in very bright
light appears lighter than it really is, since it appears lighter than it would
in CI. Or what amounts to the same thing, it matches a target of higher re-
flectance viewed under CI. The accidental success in reflectance judgments
in illusory contrast conditions and in extremely poor illumination can also be
explained. Although S’s matching judgments are not R-errors, the targets do
not have the correct look that goes with their reflectance values. It is an acci-
dent that S makes no R-error, since the perceptions the matching judgments
rely on are themselves L-errors.
Some complications
It is important to keep in mind in stipulating, say, CM as standard, that it is
only the Munsell viewing condition that is being privileged. The definitions
of “correct look” and “L-error” are in no way constrained by the selection of
chips and their associated ordering in the Munsell book. Any target of any re-
flectance can be assigned its correct look relative to the chosen CI. Also, the
experimenter is given no more accurate information about how x looks to S
when the match involves a Munsell chip under CM, than when the matching
judgments S makes do not involve Munsell chips or conditions. Nor can it be
assumed when S judges Cjx = CMy, S is assigning the particular reflectance
value of the Munsell chip y to the target x, rather than assigning the value of
x to the Munsell chip y. By themselves, the definitions do not sanction these
additional claims.8
Nothing said so far challenges the idea that a standard viewing condition,
such as CM, can be chosen, and the look targets have, under this condition,
deemed the right one. Still, setting a standard of correctness in terms of a
designated CI leaves important issues to be resolved. I begin with a problem
that might require technical finessing, although it may not be central to
an account of L-error. Suppose x = y = z, Cix = CIz, Cjy = CIz, but Cix ≠ Cjy.
By definition both Cix and Cjy look correct, there is no L-error in the way ei-
ther appears. Yet they do not match each other, so there is R-error. The pro-
posed link between L-error and R-error, therefore, breaks down. One solution
is to alter the definition of L-error. Another is to assume such matching judg-
ments will not occur frequently enough to bother with. For simplicity I make
this assumption.
A related difficulty cannot be dismissed as readily. When x > y > z, and the
differences straddle threshold borders, subjects will often report CIx = CIy,
CIy = CIz, and CIx ≠ CIz. Since the targets are always under the ideal condition,
they must always look correct. So, again, there is unexplained R-error. What’s
more, the lack of transitivity of matching, even under CI, puts strain on the
very idea a target has a singular, true look. An interesting, little explored,
approach to these kinds of puzzles is to distinguish perceptual matching
(our =) from perceptual identity. Matching is non-transitive, while identity is
transitive. For CIx to be phenomenally identical with CIy, it is not enough
they match each other. They must each match everything the other does
(Goodman 1951; Clark 1993).
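The contrast between matching and identity can be sketched in a few lines. The sketch rests on assumptions that are not in the text: appearances are represented as integers, two appearances match when they differ by no more than a just-noticeable difference `jnd`, and identity is checked relative to a finite `domain` of appearances. The function names are hypothetical.

```python
def matches(a, b, jnd=1):
    """Perceptual matching: the two appearances are indiscriminable."""
    return abs(a - b) <= jnd


def identical(a, b, domain, jnd=1):
    """Perceptual identity: a and b match each other, and each matches
    everything in the domain that the other matches."""
    return matches(a, b, jnd) and all(
        matches(a, c, jnd) == matches(b, c, jnd) for c in domain
    )


looks = [12, 11, 10]  # modeled looks of x, y, z under the ideal condition

# Matching is non-transitive: x matches y, y matches z, x does not match z.
print(matches(12, 11), matches(11, 10), matches(12, 10))  # True True False
# x and y match but are not identical, since y matches z and x does not.
print(identical(12, 11, looks))  # False
```

Because identity is defined as sameness of matching profile over a fixed domain, it is transitive even though matching is not.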
Adopting this analysis of look identity has some nice advantages. It enables
construction of an ordering of perceptual lightness based solely on judg-
ments of matching. Subjects are not required to provide explicit ordering
judgments. It would, however, complicate analysis of L-error to trace out the
implications of employing this account of appearance identity, and I will not
pursue the issue here (see Schwartz 1996). More pressing problems lie ahead.
Solipsism
Suppose x ≠ y, the difference is quite small, and S judges CIx = CIy. Once more
there is R-error with no L-error. Altering the definitions of “look identity” and
“correct look,” though, does not seem the only or easiest way to avert this
anomaly. Weakening the demand giving rise to R-error would seem a simpler
solution. Stipulate that R-error occurs only when the reflectance difference
exceeds a specified range. If the difference between the targets is less than
the threshold, there is no R-error, and hence no need to appeal to L-error to
explain the mistake.
This sort of response can only be taken so far. The problem is the current
definition of L-error is “solipsistic.” The notion of correct look is specified
solely with respect to judgments of how things look to an individual subject
under CI. And this individualistic conception of looking right leads to trouble.
For suppose two surfaces differ enough in reflectance so that under ideal con-
ditions they are easily discriminated by the average perceiver. If S cannot tell
such targets apart under CI, it seems clear S makes an error. But an error of
what kind? There is no problem attributing R-error; S fails to discriminate be-
tween reflectance differences beyond the allowable range. S’s R-error, never-
theless, cannot be attributed to L-error, since it occurs under CI.
Were the deficiencies with S’s judgments confined to small threshold-type
cases, the failure of L-error to underpin R-error might not be very bother-
some. Unfortunately, the issue runs deeper. For all intents and purposes,
S could be “lightness blind.” Under ideal conditions, S might perceive most
achromatic reflectances as the same medium gray, or perceive them as a single
dark gray up to some reflectance value and a single white for higher values.
And if such radical lightness blindness is too farfetched to consider, the basic
solipsistic point can be made assuming only that some people are significantly
deficient in lightness discrimination. The comparable case of color blindness
is well known.
While technical repairs might be sufficient to patch up earlier difficulties,
the present problem is one of principle, requiring a major shift in perspective.
As things stand, a subject’s lightness perception can be vastly deficient, but
things will still be said to look correct. Accordingly, a subject may lack con-
stancy on a grand scale, yet remain L-error free.
It should be noted that these solipsistic difficulties are not due to general
skeptical or philosophical worries about the contents of other minds—wor-
ries over whether we can ever know how things really look subjectively to
someone else. The failure of the lightness deficient to discriminate, where the
rest of us do, is enough to show something amiss in how things look to them.
The situation is not at all like the paradoxical case of spectrum inversion.
With spectrum inversion, subjects make all the discriminations the rest of us
do, but the supposition is that things look differently to them. Lightness deficiency
raises no like issue of an in-principle impossibility of testing.
Abandoning looks?
Does the case of lightness deficiency mean that the notion of a correct look
should be abandoned, and with it the idea of L-type error? Right off, that
would seem an overly hasty conclusion. If claims are limited to normal perceivers,
it might still be possible to say something useful about errors of appearance.
The definition of L-error need not be changed; only its application
is restricted to persons with non-defective vision. The correct look of a surface
for a normal subject, S, is the look it gives S under CI. Correct-look and L-error
remain individualistic notions, that is, specified relative to a given perceiver.
Although, again, there is no need to assume it is possible to determine whether
the subjective experiences of different people are subjectively identical. The
restriction to normal perceivers merely serves to avoid the difficulties posed
by the lightness deficient. It is not meant to resolve, or to depend on, resolu-
tion of inverted spectrum type quandaries.
The initial limitation to the normal sighted does not preclude attributing
some errors of appearance to those with defective vision. Many of the judg-
ments of a lightness deficient perceiver, S, will be R-errors with respect to the
standards set for normal persons. S has matching perceptions where normal perceivers
experience the targets as non-matching. In these cases, it may seem reasonable
to make the minimal claim that S’s appearances cannot both be right.
Then again, it is not clear what is gained by extending the notion of L-error
to the lightness deficient. It is, after all, the pattern of R-errors that is relied on
to determine if a subject’s lightness perception is defective. And since the no-
tion of R-error is thoroughly general, it can be used to explore S’s achromatic
color constancy for any pair of reflectances, under any set of conditions. It
might seem possible, then, to say most everything worth saying of the defi-
cient perceiver’s visual competence without appeal to the more troublesome
idea of an L-error.
Such considerations, in fact, raise questions about the importance of having
a notion like L-error on hand. For what was just said about the lightness
deficient holds largely for normal perceivers. Before S can be certified to be a
normal perceiver, S’s R-errors must be examined. But once we have mapped
out S’s successes and failures in matching reflectances, is there really a need
for the concept of L-error in the study of perceptual constancy?
The prospect of not having to deal with L-error and the question, “How do
things look to subjects?,” will strike many as a welcome relief. By so doing,
psychophysics is nicely externalized, if not behavioralized. On one side, there
is lightness difference defined solely in terms of physical reflectance. On the
other side, there are people’s overt judgments of matching. Nowhere does
concern about the qualitative aspects of subjective experience obtrude.
The problem with abandoning “looks talk,” however, is that along with
gains in simplicity and methodological purity there are seeming losses. Re-
call the felt need to say something richer about S’s perceptual experience in
order to pin down the source of R-errors, or to determine whether the target
appears as it really is, or to indicate when S’s matching judgments were right
by accident. Setting standards for both CI and normal vision appeared to pro-
vide the wherewithal to account for many of these aspects of achromatic
colour constancy.
Nevertheless, talk of how things look to individual perceivers seems to in-
troduce an additional subjective element into the study of lightness. And the
need to relativize the specification of the correct look, to normal perceivers
and ideal conditions, may strike many as too high a price to pay in order to
make invidious distinctions among perceptual appearances.
Reliable methods
Given these worries, an alternative approach may be worth exploring. Much
of the explanatory mileage achieved from the correct look concept can be obtained
by other means, by appealing to a notion of reliability. Consider the
cases of accidentally correct matching judgments. Although S’s judgments
are sometimes accurate in contrast illusion conditions and in poor illumina-
tion, these comparison conditions are generally not good ones for lightness
evaluation. In both examples, S gets things right using an unreliable compar-
ison procedure. And therein may lie good reason for calling these judgments
accidents. At the same time, the correct judgments a perceiver makes when
both targets are under CI are not accidental, since this set-up is, by and large,
reliable. A similar approach may be taken to the task of pinning down blame
for R-errors. If x = y and Cix ≠ CIy, fault can reasonably be attributed to Cix, as
long as comparing targets under CI is a reliable method for making lightness
discriminations, and Ci is not.
The need to appeal to the notion of a “reliable method,” nevertheless, does
raise serious doubts about the whole idea of a correct look. For suppose
lightness discrimination were at a maximum under two different sets of
conditions, CI and CI*. Both methods would be reliable, yet targets of the
same reflectance might not match under these conditions (i.e. if x = y, CIx ≠
CI*y). In these circumstances, there would be no basis for claiming CIx
versus CI*y is the correct look, and no basis for assigning the R-error in their
failure to match.9
Similar considerations serve to loosen intuitions about the connection
between accidental successes and the notion of the correct look. For suppose
x = y and Cix = CIy, but the reason they match is an accident. Ci involves two
non-optimal conditions that, in this case, happen to cancel each other out;
for example, an unnaturally intense illumination and a background reflec-
tance much greater than x. Since Cix = CIy, Cix has the correct look. If, though,
matching judgments in general are not accurate when targets are under Ci,
the method is not reliable. Success under Ci is an accident, albeit, everything
may look as it should.
Ideal conditions
Until now the assumption that the Munsell condition, CM, may be an ideal
condition for perception has gone unexamined. Justification for this claim
needs further examination, for the notion of an ideal viewing condition is not
all that clear. The simplest explication might seem to be in terms of reliable
methods and R-error. A condition is ideal if it is optimally reliable for lightness
discrimination. There is no other condition under which normal perceivers
make fewer R-errors. So understood, optimal reliability depends on the cho-
sen allowable threshold for R-error. For example, two different conditions
may both satisfy the criterion when the range for error is x ± n, but when the
range is narrowed to x, only one may meet the specification. To deal with this
possibility, it might be preferable to define “optimal” in terms of yielding the
fewest R-errors within the narrowest appropriate reflectance range.
The situation, though, could be more complicated. One condition may
lead to fewer errors when x ± n is the allowed range, while resulting in more
error when the range is narrowed to x. At the same time, the error rate for both
methods could be considerably higher than it is with the wider range x ± n.
There are trade-offs between error reduction and precision. Thus there may
not be a unique characterization of optimality, and there may be more than a
single condition meeting any optimality standard adopted (see Helson 1943).
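The dependence of optimality on the allowed range can be illustrated with invented data. The trial records below are made up solely to show the logical possibility; they report no experimental finding. Reflectances are integer percentages, each trial is a triple (x, y, judged_match), and the names `COARSE`, `FINE`, and `allowed_range` are hypothetical.

```python
def r_error(x, y, judged_match, allowed_range):
    """R-error with a tolerance: a missed difference smaller than
    allowed_range is excused; allowed_range = 0 excuses nothing."""
    diff = abs(x - y)
    if judged_match:
        return diff > 0 and diff >= allowed_range
    return diff == 0


def error_count(trials, allowed_range):
    """Number of R-errors in a list of (x, y, judged_match) trials."""
    return sum(r_error(x, y, m, allowed_range) for x, y, m in trials)


# Made-up judgments of the same four target pairs under two conditions.
COARSE = [(30, 30, True), (30, 31, True), (40, 41, True), (50, 50, True)]
FINE = [(30, 30, False), (30, 31, False), (40, 41, False), (50, 50, True)]

for allowed_range in (2, 0):
    print(allowed_range,
          error_count(COARSE, allowed_range),
          error_count(FINE, allowed_range))
# With the wider range (2) the coarse condition yields fewer errors (0 vs 1);
# with the range narrowed to 0 the ranking reverses (2 vs 1).
```

Which condition counts as "optimally reliable" thus flips with the tolerance chosen, which is the sense in which no unique characterization of optimality need exist.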
Justification of a particular viewing condition as ideal, depends, therefore,
both on the criterion of optimality selected and on empirical findings about
how well the condition fares in competition with other viewing conditions.
And no condition may be unique in meeting these demands. Leaving final
resolution of these matters aside, is it reasonable to assume the Munsell con-
dition will qualify? One problem with this assumption is that lightness dis-
crimination is thought to be somewhat better when the illumination is
higher than it is under the Munsell condition. And this possible flaw with CM
raises an interesting question about the policy of identifying ideal conditions
with those optimal for lightness discrimination.
Discrimination might turn out to be best when the level of illumination is
well beyond that ordinarily encountered in daylight or in typical artificial
light. Or R-error could be least when targets are viewed in some specially pre-
pared non-white light or against a specially prepared background. Were this
the case, the optimal and hence ideal condition would be a condition seldom,
if ever, found in everyday perceptual tasks.
The Munsell condition might still emerge as ideal if typicality considerations
are taken into account. Those viewing conditions securing better discrimination
than CM may be unusual enough to be eliminated from consideration.
For practical everyday use, there may be no point in specifying as ideal a con-
dition hardly ever encountered in everyday lightness judgment tasks. Justi-
fying CM as ideal would, nevertheless, remain problematic, but now for a
different reason. It is hard to claim that the Munsell condition is itself very
typical. The precise lighting, background, viewing distance, and viewing angle
specified are not those in which we usually find ourselves. Perhaps it could be
argued that CM, although not ecologically prevalent, is a good representative
of more ordinary conditions. Incorporating this idea would, of course, con-
siderably complicate the analysis of ideal conditions and further relativize
an account of error.
Standards
The distinction between reliable and unreliable methods was introduced to
handle intuitions about lightness perception while avoiding various difficulties
with the notion of a correct look. Adopting this approach, however, does not
eliminate the need to appeal to perceiver-relative standards. Criteria for an
ideal or reliable condition make reference to normal perceivers and optimal
set-ups (which may depend on notions of “typicality” or “representative-
ness”). Would not perceptual theory be better off if even these vestiges of rel-
ativity or non-objectivity were expunged from psychophysics?
Although this goal of purifying the study of lightness of any appeal to stan-
dards or norms of perception may sound attractive, it is misguided. The
underlying rationale for the perceptual study of lightness relies on such con-
siderations. The physical property of reflectance is a concern of psychological
investigation, because differences in reflectance normally result in different
lightness experiences for normal subjects. And useful talk of perceptual error
presupposes standards of correctness, standards that take account of these
norms of subjectivity.
Anchoring evaluation of gray-scale error to reflectance is an empirically
constrained choice, depending on both the nature of human visual capaci-
ties and the interests we have in describing them. It is a reasonable practice,
because there is a fairly robust correlation between levels of reflectance and
the achromatic color experiences of normal perceivers. Study of gray-scale
perception, however, neither requires nor presupposes commitment to the
idea that achromatic color perception is a function of any single dimension of
surfaces. Experience of white, black, and gray, like the experience of chromatic
color, might have evolved so as to depend on more complex or gerrymandered
sets of physical properties. Or the normal visual system could have been such
that it simply split the physical reflectance scale in two. Reflectances above a
certain level are experienced as white and below that level as black. Or the ex-
perienced order could have been circular, with reflectances at the high and low
ends matching one another. If normal perceivers responded to reflectance
in these ways, it would be pointless to define error in achromatic constancy in
terms of deviations from simple reflectance values. The centrality of reflec-
tance in evaluations of lightness perception is only a fact in retrospect. It
emerges from considerations about the standard ways normal visual systems
experience differences in reflectance under assorted viewing conditions.10
Is R-error error?11
Throughout our discussion, R-error has been treated as a comparatively
straightforward case of error, although warnings were issued about this assumption
at the start. The just-concluded section should serve to remind us
that perceptually based standards of correctness obtrude even here. What’s
more, R-error is different from many ordinary cases of error. Subjects in match-
ing experiments are not in any obvious sense trying to measure or compare
physical reflectance values per se. They may have no idea what the term
“reflectance” means. Usually subjects are asked only if the targets match or
match in color, or if the targets are both made of the same material. R-errors,
therefore, need not be errors in terms of the subjects’ own avowed aims. If fail-
ure of lightness judgments to accord with reflectance is to be taken as R-error,
it must be with regard to considerations the experimenter brings to the task,
not ones subjects may be likely to articulate.
The firm conviction that R-error is unconditionally and indisputably error
has its root, I believe, in the widespread acceptance of what has come to be
called the “measuring device metaphor.” According to this metaphor, the vi-
sual system is a device for measuring physical properties of the environment.
More specifically, the function of lightness perception is to determine or mea-
sure reflectance. Subjective gray-scale experience is the imperfect device evo-
lution has given us to measure this physical property. Matching judgments
that do not correspond to sameness or difference in reflectance are failures to
meet the goal or function of lightness perception. It is with respect to this evolutionarily
established standard that subjects make R-errors.
Now I find it difficult to make good sense of claims about the purpose Na-
ture has written into our experiences of the gray-scale, especially when this
supposed aim is assigned normative status. But even if a case could be made
for claims about the real goal or function of lightness perception, nothing
precludes taking alternative stances to error evaluation as well. For other pur-
poses and projects it may be useful to evaluate performance with respect to a
different standard of correctness than reflectance.
In fact, there may be no pressing reason to think of the perceptual expe-
riences involved in R-error as being faulty or erroneous. There is another
option, and it is one I find appealing. Discrepancies between matching
judgments and reflectance values of surfaces may be better understood as discordances
between different ways of organizing our world, in particular,
discordances between phenomenal and physical orderings. Neither way must
be conceived as providing the complete and uniquely true story. R-error might
then be understood to result from discrepancies between two acceptable
versions of our world, one in terms of perceptually based categories and the
other in terms of concepts like reflectance, fashioned primarily for physical
theories of the environment.
Hesitance to adopt such a pluralistic attitude may be traced, I believe, to
residual essentialist metaphysical commitments. Ontologically speaking, it
is presumed, achromatic color is, and has to be, some physical property, like
reflectance. Reflectance is an objective feature of nature, and grouping sur-
faces according to reflectance serves to carve the world at its natural joints.
More phenomenally based ideas of achromatic color are shams or metaphys-
ically second-rate. They do not tell us what achromatic color really is, and,
from a scientific standpoint, they should, in principle, be eliminable. As the
measuring device model maintains, gray-scale experience is merely a fallible
subjective means for finding out about how things really are. Therefore, any
discrepancy between matching judgments and reflectance values is a mis-
take, since reflectance is the correct or true way to categorize surfaces.
Although such metaphysical intuitions are pervasive, I do not think they
should bother the psychophysicist, or, for that matter, anyone else. For there
is no reason to assume that there can be but one ultimately correct organiza-
tion of the world, or that the physicist’s analysis of achromatic color is onto-
logically privileged. The notion or notions of achromatic color needed for
physics may differ from those that best serve the needs of psychophysics or
optometry. These, in turn, may be different from those most suited to meet
the requirements of a carpet manufacturer, a lighting expert, or a museum re-
storer. Such concepts will flourish or fade on the basis of the work they do in
the areas they were designed to serve. The most the physicist, engineer, or de-
sign specialist can do is develop useful ways for categorizing the varied phe-
nomena of achromatic color that prove to be of intellectual or practical
interest. What else could or should be expected?
Claims that only one account of achromatic color can capture its essential
nature and specify what black, white, and gray really are, hinge largely on
preferred philosophical doctrines of essences and reality, rather than on sub-
stantive empirical considerations concerning perception. However, these
doctrines have no priority or pride of place in telling us what Is or is not Real.
Nor do they provide a higher or superior vantage point to rule on the number
or adequacy of alternative conceptions of our world. Indeed, if such philo-
sophical theories occupy any place, it will only be that of another kind of en-
quiry, epistemological or metaphysical, with its own constraints, interests,
and focus (see Schwartz 2000).
Conclusion
In this chapter I have attempted to explore the structure and complexity of
claims about perceptual error in a limited domain. I have zeroed in on a few
notions of “error” that seem to play a role in studies of achromatic color con-
stancy. I have further limited the analysis to matching tasks that do not ex-
plicitly raise issues of ordering. Even so, it seemed possible to talk about error
in quite different ways (e.g. R-error and L-error). And within each of these
types there were competing definitions, yielding conflicting decisions as to
whether a matching judgment is or is not an error.
Although I have explored some of the strengths and weaknesses of various
accounts of error, I have made no attempt to come down in favor of one, or to
dismiss any of the others. There are several reasons for my reluctance to do so.
First, at several places in the analysis there were choice points. For example,
it was left an open question how best to handle appearance identity when
faced with the non-transitivity of matching judgments, or how best to conceive
of reliable methods. Resolution of such issues will have an effect on any
precise specification of error.
Secondly, all the notions of error examined have their difficulties. Each is at
odds with some of our convictions, and no conception is likely to capture all
of our intuitions.
Thirdly, I see no reason to assume there is, or should be, either a single kind
of error or a unique characterization of error within a single kind. Any notion
of error must earn its keep by the service it performs in helping describe, sys-
tematize, and explain the facts of interest in gray-scale perception. This will
depend importantly on the task at hand.
Such a proliferation of error concepts will strike many as unsatisfactory. It
might seem bad enough to have to deal with errors of look in addition to er-
rors of reflectance. It would seem all the more untenable if the very same judg-
ment is classified an error on one account and correct on another. To alleviate
some of these qualms I have proposed, but not developed, the idea that phe-
nomenal and physical accounts of achromatic color may both have a role to
play in enquiry. In turn, discrepancies between these versions need not always
be thought of as errors. Adopting this more pluralistic approach, I believe,
can help deflate or avoid needless controversy and debate. Perhaps, though,
the most important point to emerge from our present study is that when it
comes to questions of perceptual error, things are not black and white.
Acknowledgments
In addition to my discussions with Alan Gilchrist, I have benefited from the
comments of Larry Arend, Margaret Atherton, Dieter Heyer, Dejan Todorovic,
and Paul Whittle.
Notes
1. Although limited to the achromatic case, I believe the analysis has implications for
the study of chromatic colours as well.
2. The symbol = is used throughout not for numerical identity, but for sameness of
stimuli, conditions, or experiences, as understood in studies of lightness perception.
3. In Gilchrist et al. (1999) the notion of error is not general but is relative to Munsell
viewing conditions. I discuss this matter below.
4. x, y, z . . . will be understood to represent either point values or, where appropriate to
the discussion, an agreed upon spread of reflectance values.
5. The Munsell book, a widely used reference work, provides color samples organized
according to a well-specified system of color ordering. (For a discussion of the Munsell
system and others, see Wyszecki and Stiles 1982.)
6. This claim cannot be taken to mean targets in bright illumination match surfaces
with higher reflectance than themselves. Sometimes they will; sometimes they will
not. An x in bright illumination will match a y of lower reflectance, if y is in even
brighter illumination or if y is displayed against an appreciably darker background.
7. The appropriateness of choosing the Munsell condition as standard will be discussed
later.
8. It is, at times, assumed that the chart of Munsell chips serves as a measuring device,
on analogy with the use of the standard meter stick to measure length. Exploring the
pros and cons of this analogy requires more attention than the matter can be given here.
9. Consideration of phenomena like the “crispening effect” (Whittle 1992) in enhancing
discrimination, although important, would further complicate issues and
cannot be explored here.
10. As it is, the simple one-dimensional account of gray-scale experience is the re-
sult of a certain amount of abstraction. If gray-scale phenomena are treated more like
other colors, and in matching tests chromatic near-gray surfaces or colored lights
are used, the picture of what is involved in achromatic judgment and error might be
quite different.
11. The positions and arguments merely sketched in this section are developed more
fully in Schwartz (1996).
References
Byrne, A. and D. R. Hilbert (eds.) (1997). “The Philosophy of color.” In Readings on color.
Vol. 1. Cambridge, MA: MIT Press.
Clark, A. (1993). Sensory qualities. Oxford: Oxford University Press.
Gilchrist, A. (ed.) (1994). Lightness, brightness, and transparency. Hillsdale: Erlbaum.
Gilchrist, A., C. Kossyfidis, F. Bonato, T. Agostini, J. Cataliotti, X. Li, et al. (1995). A new
theory of lightness perception (unpublished).
———. (1999). “An anchoring theory of lightness perception.” Psychological Review 106,
795–834.
Goodman, N. (1951). The structure of appearance. Cambridge, MA: Harvard University
Press.
Helson, H. (1943). “Some factors and implications of color constancy.” Journal of the
Optical Society of America 33, 555–567.
Munsell Color Company (1976). Munsell book of color. Baltimore: Munsell Color.
Schwartz, R. (1996). “Pluralist perspectives on perceptual error.” In Pluralism: theory
of knowledge, ethics, and politics, (eds. G. Abel and H. J. Sankueler). Hamburg: Meiner
Publisher.
Schwartz, R. (2000). “Starting from scratch: Making worlds.” Erkenntnis 52, 151–159.
Whittle, P. (1992). “Brightness, discriminability, and the ‘Crispening Effect’.” Vision
Research 32, 1493–1507.
Wyszecki, G. and W. S. Stiles (1982). Color science: Concepts and methods, quantitative
data and formulae, 2nd edn. New York: Wiley.
Prescript 14
In the spirit of pluralism, this essay argues the need for both phenomenalist
and physicalist accounts of color. It also questions the significance of claims
that one version is epistemologically primary, conceptually constitutive, or
ontologically more basic. Limiting the analysis to achromatic color, here as in
chapter 13, has advantages. It avoids complexities of the optics, physiology,
and psychology of chromatic color phenomena. A disadvantage is that in
avoiding these complexities, it can make problems concerning color seem
more tractable than they actually are. Similarly, dividing theories of (achro-
matic) color into two broad classes, phenomenalist and physicalist, allows for
simplification in presentation and argument, but it, too, can distort. Reliance
on this dichotomy is not meant to suggest that there is a sharp, well under-
stood line of demarcation separating these rough and ready umbrella cate-
gories. Nor is it meant to suggest that one is needed.
It is surprising to hear claims about the physical nature of color, as if there is
a single concept of real color studied in the natural sciences. The assumption
of a unique core conception of phenomenal color is more dubious. Color talk
serves different purposes in physics, chemistry, biology, and engineering. It
speaks to still other concerns in studies of art, color blindness, interior deco-
ration, the manufacture of paint, and psycho-physical color orderings. The
idea that all these uses can be reduced to or shown to supervene on one privi-
leged conception of color is more wishful thinking than justified supposition.
Alternative conceptions of color are legitimate, and objective theoretical and
empirical practices have grown up around their employment.
14 Pluralist Perspectives on Perceptual Error*
Psychophysics, these days, is dominated by the measuring device or photometer
model of perception. On this account, the goal or function of vision
is to obtain information about physical properties of the environment. Phe-
nomena are but a means to this end. Perhaps the simplest example of this ap-
proach is found in the study of achromatic color perception.1 Black surfaces
reflect little light, whites most, and grays varying amounts in between. The
ratio of reflected light to incident light is called “reflectance.” “Lightness” is
the term used for perceived reflectance, the experiential correlate of this phys-
ical property.2 Perception, then, is said to be “veridical” if experience of the
gray-scale “corresponds” to reflectance. Surfaces of the same reflectance must
look alike, and those of differing reflectance must fail to match. It is known,
however, that our visual system does not always work this way. For example,
placed on sufficiently different backgrounds, surfaces of identical reflectance
do not appear the same, while surfaces of unlike reflectance may match. Ac-
cording to the measuring device model, when this happens perception is in
error: things do not look as they should.
In “Avoiding Errors About Error,” I explored technical details of this account.
I suggested that avoiding certain inconsistencies and difficulties required
adopting a less dismissive view of the phenomenal domain. In particular, I
suggested that not all discordances between physical versions and phenome-
nal versions are well-characterized as perceptual error. Reluctance to take
such a more even-handed treatment of the phenomenal rests, I believe, on
misguided metaphysical doctrines, doctrines I here hope to dispel.
To aid with this project, consider the plight of a psychophysicist attempt-
ing to run a typical perceptual experiment on an all too clever subject named
Gwen. Gwen is presented two wood chips. One, (a), is lying on a black back-
ground; the other, (b), rests on a white surface. Gwen is asked if the two chips
appear to be the same color. To the dismay of the experimenter she answers
“Yes and no. The two chips look the same, so, yes, they have the same color
appearance, but taking into account the differences in backgrounds, they
must be coated with paints of different reflectance.” Although Gwen’s seem-
ingly contradictory yes and no reply is readily understood, her answer is not
quite what the psychophysicist is looking for. The problem is Gwen’s percep-
tual experience is assumed to be in error, yet her perceptual judgments each
in their own way seem correct.
In order to force the issue the experimenter rephrases the instructions.
Gwen is asked if the chips perceptually match and is told to respond simply
yes or no. She says “Yes.” Now the psychophysicist feels better placed to accuse
Gwen of making a perceptual error. Gwen said the chips match, but they are
each covered with paints of non-identical reflectance. Notified of her error,
however, Gwen expresses surprise. “Sure, the paints have different reflectance,
I said that before. All I have claimed is that under the conditions of presenta-
tion (a) and (b) have the same appearance. So where is my error, where have I
gone wrong?”
At this stage, it is hard to tell who is more frustrated, subject or experi-
menter. In any case, the test is run one last time. Gwen is instructed to tell if
the chips present the same real color. To the psychophysicist’s chagrin Gwen
replies, “Well, yes and no. They really do appear the same, so they have the
same color appearance. Yet they must be covered with paints of different re-
flectance, so their physical colors are not really identical.”
It should be obvious the dialogue between the cagey subject and the caged
experimenter is going nowhere. As long as Gwen does not claim the chips
have the same reflectance or something similar, she has said nothing false
about the physical layout. She would, of course, have made a mistake if on the
basis of the matching appearances she claimed the chips are covered with
paint of identical reflectance. But likewise, if on the basis of her belief about
this difference in paint pigment, Gwen predicted (a) and (b) will look differ-
ent under the experimental setup, she would also have been mistaken. This
time her error would be with respect to appearance, not reflectance. What’s
more, errors of either sort can have disastrous consequences. The painting contractor,
who, seeing that (a) and (b) match in appearance, uses them interchangeably,
may lose his job. The camouflage novice, who knows the paints
are different, but fails to appreciate that they match in appearance under var-
ious conditions, may lose his life.
As just indicated, appearances can be deceptive when they lead to incorrect
judgments of the reflectance properties of objects. Appearances may deceive
in other ways too, in ways that do not appeal directly to notions like reflec-
tance. Error can arise, for example, if a distinction is made between appear-
ances that match and appearances that are instances of identical qualia. If
reflectance differences are slight, x may match y, y may match z, but x and z
may not match. One approach to the intransitivity of matching is to specify
that qualia are the same if and only if every appearance that matches one
matches the other.3 On occasion then, a subject’s experience of matching can
mislead when taken to entail an identity of qualia.
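The intransitivity of matching, and the identity criterion just stated, can be made concrete with a small sketch. Everything specific in it is an assumption introduced only for illustration: matching is modeled as a bare threshold on reflectance difference, the threshold value and the three reflectance values are hypothetical, and "every appearance" is restricted to a finite pool of samples. Nothing here is offered as an account of how matching actually works in lightness perception.

```python
# Illustrative sketch only. The threshold rule, the threshold value, and the
# sample reflectances are assumptions, not taken from the text or from data.

JND = 3  # hypothetical just-noticeable difference, in percent reflectance


def matches(x, y, jnd=JND):
    """x and y 'match' when their reflectances differ by no more than jnd."""
    return abs(x - y) <= jnd


def same_quale(x, y, samples, jnd=JND):
    """Identity criterion from the text, restricted to a finite pool:
    x and y present the same quale iff every sample that matches one
    also matches the other."""
    return all(matches(w, x, jnd) == matches(w, y, jnd) for w in samples)


# Three surfaces whose reflectance differences are slight.
x, y, z = 50, 52, 54
samples = [x, y, z]

x_matches_y = matches(x, y)                 # True: difference of 2
y_matches_z = matches(y, z)                 # True: difference of 2
x_matches_z = matches(x, z)                 # False: difference of 4
x_y_same_quale = same_quale(x, y, samples)  # False: z matches y but not x
```

On these assumed values x matches y and y matches z, yet x fails to match z, and x and y do not count as the same quale, because z matches y without matching x. A subject who takes the match between x and y to show identity of qualia would, on this criterion, be mistaken.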
Another kind of appearance error involves deviations from social norms.
Suppose under the stated test conditions, although (a) and (b) look alike to
Gwen, they appear differently to most subjects. Gwen may then be deceived
about the appearances she assumes others will have. Similarly, Gwen might
be mistaken in the expectations she has about her own perceptual experi-
ence. Seeing that (a) and (b) match, Gwen may incorrectly think they will re-
main indistinguishable if she views them both on a white background. Other
types of mistakes within and between Gwen’s phenomenal and/or physical
judgments can be significant.
The existence of these multiple ways to get matters wrong should warn,
nevertheless, against automatically treating all discordances between physi-
cal and phenomenal versions as instances of perceptual error. For certain
purposes, in certain contexts, and measured against certain standards, phe-
nomenal judgments may be out of step with physical descriptions and still be
adequate or correct. They may be just what is required for the task at hand. It
only misleads to insist that matching judgments that do not jibe with de-
scriptions in terms of reflectance show perception is non-veridical and intrin-
sically in error.
The ordering of surfaces in terms of reflectance is one way to organize the
world. The organization of the gray-scale in terms of phenomenal likeness
and difference is another. Obviously, the phenomenal way does not adhere to
the exact and exacting identity conditions reflectance provides for grouping
surfaces as the same. Some differences in reflectance are below the threshold
of detection. What’s more, categories the human perceptual system finds sa-
lient need not respect the boundaries sanctioned by an organization of achro-
matic color in terms of reflectance and related properties. Mistakes do result
when one ordering or categorization is used incorrectly to predict or organize
the other domain. But again, these interdomain errors can go in either direc-
tion and can be equally costly. Recall the cases of the painting contractor and
the camouflage novice.
This is not to deny there are important distinctions between phenomenal
versions and physical versions.4 Nor is it to claim that both sorts of schemes
are equally useful in every area. The differences, though, are largely pragmatic.
The firm conviction of many psychophysicists that any lack of accord be-
tween phenomenal and physical judgments means perception is faulty de-
pends, I think, on a conviction that the physical version, the version in terms
of reflectance, is fundamental. Thus the function of vision must be to deter-
mine reflectance, since it is this physical property, not any phenomenal coun-
terpart, that specifies the way the world really is.
Elaboration and defence of a claim for privileging physics is highly problematic.
There is a vast, inconclusive literature on reduction, theoretical
identity, and supervenience attempting to elucidate a thesis of ontological
priority. Other attempts have sought to establish the superiority of physical-
ist accounts on more epistemological grounds, with little success or even
consensus on approach. I am doubtful these ontological or epistemological
rankings can come to much when not drawn along pragmatic lines. But it is
not necessary to defend this assumption here. Privileging physics is compat-
ible with recognizing the value and need of other schemes of organization.
The issue is doubly irrelevant to psychophysics. The phenomenal ordering
and organization of the gray-scale provides the very rationale for its percep-
tual study. The physical property of reflectance would be of no concern to
psychology were it not for the way our perceptual system responds to it. If
psychophysics is to be an interesting domain of inquiry, psychological phe-
nomena and their accompanying judgments of appearance must be given
their due.
Further impetus for privileging properties like reflectance is the result of
some confusions concerning the subjective/objective distinction. Science does
strive to be objective, and so seeks to distance itself from biases and influences
that can intrude upon the quest for knowledge. Claims of post-modernists
aside, science is more than making up stories that are subjectively persuasive.
Theories must face the evidence and account for it in ways that meet stan-
dards of consistency, relevance, explanatory cohesion, simplicity, etc. And
even this is not enough, if a competing theory does the job better. Such
methodological scruples, however, do not preclude studying the structure of
appearances. There is, after all, a distinction between objectively studying
the subjective (i.e. gray-scale experience) and subjectively (i.e. unconstrained
by scientific standards) investigating anything.
Often a different but related reason is given for not paying heed to the
framework of subjective appearances. The claim is not that the domain resists
objective investigation. Instead, it is argued that by their very nature subjec-
tive properties do not tell us how the world objectively is. They offer a picture
of what the world is like to us, but not what the world is like in and of itself. The
latter is only provided by a framework that is objective in the sense that it
does not rely on categories or concepts shaped by human subjectivity.
From this standpoint, it seems quite natural to adopt the measuring device
model. The subjectivity of the realm of phenomena, its dependence on the
subject, stands in the way of direct contact with the objective world. When
there is discordance, then, between phenomenal matching and reflectance, it
follows that visual experience must be blamed. Only the physical version pro-
vides a picture of how reality is unprejudiced by the biasing impositions of
human perception and cognition.
This argument for privileging the framework of reflectance over a phenom-
enally oriented scheme can not sustain serious scrutiny.5 Parceling our world
into categories based on reflectance is a particular way to order and organize
our encounters with the environment. It provides an account that is very use-
ful in a wide variety of contexts. Yet, for all this, it is still a version and only
one version of our world. It is not the world itself, nor a mirror reflection
of the world as it is, in itself, pre-sorted or divided. Categories and orderings
based on reflectance are as much constructions of the mind as the shades of
gray to which they give rise.
When we evaluate phenomenal judgments in terms of their accord with re-
flectance values, we are pitting two versions against one another. We are not
testing or measuring appearances against what’s there, plain and simple. For
what could this amount to? We are, rather, comparing phenomenal judg-
ments against those physics has to offer. Lack of agreement between the
schemes is not to be understood as the failure of perception to correspond
with how things are tout court.
If the narrow-minded Realism of the measuring device model misdirects
inquiry at one end, unconstrained Idealism threatens at the other. Indeed,
fear of landing in this latter camp makes it impossible for many theorists to
give up the idea of a version-independent world waiting to be carved at its
ready-made joints. These fears, though, are unnecessary. The Realist’s claim
that there can be a version that describes the world as it really is, independent
of the way it is conceived by any version must be dismissed. It lacks coherent
content or ends up postulating a Kantian realm of things-in-themselves hav-
ing no role to play.
That theories cannot be tested against an unconceptualized world, against what’s
there, does not mean our constructions are unconstrained, that all accounts
are equally good, or that predictions and proposals cannot be evaluated for
truth or correctness. The categories used to order the world must do work
to earn their keep. Versions that do not organize the environment in ways
that serve intellectual and practical needs, as well as meet relevant norms of
inquiry, have no lasting claim on our understanding or imagination. Moreover,
the thesis that versions are tested against other versions is clearly at odds
with the idea that theories are unchallengeable constructions of the mind.
Nor does the pluralism of alternative schemes of categorization, and the competing
versions they are used to express, preclude setting vigorous standards
and norms.
Notions of error are OK in their place. We do make mistakes within phenom-
enal and physical versions, and discordances between versions are real and
can be misleading. Sympathy for not treating them all as error goes only so far.
It does not extend to denying that versions can be inconsistent, can mislead,
can conflict with better versions, or may not pan out in a host of other ways.
Talk of multiple adequate versions, along with the denial of there being a
version that gets at Reality unfiltered by any human contribution, can be lib-
erating. Unfortunately, the liberty is often misinterpreted. Many, we have just
noted, incorrectly assume pluralism entails there is no way to get things
wrong, that all versions are thus immune to objective criticism. A small, albeit
growing, number of psychophysicists take the opposite extreme, and along
with it reject the measuring device model. They accept the idea that we only
come to terms with the world via our versions of it, but then assume that we
can never really be in touch with Reality. Since all we know are our models or
(re)presentations, we are perforce always trafficking in illusions.
Labelling all our versions, both physical and phenomenal, as illusions may
be a nice trope, yet it does not have much literal punch. We can and do make
distinctions among versions. There is a difference between seeing a chair that
is actually there to be sat on, and hallucinating a pink elephant that is not
there to be fed. If I assert there is a chair straight ahead, I have said something
true that will serve well to guide cognition and behavior. If I claim there is a
pink elephant a few paces away, I have uttered a false sentence, and I am de-
luded. If I continue to see pink elephants I run the risk of being hospitalized.
The cause of my hospitalization is an illusion, the hospital is not.
The obviousness of these last remarks makes the thesis of pervasive illusion
itself seem like an illusion. Why is it, then, that vision theorists succumb
to it? I think the answer is that even proponents of this radical illusionist-
Idealist model harbor unrelenting Realist convictions. They correctly under-
stand we have no access to a world as it is, stripped or independent of the
perceptions and conceptions employed to order and organize it. Neverthe-
less, they cannot give up the idea that there is such a world. But then episte-
mological crisis is inevitable. We have no way of making contact with this
realm of things-in-themselves; all we have to go on are our (re)presentations.
Given that we only perceive such (re)presentational surrogates, we can not
truly be said to see the Real world. All experiences of the environment are thus
illusions. In turn, for all we know or can ever know our theories may be
wholly at odds with the-way-the-world-is.
The solution to this skeptical dilemma is to let go of the Realist intuitions
causing the trouble. There is no escaping our perceptions and conceptions so
as to confront the ready-made world head on, as it really is. For there is no
clear sense what this could be. Any attempt to articulate the nature of such a
confrontation, to fill in the details, will of necessity result in just another ver-
sion, perhaps one from a purportedly more lofty metaphysical perspective,
but a version nonetheless. This inability to step outside ourselves does not
mean our versions are empirically untestable or myths. There are important
distinctions between versions that are illusions and those that are not, between
versions that are fact and those that by intent or inadvertence are fiction,
between versions that are correct versus those that are in error, between ver-
sions that work and those that stand in the way of advancing understanding.
The account of psychophysics being recommended has strong affinities to
pluralist, Irrealist ideas Nelson Goodman has long defended. The position
has much in common, too, with classical Pragmatism. And adopting it does
require sacrificing cherished doctrines. It entails forgoing a quest for cer-
tainty, freeing up views about truth, and tolerating a pluralism of versions.
Still the losses are tolerable, and I, at least, do not see a better option. In psy-
chophysics the main alternatives seem to be either to adopt a Realist measur-
ing device metaphor or an Idealist world as illusion metaphor. I believe
nothing empirically significant hinges on a commitment to either of these
pictures, and a less problematic account of perception may be in the offing
once both are abandoned.
Notes
* This paper was written while I was a Fellow at the Zentrum für Interdisziplinäre Forschung
at the University of Bielefeld. I wish to thank the Center for its support and the mem-
bers of the research group for their input. Several members should recognize sketches
of their own position being examined.
1. The issues to be considered are closely related to current heated debates in the philo-
sophical literature over the nature and perception of chromatic color. (Hardin 1993,
Hilbert 1987, Thompson 1995.) Space limitations preclude my spelling out these
affinities.
2. For more precise specifications see Wyszecki and Stiles 1967.
3. Cf. Goodman 1966, Clark 1993.
4. Talk here and before of a difference between the phenomenal and the physical is not
meant to suggest an ontological or metaphysical divide. Phenomenal versions and
physical versions offer alternative frameworks for description and prediction. (a) and
(b) may be phenomenally the same and physically different, and such cross categoriza-
tions are all that concerns me.
5. I have developed these arguments further in Schwartz 1986 and have explored some
of the ramifications for a theory of spatial perceptions in Schwartz 1994.
References
Clark, A. 1993. Sensory Qualities. Oxford: Oxford University Press.
Goodman, N. 1966. The Structure of Appearance. Indianapolis: Hackett Publishers.
Hardin, C. L. 1993. Color for Philosophers. Indianapolis: Hackett Publishers.
Hilbert, D. 1987. Color and Color Perception. Stanford: Stanford University Press.
Schwartz, R. 1986. “I’m Going to Make You a Star.” Midwest Studies in Philosophy 11.
———. 1994. Vision: Variations on Some Berkeleian Themes. Oxford: Blackwell Publishers.
———. 2004. “Avoiding Errors About Error.” In Colour Perception: From Light to Object,
R. Mausfeld and D. Heyer (eds). Oxford: Oxford University Press.
Thompson, E. 1995. Color Vision. London: Routledge.
Wyszecki, G. and W. S. Stiles. 1967. Color Science. New York: Wiley.

Prescript 15
Parts of this essay started life as comments on Michael Thau’s “What is Dis-
junctivism?” at the 35th Oberlin Colloquium in Philosophy. Both his paper,
only a small part of which was presented, and a version of my comments were
published in Philosophical Studies (120, 1–3, 2004, pp. 193–253, 255–263).
Thau’s paper has two primary aims: (i) a critique of Austin’s attack on Ayer
in Sense and Sensibilia and (ii) a rejection of McDowellian disjunctivism in
favor of Thau’s own solution to the “objects of perception” problem. In chap-
ter 15, I largely leave aside Thau’s paper and focus instead on the framework
of the disjunctivism issue itself. Although whole paragraphs are lifted from
my published paper, this new essay explores issues not touched on and de-
velops lines of thought only indicated.
Disjunctive perplexities about the objects of perception stand in nice contrast
to the issue discussed in chapter 12 on the perception of objects. The latter
continues to provoke discovery of interesting empirical phenomena even
when theoretical claims do not accord well with the notion of an “object”
employed. Qualms with the “objects of perception” debate are different. The
positions defended are constrained minimally, if at all, by studies of vision.
They are instead responsive to the epistemic, linguistic, and metaphysical in-
tuitions of each participant. Everyone gets to champion his or her favored
solution without being much bound by common sense beliefs, empirical
evidence, or substantive theoretical demands. Austin, I think, has it right in
Sense and Sensibilia. Scrap the philosophical staging that gives rise to the issue.
For specific needs and local purposes “object of perception” talk can be clear
and useful, but nothing especially significant follows from these practices.
The main goal of chapter 15 is to support Austin’s effort to deconstruct the
problematic. Themes and arguments encountered earlier in this volume re-
verberate throughout the essay.
15 An Austinian Look at the “Objects of Perception”*
Those who . . . revolt against a dichotomy to which they have been addicted, com-
monly go over to maintain that only one of the alleged pair of opposites really exists at
all. . . . [and then preach] with the fervour of a proselyte a doctrine of “one world.” Yet
what has ever been gained by this favourite philosophical pastime of counting worlds?
And why does the answer always turn out to be one or two, or some similar small, well-
rounded number? Why, if there are nineteen of any thing, is it not philosophical?1
J. L. Austin
I first read Austin’s Sense and Sensibilia at a time when it was pretty much a required
text for anyone wishing to be philosophically informed.2 Like other
readers, I felt that various of Austin’s verbal barbs were not only a
bit condescending but also seemed to miss the mark of their intended target.
I was frustrated, too, by Austin’s brief, end of the book treatment of Berkeley,
as told to him by Warnock. I thought that in focusing on epistemological issues,
Austin, like other critics, failed to appreciate the significant contribution
Berkeley’s ideas made to the scientific study of vision. Still, I found the
book an exhilarating read.
What I liked most about Sense and Sensibilia is that it provided a rationale
for ignoring certain philosophical problems then in vogue while maintaining
a reasonably clear conscience. Austin showed, to my satisfaction at least, why
these metaphysical quandaries were not issues one needed to address or take
a stand on. The way I read and continue to read Austin is that he is not so much
trying to refute the Argument from Illusion and its kin, but, to put the matter
in modern terms, he is trying to deconstruct the whole problematic. Reminis-
cent of James and Dewey before him, Austin thinks that the epistemological
and ontological assumptions that breathe life into these problems of percep-
tion rest on untenable dualisms. He says at the start that “It is essential here,
as elsewhere, to abandon old habits of Gleichschaltung, the deeply ingrained
worship of tidy looking dichotomies” (p. 3). Austin believes the questions
these old habits raise are put-up jobs, best dismissed and surely not worth
arguing over. Thus, he warns “I am not, then—and this is a point to be clear
about from the beginning—going to maintain that we ought to be ‘realists,’
to embrace, that is, the doctrine that we do perceive material things. This doc-
trine would be no less scholastic and erroneous than its antithesis” (p. 3).
Given this understanding of Austin’s project, I have never been very both-
ered by some of his purported missteps or misfired darts. For I do not think
that Austin is best read as offering knock-down counterarguments and
counterexamples to the claims of Ayer and others. Instead, Austin’s main goal is to
challenge the point of the questions asked and the significance of the
conclusions drawn. Absent prior commitments to dubious philosophical
doctrines, Austin cannot see what theses like Ayer’s or their alternatives buy.
This picture of Sense and Sensibilia may help explain why the book does not
get nearly the attention it once did. Those like me, with permission granted
in part by Austin, no longer feel the need to discuss the Argument from Illu-
sion or deal with the problems of sense data and their ilk. Sense and Sensibilia
exerts an influence, but remains in the background and off course syllabi. By
contrast, people in the grip of the problematic are unlikely to find it espe-
cially useful to assign a book that challenges the very import of the topic they
intend to teach. Nevertheless, Austin’s presence lingers, and many of those
wishing to resuscitate the issue believe they cannot simply ignore his
arguments. They feel a need to respond to Austin, if only briefly, before
continuing on their way.
From the start, critics of Sense and Sensibilia maintained that Austin fre-
quently misses his opponent’s point and is guilty as well of some of the very
mistakes he accuses them of making. Similar complaints surface today in ef-
forts to reopen the debates Austin wished to close down. Austin’s treatment
of hallucinations, for example, is one philosophical lapse recently cited. In
Sense and Sensibilia, Austin distinguishes hallucinations from illusions and
other perceptual errors. Illusions and veridical perceptions typically have
physical things as their objects; hallucinations do not. In drawing this
distinction, it is held, Austin makes a costly concession. He is forced to admit
that in hallucinating the object of perception is some ontologically peculiar
ephemeral thing.3 I believe Austin would be quite surprised to learn that he
made or must make such a concession. For I see no reason why Austin would
be inclined to go from “Samantha is having an hallucination of a tangerine
colored elephant” to “There exists a non-physical entity that Samantha is per-
ceiving (or is aware of).” For what could that entity be? I should think Austin
would dismiss the thought that the object Samantha sees is an immaterial ele-
phant and mock the idea that we can make sense of the claim that she per-
ceives or is aware of an appearance.
My confidence that he would resist making any such concession lies in part
in my belief that Austin wishes to deflate the very need for heavy-duty talk
about objects of perception. To appreciate this point, it is helpful initially to
consider Austin’s treatment of the notion “real.” Austin does not deny that
the term has appropriate uses in a variety of contexts and can serve well to
make local points. We do distinguish a pitcher full of real cream from one
containing a synthetic substitute. Yet we also distinguish a real pitcher of
the synthetic stuff from a hologram projection. None of these distinctions,
though, presuppose a tidy, fixed dichotomy between the real and the unreal.
Nor, without questionable assumptions and stage setting, do they entail on-
tologically significant claims about the nature of Reality or the possibility of
coming in contact with It.
Austin’s position with respect to the objects of perception is of a piece with
his treatment of real. With specific goals and contrasts in mind, we can un-
derstand, ask, and get agreement about what object is or is not perceived in
given cases.4 We make these distinctions without endorsing or assuming any
doctrine about the material versus the immaterial—another of the tidy look-
ing dichotomies Austin rejects. The legitimacy of this everyday talk does not
support claims that there is a realm of appearances or sense data. Nor does it
imply that we never see, or see directly, the real world.
Likewise, in everyday discourse it is useful at times to distinguish halluci-
nations from normal perception. Hallucinations are more readily traced back
to drugs or system malfunctioning than to properties of environmental stim-
uli or objects. Common sense and common concerns will take you this far.
But Austin maintains that philosophy is not likely to take you much further,
at least not along a path worth traveling.
Some see Austin’s reluctance to pursue these issues as a sign of his
superficiality. I see Austin as holding that the deeper thought is that there is nothing
deeper to probe. Hence, I do not think Austin’s withdrawal from these philo-
sophical contests is a result of intellectual timidity or laziness. Nor do I trace
his reluctance to an uncritical willingness to accept the dictates of ordinary
language. Austin is well aware that sound scientific discourse frequently moves
beyond and may justifiably contravene everyday talk. That the physicist’s use
of the term “mass” is not that of the masses is no cause for concern. Austin
does tend to put stock in the pronouncements of the O.E.D., but he thinks
there are reasons to do so. Austin believes ordinary language evolves to meet
actual needs, and the subtle distinctions found in the entries of the O.E.D.
can reflect the culture’s efforts to cope with these demands. For instance, the
different dictionary entries for “unintentional,” “accidental,” and “inadver-
tent” are significant, because they capture distinctions that are important in
a number of social and legal contexts. Austin is convinced that all too often
philosophical jargon, unlike scientific, legal, and serious everyday talk, is not
substantively constrained by real needs. It earns its keep taking in the wash of
other equally dubious philosophical vocabulary. The notion of an object of
perception is an illustrative example.
Perplexities over objects of perception have been said to start early with
Plato’s claim in the Theaetetus (160, b) that “whenever I come to be perceiving,
I necessarily come to be perceiving something; because it’s impossible to
come to be perceiving, but not perceive anything.” Once this principle is
adopted, however, questions about the status of misperceptions immediately
arise. In particular, what is it that is seen when a person hallucinates? One
response to the question is to deny its presumption. Hallucinations are not
instances of “real seeing.”5 This move has some support from intuition and
ordinary language. Unfortunately, intuition and ordinary language also en-
dorse conflicting stances. For many, the idea that hallucinations are instances
of seeing (or that seeing is constitutive of the concept of visual hallucination)
is so compelling that abandoning Plato’s principle is hardly worth consider-
ing. After all, hallucinations can be phenomenally indistinguishable from
illusions and veridical visual experiences.
I, like Austin, am not totally clear what the problem of the objects of per-
ception comes to and much less clear what the constraints are for resolving it.
I can imagine pressure or support for a particular answer flowing from work
in visual theory. For example, the claim that perception is a two-step process
in which experienced sensations trigger perceptions has been taken by many
to postulate something akin to uninterpreted objects of perception. J. J. Gib-
son, to name one prominent twentieth-century vision theorist, so under-
stood the model, and his theory of direct perception is meant to challenge it.
(See chapters 1 and 8.) According to Gibson, perception is non-inferential; it
does not depend on interpreting prior sensations. Thus Gibson claims that
his theory of perception supports perceptual Realism. No veil of sensation
stands between the world and perception of the world.6 David Marr’s notion
of a “primal sketch” and his levels of representation model have been thought
to raise comparable issues within computational theories of perception. Con-
cerns such as Gibson’s and Marr’s about the workings of the visual system,
however, seldom play a significant role in the philosophical objects of per-
ception controversies.
I also understand that problems in semantic theory may provide con-
straints on an answer to certain questions about the objects of perception. A
main goal of semantic theory is to assign logical forms to discourse so as to
capture accepted patterns of inference. For this purpose analyzing “see” or
“perceive” as two-place predicates may be best. Semantic questions of logical
form, though, do not seem to be at the core of objects of perception debates,
and it is good that they are not. The issue of logical form, in and of itself, is sev-
eral steps removed from substantive conclusions about the workings of the
world. That “height,” for instance, is treated as a two-place relation between
a person and a number does not provoke metaphysical worries about the
interaction of physical objects with abstract ones. Similarly, early discussions
of the logical form of statements of propositional attitude (such as those of
W. V. Quine and I. Scheffler) make it clear that treating attitudes as two-place
relations between subjects and sentences does not entail anything about a
subject’s possession or use of language-like entities. The same holds for sen-
tences about seeing. That it is logically perspicuous to analyze “Samantha is
aware of/has a thus and so experience” as a relational statement does not en-
tail there is some “thus and so” item that Samantha has on hand to inspect,
experience, or employ in visual processing.
Finally, the objects of perception problem cannot merely be to show that all
visual phenomena may be lumped into a single category rather than a dis-
junction of categories. The aim must be to show what can be better accom-
plished dividing them one way rather than another. For this task, ordinary
language and intuitions of principles do not seem to provide a firm guide. And
even if they did, why should these considerations have much binding force?7
A brief look at the standard tripartite division of visual phenomena into hal-
lucinations, illusions, and veridical perceptions may help indicate why.
Hallucination, it is often said, is distinguished from ordinary mispercep-
tions in that there is no physical object that is being seen. But is this so? In
discussing delusions, Austin mentions that there are two accounts of mirages.
One holds they are influenced by atmospheric refraction (perhaps due to the
presence of mist); the other maintains that this is not a factor.8 Are mirages,
then, hallucinations on the latter account and not hallucinations if refrac-
tion enters into the story? Might the mist itself be the object of perception in
spite of our being totally unaware of its presence? In any case, in hallucina-
tions there may very well be something physical that is seen even in cases
where atmospheric conditions do not intrude, namely, the desert environ-
ment that sets the backdrop for the imagined oasis. So are there two ontolog-
ically distinct objects in such hallucinatory experience, the immaterial oasis
and the material desert landscape?
Puzzles arise as well with accounts of perceptual “filling-in.”9 Apparent mo-
tion phenomena are typically classified as illusions. If in a dark room a square
figure and a circular figure are shown one after the other in time, subjects see
an object move across the spatial gap between them, transforming in shape
along the way. Of course, these apparent motion experiences have external
causes. Less clear is what, if anything, is being misperceived. Is it the square,
the circle, both, or the unoccupied dark space lying between them? If the
last, would that make apparent motion a hallucination? Alternatively, might
it be held that nothing, in fact, is being misperceived?10 (See chapter 7 on
visual supplementation.)
Filling in across the blind spot raises related questions. Light striking the
retina at the blind spot has no appreciable perceptual effect. The filled-in ex-
perience is the same independent of the source of the light that strikes this
part of the retina. The light could be coming from an object corresponding to
the phenomenal supplementation, or from a non-corresponding form, or
from a blank surface. Indeed, the experience will be the same if no light hap-
pens to strike the retina at the blind spot. So are filling-in experiences veridi-
cal in some cases, illusions in others, and hallucinatory in others? Is there a
need to postulate immaterial objects to explain the phenomena? And do any
of these considerations tell for or against Plato’s principle?
The notion of veridical perception is equally fuzzy. Most everyone agrees
that there is an important distinction between seeing things correctly and
seeing them incorrectly. Also most everyone, including Austin, would grant
that we make rough and ready distinctions between getting things right and
getting them wrong. Austin, however, doubts there is a determinate full-
bodied notion of veridical perception underlying these judgments, and I
think there is good reason for his skepticism. It is no easy task to specify how
and to what extent ordinary perception truly grasps the facts or corresponds
to them in content.
In discussing veridicality, we usually have in mind feats of recognition or
categorization. Is the item in front of us a tomato, that over there a twig, not
a snake, and the stick in water straight, not bent? Such tasks, though, consti-
tute only a small part of perceptual activity. Suppose, instead, attention turns
to more metric spatial properties of the layout. People are not all that good at
judging size, shape, and distance in an absolute sense. When the comparison
items are spatially much apart, relative assessments, too, tend to be inaccu-
rate. Does this mean everyday visual experience is rife with misperception
and illusion? Claims of veridicality depend as well on how correctness is mea-
sured. My cognitive estimate of a given distance may be faulty, although I can
throw a ball right to the spot. And even when spatial judgments are on target,
how much is due to perception being veridical and how much to mental cor-
rection? If asked, I will judge that the stick in water is straight. Similarly, if
asked, I will tell you that the person walking away from me remains the same
size (approximately six feet tall) although his appearance grows smaller and
smaller. Yet were the person approaching, not retreating, I am likely to refrain
from making any size judgment until he comes quite close.
Color perception is another area where the issue of veridicality is not free of
difficulty. As discussed in chapters 12 and 13, there are problems in the rela-
tively simple case of achromatic colors (the grays from black to white). The
idea that an experience of a given shade of gray paint presents the gray as it
physically is or as it should be seen is of questionable sense. You always need
a background and there are no neutral backgrounds. Standard lighting con-
ditions, or those used to calibrate the Munsell color charts, are not the best or
ideal ones for discrimination. Also comparative judgments made in certain
setups said to engender illusory color experience can actually aid, not hinder,
discrimination. Yes, in particular contexts, for specific purposes, a rough and
ready labeling of perceptual experiences into veridical, illusory, and halluci-
natory may be of service. It is quite another story to assume that such dis-
course demonstrates that a unique, theoretically useful division of visual
states into veridical perceptions, illusions, and hallucinations is needed or
can be justified in terms of the processes, mechanisms, or functions of vision.
If neither empirical and conceptual considerations of vision theory nor
those of semantic theory substantially constrain solutions to the objects of
perception puzzle, what can? An obvious answer is that constraints can flow
from the demands of epistemology.11 Here again, Austin is skeptical. He
believes it is largely the adoption of habitual, albeit ill-advised, dualisms that
keeps the issue afloat. The analysis of the notion of “perceptual inference”
offered in section 2 of this volume and in VVBT leads me to side with Austin.
Philosophical solutions to the objects of perception puzzle all too often as-
sume something along the lines of a hard and fast given/taken dichotomy.
Most visual experience occurs with some stimulus to the system. I have ar-
gued, in the works cited above, however, that there is no single state or event
in the causal chain that can be deemed the fixed dividing line between input
and output, premise and conclusion, or vision and cognition. (See also chap-
ters 11 and 12.) Trivially, no input on its own is wholly responsible for the
character of visual experience. Visual experience results from contributions
of both the environment and the perceiver, and these contributions are inex-
tricably joined. What the environment gives can have no effect on percep-
tion, unless it is selected and taken by the visual system. This holds no matter
how far out into the environment or how far upstream past the retina one
searches in the causal chain. If inputs cannot be accommodated and put to
work, there is nothing useful on offer. The given of necessity is response de-
pendent; it is determined in the taking.12
There are, no doubt, differences worth noting in the degree to which the
properties of an input constrain the specifics of an output. For instance, in
cases like the mirage oasis the environment minimally shapes the qualities of
the visual experience. Were there an actual physical oasis in full view, the in-
put would have a much greater say in the properties of the output. Neverthe-
less, no place along the causal chain is inherently the point of origin of
perception, and no single output is in principle its final stage. There are can-
didates in-between and beyond, and with further elaboration and changes
in the story the intuitions and categorizations will shift. Of course, where
and when there is a particular theoretical need for a distinction, science
undoubtedly will find or stipulate one.
Does this mean that anything or any stage in the causal chain may be said
to be an object of perception? Not without stretching the bounds of everyday
intuitions and ordinary language practices. But what if we aim higher or dig
deeper in the hope of uncovering what the object of perception really is?
Austin, I believe, would suggest that it is better to abandon the concept object
of perception than to search for an answer. Starting down that line only leads
to trouble: Could it be that we never really perceive a tomato? Strictly
speaking, we only see the front half of the tomato, since no light reaches the retina
from the rest. This, too, may be overreaching, as it is really only the outer
surface of the front half of the tomato that plays a causal role. Before we know it,
the thought arises that what we really (I mean really) see or are aware of im-
mediately is nothing but our own subjective experiences.
As deviant as the last option is from common sense intuitions and ordinary
language, I suspect there are many who will find it at least comforting to be
back on familiar philosophical turf. Surely, there must be some correct answer
to the question “What is it that we perceive?” And it is simply absurd to accept
the reply that any state or stage along the causal chain can count as an object
of perception. I admit unusual, but absurd or false is another matter. Given
the freedom to make up any conceptually possible scenario that strikes our
fancy, I think it likely intuitions can be shifted. Admittedly, I have no conclu-
sive proof that this is so. Nor do I have a convincing argument that substan-
tive epistemological constraints cannot be found to settle the objects of
perception puzzle. A few thousand years of inconclusive debate on the topic
is perhaps my best evidence, and reports of the current state of the discussion
are not encouraging signs of progress.13 Could the real difficulty be that there
are, in fact, nineteen appropriate, unremarkable answers?
Qualms about the ontological status of objects of perception should not
impugn the value of countenancing phenomenal versions, nor undermine
attempts to determine the qualities, character, and orderings of experience.
Moreover, it is indisputable that distinct visual stimuli can trigger phenome-
nally indistinguishable experiences of space, and identical stimuli can be
triggered by an unbounded number of different environmental layouts. (See
chapter 11, figures 11.4, 11.5 and 11.6.)14 The situation is the same with color
perception. Surfaces and lights with quite different physical compositions
produce the same color experiences, and two items that look distinct in color
on some backgrounds will phenomenally match when placed against others.
Such many-one mappings are a pervasive feature of visual perception. There
is no need to appeal to hallucinations, illusions, or intrusive brain stimula-
tion by mad scientists to find cases.
Research on the nature and structure of appearances is a legitimate project,
and it is hard to make sense of much of it without type-identifying experi-
ences in terms of the phenomenal qualities these studies find useful. In addi-
tion, the most natural formal analysis of this appearance discourse is likely to
involve quantification over qualitative states or properties. Must such
everyday psychological and philosophical talk of subjective qualities, though,
provoke the sort of metaphysical dilemmas and epistemological puzzles Austin
wishes to debunk? I do not see why. These problems can gain traction only
if unnecessary claims about “certainty,” “privacy,” and the “conceptual” are
assumed. And the pressure to solve them diminishes as soon as overly
demanding materialist doctrines are put in question.15
Does countenancing phenomenal discourse and properties, nevertheless,
make us vulnerable to the threat of a veil of appearance standing between us
and the world? I am inclined to think that this quandary is bogus, and that it
is largely independent of the stance taken with regard to the objects of per-
ception. There cannot be a phenomenal veil that prevents seeing reality as it
just is, because there is nothing of this sort to see. But it will be argued that for
beliefs and theories to be objectively grounded it is necessary to assume that
there is a world untinged by subjectivity. Without access to a mind/response
independent world there is no way to constrain versions and fend off the
disasters of radical relativism and Idealism. But it would be impossible to
confront this world directly if, as argued above, what is given to experience is
always a function of its taking. Such subjectivity of phenomenal experience
will place a veil between the perceiving mind and unadorned, untouched
reality. So weighty epistemological questions cannot be avoided; they call for
answers. Austin warns us not to be lured by the call, as do the Pragmatists.
This and related skeptical worries rely on accepting tidy-looking metaphysi-
cal dualisms (for example, essential versus non-essential properties, immedi-
ate versus non-immediate experience, scheme versus content, and subjective
facts versus purely objective facts) that there is no need to respect.16 Once more, I
am sympathetic to this Pragmatic/Austinian line. Perhaps all this shows is
that I too am missing the real point.
Notes
* I wish to thank the members of the UWM philosophy faculty workshop for com-
ments and spirited resistance.
1. “Intelligent Behavior: A Critical Review of The Concept of Mind” in Ryle, O. Wood and
G. Pitcher (eds.), New York: Anchor Books, 1970.
2. J. L. Austin, Sense and Sensibilia, New York: Oxford University Press, 1964.
3. See M. Thau, “What is Disjunctivism?” Philosophical Studies 120, 193–253, 2004.

4. Although once the issue is probed much below the surface, problems do arise spelling
out the sense and implications of this everyday discourse. (See chapter 13.)
5. This is a position Thau (2004) explores.
6. For references and earlier discussion of this issue see N. R. Hanson’s chapter
“Observation” in Patterns of Discovery, Cambridge: Cambridge University Press, 1964.
7. I am not denying that linguistic practices and conceptual intuitions can be brought
to bear. I am questioning the significance and force of their verdicts in this case.
8. I deviate somewhat from Austin’s actual mirage discussion. He does not discuss mist
as a factor.
9. I leave aside disputes over the best way to characterize the notion “filling-in.”
10. Note, apparent motion type processes underlie experiences of movement in films,
but in most contexts it is not common to talk of these experiences as misperceptions.
11. For an interesting attempt to formulate a list of epistemological conditions of
adequacy, see S. Sturgeon, Matters of Mind, London: Routledge, 2000.
12. The issue raised is analogous to those long-discussed in visual theory concerning
the proper understanding of the notion of “stimulus.”
13. See L. Bonjour, “Epistemological Problems of Perception,” Stanford Encyclopedia of
Philosophy, http://plato.stanford.edu/entries/perception-episprob.
14. Also see J. Koenderink, “Multiple Visual Worlds,” Perception 30, 2001, 1–7.
15. I do not deny that there are significant problems concerning consciousness that
can be and need to be addressed, along with “what it’s like” worries that need to be
defused.
16. James and Dewey do offer an alternative perspective—a pluralism of useful ver-
sions, none privileged and none representing Reality ready-made. More recently,
N. Goodman advocates such a position in Ways of Worldmaking, Indianapolis: Hackett
Publishing, 1978. I have developed ideas along this line in “I’m Going to Make You a
Star,” Midwest Studies in Philosophy 11, 1986, 427–39 and “Starting from Scratch: Mak-
ing Worlds,” Erkenntnis 52, 2000, 151–159.
Index
Alberti’s Window, 160, 178–179, 183 Sense and Sensibilia and, 241, 243–245
Armstrong, D. M., 18–19 veridical vision and, 248–249
Art Ayer, 241
Alberti’s Window and, 160, 178–179, 183
caricatures, 161, 163, 167, 169, 177–178 Benson, J., 200
Cubists, 151, 161, 163, 167, 169, 177, Berkeley, Bishop, 1, 11
179, 182 color and, 15–16
distortion and, 162 convergence and, 24
occlusion and, 109–110 critics on, 13–14
painting, 151, 160–162, 169–170, 178– dimensionality and, 18–19
179, 183 distance evaluation and, 14–15
photography, 162, 177 An Essay Towards a New Theory of Vision,
picture perception and, 3–4 (see also 2, 13–16, 19, 49, 67, 71–87
Picture perception) heterogeneity and, 55–67
projectivists and, 159–170 immediacy and, 14–17
realism and, 150–154 inference and, 2–3, 103
resemblance and, 148–151 inseparability thesis and, 65–66
station point and, 160–161, 181 intuition and, 18
symbolic paradigm and, 164–168, inverted image and, 18
173–185 Kantian approach and, 22–23
Atomic places, 40 Kaufman model and, 24–25
Auditory stimuli man born blind (MBB) test and, 71–87
heterogeneity and, 56–59 minima visibilia and, 40–49
man born blind (MBB) test and, 82 minimum sensibile and, 35, 37–50
simultaneous sounds and, 57 misunderstanding of, 13
Austin, J. L., 5 Molyneux problem and, 55, 62, 69, 71
Gibson and, 246–247 one-point argument and, 18–19
hallucination and, 247–248 psychic approach and, 15–16
object perception and, 243–253 size perception and, 29–33
real notion and, 245 smell and, 17–19
256 Index
Berkeley, Bishop (cont.) inference and, 97–98

stereoscopic experiments and, 19–25 minima visibilia and, 44–45
visual-motor correlation and, 24 minimum sensibile and, 38–39
Black, M., 144 Munsell condition and, 216, 218,
Blind spots, 248 223–224
Boring, E. G., 109, 111 object perception and, 194
Bower, T. G. R., 24 ontological perspective and, 227
Bransford, J., 134 phenomenal/physical versions and,
Brown, Richard, 209 236–240
Bruner, Jerome, 98 pluralistic perspectives on, 233–240
Bruno, N., 124 reductionism and, 236
Burton, G., 129 reflectance and, 214–227, 233–240
Byrne, A., 212 standards for, 216–217, 223–225
veridical vision and, 211, 233, 235,
Caricatures, 161, 163, 167, 169, 177–178 248–249
Carnap, R., 38 Colour Perception: Connecting the Mind to
Center for Interdisciplinary Research the World (Mausfeld & Heyer), 209
(ZiF), 209, 211 Common sensibile
Certainty, 252 distance and, 59–62
Cezanne, 181 heterogeneity and, 57–67
Child’s Conception of Reality, The (Piaget), minimum sensibile and, 55–57
199–200 number and, 58–59
Circularity, 110–111 shape and, 62–64
Clark, A., 219 size and, 59–62
Cognition, 6 Cubists,
directed perception and, 123–135 picture perception and, 151, 161, 163,
filling-in and, 248 167, 169, 177, 179, 182
haptic pictures and, 174 station point and, 182
inference and, 97, 100–104 surrogate models and, 177–179
object perception and, 191–207 Cue theory, 3, 107
picture perception and, 173–185 occlusion and, 115–116
(see also Picture perception) picture perception and, 163
replete judgment and, 174–176 projective geometry and, 23–25, 162
symbolic paradigm and, 173–185 resemblance and, 152
Cohen, M. M., 130 size perception and, 29–33
Color, 5, 17, 231 stereoscopic experiments and, 19–25
Berkeley on, 15–16 Cutting, James, 3
error and, 212–229, 233–240 directed perception and, 124–135
gray-scale and, 224–225, 233 inference and, 109–110, 121
heterogeneity and, 56, 66 occlusion and, 109–110
hue and, 212
identity and, 219, 236 Danto, A., 173
immediacy and, 16 Denotative reference, 143–145
Index 257
Depth perspective, 3
  occlusion and, 109–120
Deregowski, J. B., 148
Dewey, John, 1, 243
Dimensionality
  Berkeley and, 18–25
  distance and, 18–19
  Kantian approach and, 22–23
  minima visibilia and, 48–49
  projectivists and, 162
  resemblance and, 148
  stereoscopic experiments and, 14, 19–25
  symbolic paradigm and, 164–168
Directed perception
  cognitive processing and, 129
  Cutting and, 124–135
  empirical analysis and, 124–125, 130–132
  fuzzy logic and, 131
  Gibson and, 121, 124–135
  indirect perception and, 123–130
  inductive conclusion and, 126
  inference and, 125–127
  information form and, 123, 125–133
  kinematic, 130–131
  learning and, 121, 125–127
  mathematics and, 124
  metaphysics and, 132–133
  premises and, 125–127
  stimulus adequacy and, 127–128, 132–133
  taking-account models and, 128
Distance perception, 249
  Berkeley on, 14–15
  convergence and, 24
  heterogeneity and, 59–62, 67
  immediacy and, 16–25
  man born blind (MBB) test and, 72, 83, 85
  minimum sensibile and, 40
  occlusion and, 109–120
  projective geometry and, 23–25
  resemblance and, 152
  size perception and, 29–33
  stereoscopic experiments and, 19–25
  taking-account-of-distance (TAD) model and, 29–33
  two-dimensional spatiality and, 18–19
Donagan, Alan, 13
“Dual Coding of Colour, The” (Mausfeld), 209
Duck/rabbit picture, 93

Egyptians, 161, 163, 166–167, 177, 182
Empiricism, 22, 96
Enumeration, 58–59
Epstein, W., 42, 128
Error
  color and, 212–229, 233–240
  existence of, 211
  gray-scale and, 224–225, 233
  ideal conditions and, 223–224
  identity and, 219
  individualistic conceptions and, 219–220
  of look, 215–222
  measurement and, 238–240
  objectivity and, 236–237
  ontological perspective and, 212, 227
  Pragmatism and, 239–240
  Realists and, 238–240
  reflectance and, 214–227, 233–240
  relativism of, 212
  reliable methods and, 222
  solipsism and, 219–220
  standards for, 216–217, 223–225
  subjectivity and, 221–222, 236–237
  terminology for, 213–214
  veridical vision and, 211, 233, 235
  viewing condition and, 218–219
  ZiF group and, 209, 211
Essay Towards a New Theory of Vision, An (Berkeley), 2, 13, 49, 67
  distance perception and, 15
  immediate ideas and, 14–16
  man born blind (MBB) test and, 71–87
  one-point argument and, 19
Evans, Gareth, 80–84
Falkenstein, L., 42
Filling-in, 248
Fodor, J., 134
Fuzzy logic, 131

Geometry
  Alberti’s Window and, 160, 178–179, 183
  heterogeneity and, 62–65
  minima visibilia and, 41–42
  occlusion and, 111–114
  shape and, 62–65, 249
  station point and, 160–161, 178–181
  symbolic paradigm and, 180
Gestaltists, 16, 175
Gibson, E. J., 121
Gibson, James J., 3, 16–17
  directed perception and, 121, 123, 125–135
  inference and, 94–95
  object perception and, 246–247
  occlusion and, 111, 118
  picture perception and, 163
  station point and, 180
Gilchrist, Alan, 209, 211, 213–214, 216
Gilden, D. L., 130–132
Given notion, 98–99
Gleichschaltung, 244
God, 48, 73
Gombrich, E. H., 162
Goodman, Nelson, 1, 4
  error and, 219, 239
  minimum sensibile and, 38, 43–44
  picture perception and, 159, 170, 173, 176
  representation paradigm of, 164–168
  resemblance and, 151
  symbol systems and, 174–175
Gray-scale perception, 224–225, 233
Greeks, 166

Haith, M., 200
Haitians, 177, 182
Hallucinations, 93–94, 247–248
Haptic pictures, 174
Hatfield, G., 42
Hecht, H., 131
Helmholtz, Hermann von, 13–14, 21, 96, 126
Hering, Ewald, 22
Heterogeneity
  auditory stimuli and, 56–59
  color and, 56, 66
  distance and, 59–62, 67
  doctrine of, 57–58
  extension and, 65–66
  geometry and, 63
  inseparability thesis and, 65–66
  interpretation issues and, 64–66
  minimum sensibile and, 55–57, 60–61, 65–66
  Molyneux problem and, 55, 62, 69, 71
  number and, 58–59
  one-ness and, 58–59
  phenomenal location and, 56
  shape and, 62–65
  size and, 59–62
  smell and, 56
  spatial perception and, 57, 66–67
  tactile stimuli and, 57–58, 63–64, 66
  tangibilia and, 56–57, 60–66
  visibilia and, 56–57
Heyer, D., 209
Hilbert, D. R., 212
Hochberg, Julian, 13, 152
Hudson, 148

Idealism, 37, 237–238
Identity, 202–204, 219, 236
Illusion
  color errors and, 213–228
  error and, 233–240
  moon and, 76–77
  occlusion and, 109–120
Immediacy
  color and, 16
  distance and, 16–25
  Kantian approach and, 22–23
  man born blind (MBB) test and, 74–75, 77
  size perception and, 30
  stereoscopic experiments and, 14, 19–25
Immediate ideas, 14–17
Impoverished stimulus, 96–97
Inductive conclusion, 126
Inference, 2–3
  abandoning of, 98–99
  cognitive states and, 97, 100–104
  color and, 97–98
  conscious manipulation and, 101
  directed perception and, 121, 125–127
  dissolution and, 100
  Empiricism and, 96
  epistemological approaches and, 97–98
  Helmholtz and, 96
  idea of the given and, 98–99
  impoverished stimulus and, 96–97
  intellectual, 103–104
  learning and, 96
  mental operations and, 97, 100–104
  Nativism and, 96
  psychology and, 97–104
  sensory state and, 95, 99
  supplementation and, 99
Inseparability thesis, 65–66
Inverted images, 18, 75–76, 79–80

James, William, 1, 22, 243
Jesseph, D., 37
Journal of Philosophy, 153

Kantian approach, 22–23, 238
Kaufman, Lloyd, 27, 29, 109
  model of, 24–25
  taking-account-of-distance approach and, 31–32
Kellman, P., 111, 190

Language, 169–170
  cognitive reading and, 171
  interpretation and, 184
  object perception and, 191–207
  representation paradigm and, 164–168
  robustness for, 165–166
  symbolic paradigm and, 173–185
Languages of Art (Goodman), 4, 151
  symbolic paradigm and, 164–168
Learning
  inference and, 96
  man born blind (MBB) test and, 72–78
  resemblance and, 146–154
Leibniz, G. W., 72
Levine, M., 110
Light, 157
  Alberti’s Window and, 160, 178–179, 183
  art and, 150–151
  color errors and, 213–228
  directed perception and, 123–135
  filling-in and, 248
  inference and, 97–98
  lightness blind and, 219–220
  object perception and, 191–207
  occlusion and, 109–120
  opaque body interception and, 109–110
  projectivists and, 159–164
  resemblance and, 150–152
  as retinal stimulus, 93
  station point and, 160–161, 178–182
  subjectivity and, 221–222
  symbolic paradigm and, 178–182
Locke, J., 40, 72
Luce, A. A., 37–38, 40

Mach, E., 72
Man born blind (MBB) test
  auditory stimuli and, 82
  Berkeley and, 85–87
  distance and, 72, 83, 85
  Evans and, 80–84
  immediacy and, 74–75, 77
  initial experience and, 73–74
  inverted images and, 75–76, 79–80
  learning and, 72–78
  Leibniz and, 72
  Mach and, 72
  Mill and, 72–73
  necessary connections and, 72–73
  olfactory stimuli and, 74
  perspective and, 76
  phenomenal ordering and, 74–75
  Schwartz and, 84–85
  shape and, 83–87
  size and, 83
  spatial perception and, 75–78
  tactile stimuli and, 74–75
  thought experiments and, 71–72
Marr, David, 126, 130, 247
Massaro, D. W., 130–131
Mathematics, 41–42, 55, 60. See also Geometry
  directed perception and, 124
  heterogeneity and, 58–59
  object perception and, 203
Mausfeld, R., 209
Metaphysics, 132–133, 227
Mill, J. S., 72–73
Minima tangibilia, 60
Minima visibilia
  characterization of, 40
  color and, 44–45
  dimensionality and, 48–49
  experience and, 44–47
  geometry and, 41–42
  heterogeneity and, 60–62
  intersubjective comparisons and, 47
  phenomenal location and, 42–43
  shape and, 43–44
  size perception and, 45–47
Minimum sensibile, 35
  audition and, 38
  characterization of, 40
  color and, 38–39
  conceptualizing of, 37–38
  distance and, 40
  experience threshold and, 41
  field magnitude and, 39–40
  geometric points and, 41–42
  heterogeneity and, 55–57, 60–61, 65–66
  judgment and, 38–39
  metrics for, 39–40
  minima visibilia and, 40–49
  orientation and, 41
  role of, 40
  sensory qualities and, 38
  smell and, 38
  spatial perception and, 41
  tangibilia and, 38
  taste and, 38
Mirages, 247–248
Moked, G., 37
Molyneux problem, 55, 62, 69
  Evans and, 80–84
  man born blind (MBB) test and, 71–87
Mona Lisa, 181
Moon illusion, 27, 76–77
Movement, 93, 181–183
Munsell condition, 216, 218, 223–224
Music, 166, 174
  heterogeneity and, 58–59
  resemblance and, 145–147
  symbol systems and, 145–147, 166, 174
Mystery of the Moon Illusion, The: Exploring Size Perception (Ross & Plug), 27

Nativism, 96
Necker cube, 93
Number, 58–59

Oberlin Colloquium in Philosophy, 241
Object perception, 5
  animals and, 193–194
  Austin and, 243–253
  body concept and, 193–200
  causality and, 250–251
  computational tasks and, 192–193
  constancy for, 198–199
  debate over, 241
  developmental perspective and, 197–198
  encoding and, 197, 205
  filling-in and, 248
  hallucination and, 247–248
  identity and, 202–204, 206
  infant experiments and, 191, 199–202
  objecthood concept and, 191–207
  occlusion and, 109–120
  ontological objects and, 189–195
  Piaget and, 199–201, 206
  primal sketch and, 247
  Quine and, 189, 192–195, 205–206, 207n7
  resemblance and, 143–155
  symbolic logic and, 205
  veridical vision and, 248–249
Occlusion, 3, 107, 120
  art and, 109–110
  circularity and, 110–111
  completeness and, 114–115
  cue theory and, 115–116
  geometry and, 111–114
  Gibson and, 111, 118
  interpolation and, 117
  interposition and, 111–114
  judgment of, 110–111
  observer’s angle and, 114
  opaque body interception and, 109–110
  optical analysis of, 111–119
  supplementation and, 117–118
  visible surfaces and, 117
Olfactory stimuli, 17–19
  heterogeneity and, 56
  immediacy and, 74
  man born blind (MBB) test and, 74
  minimum sensibile and, 38
Ontology
  Austin and, 243–253
  error and, 227
  object perception and, 189–195
Optics. See Light
Optic writers, 31
Orientation, 41

Painting, 151, 161–162, 181
  Alberti’s Window and, 160, 178–179, 183
  trompe l’oeil, 178, 183
Palmer, 109, 111
Perception
  appearance and, 2
  Berkeley and, 1–2, 11–26, 29–33
  color and, 15–16, 224–225 (see also Color)
  convergence and, 24
  cue theory and, 162
  depth, 3, 109–120
  dimensionality and, 18–19
  directed, 121–135
  distance, 14–17 (see also Distance perception)
  error and, 211–228, 233–240
  filling-in and, 248
  fuzzy logic and, 131
  hallucination and, 247–248
  immediacy and, 14–17
  inference and, 2–3, 95–105
  inverted image and, 18, 75–76, 79–80
  left-right ideas and, 18
  man born blind (MBB) test and, 71–87
  minima visibilia and, 40–49
  minimum sensibile and, 35
  object, 5, 191–207, 243–253
  occlusion and, 109–120
  olfactory, 17–19, 38, 56, 74
  one-point argument and, 18–19
  picture, 3–4 (see also Picture perception)
  primal sketch and, 247
  projectivists and, 159–164
  pure, 162–163
  reality and, 2, 5–6
  size, 29–33
  spatial, 3, 16 (see also Spatial perception)
  stereoscopic experiments and, 14, 19–25
Perception of the Visual World, The (Gibson), 95, 111
“Perceptual Learning: Differentiation or Enrichment” (Gibson & Gibson), 121
Perspective
  Alberti’s Window and, 160, 178–179, 183
  Berkeleian, 1–2, 11–26, 29–33
  distortion and, 181–182
  inference and, 95–105
  linear, 181
  man born blind (MBB) test and, 76
  projectivists and, 159–164
  spheres and, 181
Phenomenology, 35
  measurement and, 237–238
  reflectance and, 233–240
  subjective/objective distinction and, 236–237
Philosophical Commentaries (Berkeley), 37–42, 60
Philosophical Studies, 241
Philosophy, 176
  Austin and, 243–253
  Berkeley and, 13–25, 37–38 (see also Berkeley, Bishop)
  immediacy and, 14–17
  inference and, 97–98
  inseparability thesis and, 65–66
  picture perception and, 3–4 (see also Picture perception)
  Plato and, 246, 248
  reality and, 2, 5–6
  sensory states and, 13–14
  stereoscopic experiments and, 14, 19–25
Photography, 162, 177
Photo-realists, 179
Piaget, J., 199–201, 206
Picasso, 151
Picture perception, 3–4, 141
  Alberti’s Window and, 160, 178–179, 183
  cue theory and, 162–163
  distortion and, 162, 181–182
  Gestalt switch and, 175
  haptic pictures and, 174
  innocent eye and, 169
  learning and, 147–151
  linear perspective and, 181
  movement and, 181–183
  projectivists and, 159–170
  realism and, 150–154
  replete judgment and, 174–176
  representation paradigm and, 173–185
  research state in, 168–170
  resemblance and, 143–155
  robustness for, 162–165, 181–182
  spheres and, 181
  station point and, 160–161, 178–182
  surrogate models and, 176–181
  symbolic paradigm and, 164–168, 173–185
  taking-account models and, 163
  transfer and, 149–154
  viewer location and, 178–181
  visuality and, 173–174
Pirenne, M., 151, 181
Pitcher, George, 13–14
Pittenger, J. B., 129
Plato, 246, 248
Plug, C., 27
Pluralism, 231
  error and, 233–240
“Power of Pictures, The” (Schwartz), 153
Pragmatism, 239–240
Privacy, 252
Proffitt, D. R., 130
Projectivists, 159, 170
  cue theory and, 162–163
  dimensionality and, 162
  distortion and, 162
  innocent eye and, 169
  pure perception and, 162–163
  robustness for, 162–164
  station point and, 160–161
  symbolic paradigm and, 164–168
  taking-account models and, 163
Psychological Review, 121
Psychology
  inference and, 97–104
  minimum sensibile and, 37–50
  object perception and, 189, 191–207
  picture perception and, 3–4, 143–155, 159–170
  projectivists and, 159–170
  reality and, 2, 5–6
  resemblance and, 143–155
  symbolic paradigm and, 164–168, 173–185
Psychophysics, 233
  color and, 233–240 (see also Color)
  error and, 211–229
  reductionism and, 236
Pylyshyn, Z., 134

Quine, W. V.
  logical statement form and, 247
  object perception and, 189, 192–195, 205–206, 207n7, 247

Realism
  Alberti’s Window and, 160, 178–179, 183
  art and, 150–154
  Austin and, 245
  measurement and, 238–240
  station point and, 160–161
Reality, 2, 5–6
  immediate ideas and, 14–17
  inference and, 97–98
  measurement and, 237–238
  mirages and, 247–248
  object perception and, 189–207
  ontological objects and, 189–195
  size perception and, 29–33
  taking-account-of-distance (TAD) model and, 29–33
Reductionism, 236
Reflectance
  error and, 214–227, 233–240
  pluralist perspective and, 233–240
Rembrandt, 151
Representation
  comparative judgment and, 143–144
  denotative reference and, 143–145
  resemblance and, 143–155
Resemblance, 4, 157
  arbitrariness and, 146–147
  art and, 148–151
  cue theory and, 152
  denotative reference and, 143–145
  dimensionality and, 148
  distance and, 152
  imitation and, 144
  independent criteria and, 143–144
  learning and, 146–154
  light and, 150–152
  likeness and, 144
  music notation and, 145–147
  picture perception and, 143–155
  realism and, 150–152
  traditional approach to, 144–148
  transfer and, 149–154
Reversible figures, 93
Richards, Whitman, 191
Rock, Irvin, 27, 29, 31–32, 128
Rogers, S., 180
Rosen, R., 132
Ross, H., 27
Runeson, S., 129–131
Russell, Bertrand, 14

Scheffler, I., 247
Schwartz, R., 190, 206n6, 207n7, 227
  homogeneity and, 60
  inference and, 126
  man born blind (MBB) test and, 84–85
  minimum sensibile and, 42, 48
  picture perception and, 153, 168–169, 174, 177
Scientific American, 29
Sedgwick, H. A., 129
Sense and Sensibilia (Austin), 241, 243–245
Shape, 249
  heterogeneity and, 62–65
  man born blind (MBB) test and, 83–87
Shaw, R., 134
Shefner, J., 110
Shepperson, B., 198
Shipley, E., 198
Shipley, T., 111, 190
Size perception, 107
  heterogeneity and, 59–62
  illusion in, 32–33
  man born blind (MBB) test and, 83
  minima visibilia and, 45–47
  moon illusion and, 76–77
  optic writers and, 31
  retinal angle and, 29–30
  taking-account-of-distance (TAD) model and, 29–33
Solipsism, 219–220
Spatial perception, 3
  Berkeley on, 16
  dimensionality and, 22–23
  distance and, 16–25
  heterogeneity and, 57, 66–67
  inverted image and, 18
  Kantian approach to, 22
  layout issues and, 174–176
  left-right ideas and, 18
  man born blind (MBB) test and, 75–78
  metric for, 22–23
  minima visibilia and, 40–49
  minimum sensibile and, 41
  occlusion and, 109–120
  one-point argument and, 18
  picture perception and, 175–176 (see also Picture perception)
  projective geometry and, 23–25
  replete judgment and, 174–176
  stereoscopic experiments and, 14, 19–25
  symbolic paradigm and, 173–185
Spheres, 181
Spirits, 47–48
Station point, 160–161
  converse, 181
  Gibson and, 180
  symbolic paradigm and, 178–182
Stereoscopic experiments, 14
  empiricist approach and, 22
  immediacy and, 19–25
  Kantian approach and, 22–23
  qualitative assessment and, 22
Stewart, Justice, 190
Stiles, W. S., 213
Stoffregen, T., 129
Stumpf, Carl, 22
Subjective contours, 93
Sully, James, 14
Supplementation, 99, 107, 117–118
Surrogate models, 176–181
Symbolic logic, 205
Symbol systems, 4
  Alberti’s Window and, 160, 178–179, 183
  animals and, 174
  arbitrariness and, 146–147
  caricatures, 161, 163, 167, 169, 177–178
  cue theory and, 162–163
  Gestalt switch and, 175
  Goodman’s representation paradigm and, 164–168
  haptic pictures and, 174
  language and, 173 (see also Language)
  mimetic representation and, 180–182
  movement and, 181–183
  music notation and, 145–147, 166, 174
  picture perception and, 143–155, 159–170
  projectivists and, 159–170
  replete judgment and, 174–176
  representation paradigm and, 173–185
  resemblance and, 143–155
  robustness for, 162–165
  station point and, 181–182
  surrogate models and, 176–181
  symbolic paradigm and, 164–168, 173–185
  visuality and, 173–174

Tactile stimuli
  heterogeneity and, 56–58, 63–64, 66
  man born blind (MBB) test and, 74–75
  phenomenal location and, 56–57
Taking-account-of-distance (TAD) model, 29–33, 128, 163
Tangibilia
  heterogeneity and, 56–57, 60–66
  inseparability thesis and, 65–66
  man born blind (MBB) test and, 71–87
  minima, 60
  shape and, 62–64
Taste, 74
Thau, Michael, 241
Theaetetus (Plato), 246
Theory of Vision Vindicated (Berkeley), 74, 76
Transfer, 149–154
Trompe l’oeil paintings, 178, 183
Turvey, M. T., 129

Ullman, S., 129
University of Bielefeld, 209

Van Gogh, V., 181
Vedeler, D., 130
Vishton, P., 109–110
Visibilia
  heterogeneity and, 56–57, 60–62
  minima, 44–47
  phenomenal location and, 56
  tangibilia and, 56–57
Vision, 1
  Berkeley and, 11–26 (see also Berkeley, Bishop)
  blind spots and, 248
  directed perception and, 123–135
  error and, 211–228
  filling-in and, 248
  hallucination and, 247–248
  inference and, 95–105
  inverted images, 18, 75–76, 79–80
  man born blind (MBB) test and, 71–87
  minima visibilia and, 40–49
  minimum sensibile and, 38
  object perception and, 191–207, 243–253 (see also Object perception)
  occlusion and, 109–120
  picture perception and, 143–155, 159–170
  qualitative aspects of, 5–6
  representation paradigm and, 173–185
  resemblance and, 143–155
  retinal angle and, 29–30
  sensory state and, 95
  size perception and, 29–33
  stereoscopic experiments and, 19–25
  subjectivity and, 2, 5–6
  symbolic paradigm and, 173–185
  veridical, 211, 233, 235, 248–249
Visions, 93–94
Vision: Variation on some Berkeleian Themes (Schwartz), 1–3, 11
  inference and, 93
  occlusion and, 107, 118–119
  size and, 27

“Way the World Is, The” (Goodman), 164
“What is Disjunctivism?” (Thau), 241
Wheatstone’s stereoscope, 14, 19–25
Whittle, Paul, 209
Wollheim, R., 176
Wyszecki, G., 213

ZiF. See Center for Interdisciplinary Research (ZiF)
