Robert Schwartz
A Bradford Book
All rights reserved. No part of this book may be reproduced in any form by any elec-
tronic or mechanical means (including photocopying, recording, or information stor-
age and retrieval) without permission in writing from the publisher.
MIT Press books may be purchased at special quantity discounts for business or sales
promotional use. For information, please email special_sales@mitpress.mit.edu or
write to Special Sales Department, The MIT Press, 55 Hayward Street, Cambridge, MA
02142.
This book was set in Stone Sans and Stone Serif by Graphic Composition, Inc. and was
printed and bound in the United States of America.
B846.S39 2006
121'.35—dc22
2006041931
10 9 8 7 6 5 4 3 2 1
To my brother Jerold, for his constant love and support.
An unrivaled sibling.
Contents
Sources
Preface
Introduction
2 Size
II Inference
Index
Sources
13. “Avoiding Error About Error,” in Colour Perception: From Light to Object,
eds. R. Mausfeld and D. Heyer, Oxford University Press, 2004.
14. “Pluralist Perspectives on Perceptual Error,” Pluralism: Theory of Knowl-
edge, Ethics, and Politics, eds. G. Abel and H.J. Sandkühler, Meiner Pub., 1996.
15. “An Austinian Look at the ‘Objects of Perception,’” unpublished.
Preface
The writings contained in this volume are all on topics in the theory of vision.
Five are new. The remaining selections have appeared in print, albeit a num-
ber in conference proceedings or volumes not readily accessible. In addition,
many are in books or journals tending to have a readership of either phi-
losophers or psychologists, but not both. I hope this collection can bridge
these gaps. Brief excerpts from my book, Vision: Variations on some Berkeleian
Themes, and two articles sketching ideas further explored there are reprinted.
These excerpts and papers provide both continuity and background to other
readings.
All the published essays appear without significant changes. Citation in-
formation, now available, is given, and typographic errors are corrected,
when caught. Any additions to the published works are indicated with an as-
terisk, and the new material appears in brackets, *[ ]. Each selection begins
with a prescript intended to set the context for the selection. The prescripts
were not part of the original works. Most of the non-historical papers written
since 1996 stem from a project on perception sponsored by the Center for
Interdisciplinary Research (ZiF) at the University of Bielefeld. I gained much
from the year of continuous discussions and debates with members of the
project. It was also a lot of fun.
Acknowledgments of help are found in the individual selections. They do
not reflect, however, my almost weekly conversations with Sidney Morgenbesser
and the insights and enlightenment he offered. The results of his probing
questions and challenges show up in many of the essays. I also profited
from Sidney’s incredible storehouse of knowledge. I never had to rely on
search engines to find references germane to my interests. Sidney was much
more effective. I will miss his philosophical acumen and so much more his
friendship.
Introduction*
The selections in this volume are grouped into four sections. By and large the
pieces can be read independently. There are, however, issues and arguments
that cut across these boundaries. There are, too, commonalities of concern
and approach that run throughout the collection. An overarching commit-
ment to pluralism and irrealism along the lines of William James, John
Dewey, and especially Nelson Goodman is presumed but not explicitly de-
fended.1 I think the advantages of adopting these stances in the study of
vision are considerable. Ill-imagined problems can be ignored, and fruitless
controversies avoided. The assumptions and intuitions they depend on are
fragile. Confronted with conflicting empirical evidence or theoretical needs,
conceptual certainties either crumble or are rendered irrelevant to substan-
tive issues.
The title of this collection, Visions and Versions, is meant to indicate the
cross-currents among the selections and the philosophical presuppositions
that underlie them. The following short summaries of the sections attempt to
highlight these matters.
The view that perception is inferential, and thus indirect, has a long history,
and debate about it has not died down. Berkeley’s position on perceptual in-
ference is obscured by a terminological ambiguity in his writings. Berkeley
appears first to accept and then to reject the claim that vision is inferential.
But the notion of inference he initially countenances is inductive associa-
The appearance of the word “real” or its cognates is a sure sign of trouble. Qual-
ifying a property or formulating a problem in these terms tends to turn reason-
able issues into metaphysical quagmires. True, confronting such conundrums
does make it appear that deeper matters are being engaged. However, more
often than not, the more metaphysically a topic develops the less focused it
becomes. The issue at stake grows unclear and distanced from empirical and
theoretical considerations that can rein in philosophical intuitions. Questions
that start out with real substance are replaced with pursuits resisting
closure. Unfortunately, enticed by the seeming foundational significance of
the real, perceptual psychologists often join the metaphysical quests. In dif-
ferent ways, the selections in this section urge both philosophers and psy-
chologists to resist the temptation.
A common response to the views expressed in section IV is that they miss
the real points of the debates. I do not deny the charge. The central aim of
these papers is to question the statement, empirical content, and, at times,
coherence of the supposed issues, as well as raise concerns about the ground
rules for arguing and settling them.6
Chapter 12 looks at recent psychological research on object perception. It
would seem that a prerequisite for such studies is having a reasonably precise
notion of an object to structure the research. But the notion of an object em-
ployed in much of this work does not seem up to the tasks assigned it. Chapter
13 and chapter 14 examine empirical research and theoretical positions that
depend on assumptions about the essential nature of color or what colors really
are. Reservations are expressed with both the content and goal of these projects.
The first essay in this section, chapter 12, explores issues germane to psycholog-
ical studies of object perception. It is fitting that the final selection returns to the
old chestnut of the philosophy of perception: “What are the real objects of
perception?” Austin’s attempt to dissolve the problematic is clarified and endorsed.
A number of assumptions are primarily responsible for problems examined
in this section:
1. Visual experience is subjective; thus it can only present the world as it ap-
pears, not as it is.
2. There are objective versions of the world that do capture reality in its
mind-independent, ready-made form. One version of these, physics, is basic
and privileged.
Notes
* Historical and contemporary materials on the topics discussed in this collection can
be found in my anthology, Perception. Unless otherwise indicated in the selection, all
references to Berkeley’s writings are to be found in The Works of George Berkeley, Bishop
of Cloyne (9 vols), eds. A. A. Luce & T. E. Jessop, Edinburgh: Thomas Nelson, 1948–57.
2. I have not included in this volume efforts to integrate Berkeley’s theory of vision
with his Idealism and related philosophical concerns.
5. For an account of how pictorial representations can affect conceptions of the world,
see Schwartz 1985. In battles over the role of imagery in cognition, images are usu-
ally identified with pictures and said to function like pictorial representations. I dis-
cuss the implications of the symbolic paradigm for issues concerning this analysis in
Schwartz 1982.
6. In a review of David Marr’s book, Vision, (Schwartz 1985) I mention related reserva-
tions with certain philosophical uses made of his ideas.
References
———. (1978) “Words, Works and Worlds” in Ways of Worldmaking. 1–22. Indianapolis:
Hackett Publishing.
Hecht, H., R. Schwartz, and M. Atherton (eds.) (2003). Looking into Pictures. Cambridge:
MIT Press.
Schwartz, R. (1982) “Imagery: There’s More to it than Meets the Eye,” in Imagery,
N. Block (ed.), 109–29. Cambridge: MIT Press.
———. (1986) “I’m Going to Make You a Star.” Midwest Studies in Philosophy 11: 427–439.
———. (1994) Vision: Variations on Some Berkeleian Themes. Oxford: Blackwell Publishing.
———. (2000) “Starting From Scratch: Making Worlds.” Erkenntnis 52: 151–159.
The most influential theory of space perception in Western thought has been that dis-
tance is not a direct visual sensation at all. Instead . . . memories of the grasping or
walking motions that have been made in the past . . . provide the idea of distance.2
We are not in the habit of observing our sensations accurately. . . . Thus in most cases
some special assistance and training are needed in order to observe these subjective
sensations.4
Pitcher is right when he says that many philosophers have taken it as ax-
iomatic that we have incorrigible knowledge of our sensory states. But Helm-
holtz’s account of our ability to report on our sense experience better reflects
the position of most visual theorists working in the Berkeleian tradition, in-
cluding, I would argue, Berkeley himself. The next quotations provide an-
other striking case of conflicting viewpoints. Bertrand Russell insists that
Berkeley’s theory of vision, according to which everything looks flat, is disproved by
the stereoscope.5
Some years ago it was commonly thought that, thanks to the argument of the Berke-
leyans, aided by experiments of Wheatstone and others, the derivative nature of visual
space was amply demonstrated.6
underlie immediate ideas are, on this score, like those that underlie the out-
put of our kidney or liver; they are entirely organic or physiological in nature.
In much of the literature on vision, what Berkeley calls “immediate ideas” are
also referred to as “sensations.”
Berkeley’s own version of what makes a process mental is closely tied to the
then long prevalent identification of mental states with conscious states.
Mental processes were understood to involve manipulating ideas, which were
themselves assumed to be states of consciousness. In particular, then, the claim
that we do not see distance immediately amounts to the claim that the ideas
of distance, derived from sight, depend on mental operations; that is, they are
brought to mind via intermediate ideas.
As Berkeley notes, the claim that distance evaluation depends in this way
on the registering of pictorial and other cues was widely accepted. It was
thought to be a trivial consequence of the one-point argument. “For distance
being a line directed endwise to the eye, it projects only one point in the fund
of the eye, which point remains invariably the same, whether the distance is
larger or smaller” (NTV 2). But if distance perception is not immediate, which
aspects of vision might fall under the label “immediate”? Here matters have
been hotly contested throughout the modern history of visual studies. It
might seem, for example, that color or neutral color (the black-to-white scale)
are obvious candidates. What color or neutral color we perceive is simply de-
termined by the interplay between the properties of light and the physiolog-
ical nature of our visual receptors. No mental work is needed. Yet this sort of
explanation has its problems.
A piece of coal in sunlight looks black, while a lump of sugar indoors looks
white. The sunlit coal, however, reflects more white light than the sugar.
Treating such phenomena as sensations may thus seem problematic, since
there is no direct correlation between the stimulus intensity and the experi-
enced quality. Roughly, two types of theories have been offered to explain the
phenomena. On the psychic, or cognitive, theory, it is claimed that we im-
mediately experience a sensation that corresponds to the absolute value
of the light. The coal immediately appears white. But then our visual system
takes into account the high level of illumination. This combination of infor-
mation triggers a memory trace of a black quale, which we then experience.
The alternative approach claims that no such mental operations are neces-
sary. According to this view, the stimulus is not the absolute intensity of the
light but the ratio of the light intensities coming from the object and those in
16 Berkeleian View of Vision
its environment. The constant black color of the coal under different illumi-
nation is determined by the constant intensity of the ratios of the stimuli. It is
immediate, a matter of sense.
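The contrast between the two accounts can be made concrete with a small numerical sketch. It is only an illustration: the reflectance and illumination figures are invented, and the function names are hypothetical rather than drawn from any of the theories discussed. It shows that the sunlit coal can send more light to the eye than the indoor sugar, while the ratio of each object to its surround stays fixed across illuminations.

```python
# Illustrative sketch only: all numbers below are invented for the example.

def luminance(reflectance, illumination):
    """Light reaching the eye from a surface, in arbitrary units."""
    return reflectance * illumination

def ratio_to_surround(reflectance, surround_reflectance, illumination):
    """Ratio of the light from an object to the light from its surround."""
    return (luminance(reflectance, illumination)
            / luminance(surround_reflectance, illumination))

# Hypothetical reflectances (fraction of incident light returned).
COAL, SUGAR, SURROUND = 0.05, 0.80, 0.40
# Hypothetical illumination levels, arbitrary units.
SUNLIGHT, INDOORS = 10000.0, 100.0

# Absolute intensity: the sunlit coal outshines the indoor sugar.
coal_in_sun = luminance(COAL, SUNLIGHT)
sugar_indoors = luminance(SUGAR, INDOORS)

# Ratio to surround: unchanged when the illumination changes.
coal_ratio_sun = ratio_to_surround(COAL, SURROUND, SUNLIGHT)
coal_ratio_indoors = ratio_to_surround(COAL, SURROUND, INDOORS)
sugar_ratio_indoors = ratio_to_surround(SUGAR, SURROUND, INDOORS)

print(coal_in_sun > sugar_indoors)
print(coal_ratio_sun, coal_ratio_indoors, sugar_ratio_indoors)
```

On these assumed values the absolute intensities rank the coal above the sugar, whereas the ratios rank it below, which is the feature the ratio account relies on.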
Similar conflicting approaches turn up in discussions of size and other spa-
tial properties. Consider the moon illusion. Although the size of the retinal
image of the moon is the same at its zenith and on the horizon, the moon
seems bigger on the horizon. For Berkeley the number of minima visibilia
is the same, but we read through our immediate ideas and see the moon
differently in the two situations. In recent years, critics of this psychic ap-
proach, most prominently Gestaltists and Gibsonians, have argued that the
visual appreciation of size is simply triggered by higher-order properties of
the stimulus and is not dependent on intermediate sensations of the sort
Berkeley and others propose. Examples of these contrasting approaches, psy-
chic versus organic, could be multiplied, but this is no place to consider the
merits of each.7
If Berkeley’s use of the distinction between immediate and nonimmediate
ideas is continuous with that characteristic of work on vision both before
and after the New Theory, it might best be understood to incorporate the fol-
lowing features:
(1) Immediacy depends on the type of processing involved, not on the kind
of idea. Even to sight, certain cases of color perception, for example, need not
be immediate.
(2) The “immediate” notion does not match up with our ordinary-language
“looks,” “appears,” and “seems” locutions. The sunlit coal looks black and the
moon appears bigger on the horizon, but neither is immediate according to
psychic theories.
(3) What is immediately seen does not correspond to judgments that are
noncommittal regarding how things actually are in the world. We can protect
against factual error by claiming that the cat seems to be three feet away and
not asserting that it is three feet away, just as we can avoid commitment to the
real color of the fire engine by saying only that it looks red. Nevertheless, the
red look for Berkeley is immediate, but the three-feet-awayness is not. And
on the classic accounts of neutral color and size we are not reporting what we
immediately see when we speak guardedly and only say that the sunlit coal
seems to me to be black or the moon seems to me bigger on the horizon.
(4) Immediate ideas of sense did not typically have the epistemological status
they took on in twentieth-century philosophical discussions of the founda-
Seeing Distance from a Berkeleian Perspective 17
tions of knowledge and the mind/body problem. For Berkeley, as well as later
theorists, although our immediate experiences are mental states, we are not
necessarily able to report accurately on them, and they are not incorrigible.
and surfaces must be surfaces of volumes, and volumes are three dimensional.
Now Berkeley denies that objects are immediately seen as three dimensional,
and so he must deny they are seen flat.”9 Nor, I believe, would Berkeley dis-
tinguish the case of location from that of distance in the way Armstrong sug-
gests: “I can see immediately that the man is to the left of the tree, and that
the leaves of the tree are above its trunk (more strictly, all I immediately see
are certain man-like, leaf-like, and trunk-like colored shapes arranged in this
way), but I can not immediately see that the tree-like shape is more, or less,
distant than the man-like shape.”10 Berkeley claims, instead, that our visual
field, like our olfactory field, lacks anything comparable to our ideas of both
spatial distance and spatial direction. With regard to distance, however, “all
agreed.” A point anywhere along a line of sight projects the same point on our
retina whether near or far. There is no presentation of the third dimension per
se in the stimulus and, in turn, in our visual field. There is nothing in our
visual field, for example, that increases in size as the distance of the point in-
creases. *[Note, with respect to Berkeley’s Idealist position, a two-dimensional,
mind-independent world is no more welcome than a three-dimensional, mind-
independent world.]
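The geometric core of the one-point argument can be sketched as follows. This is a minimal illustration under simplifying assumptions that are not in the text: the eye is treated as an idealized pinhole at the origin, the inversion of the retinal image is ignored, and the function names are hypothetical. Points at different distances along one line of sight project to the same image point.

```python
# Minimal sketch: the eye as an idealized pinhole at the origin, looking along z.

def project(point, focal_length=1.0):
    """Project a 3-D point onto the image plane (image inversion ignored)."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)

def point_on_ray(direction, scale):
    """A point on the line of sight; a larger scale puts it farther from the eye."""
    dx, dy, dz = direction
    return (dx * scale, dy * scale, dz * scale)

line_of_sight = (0.25, 0.5, 1.0)          # hypothetical direction
near_point = point_on_ray(line_of_sight, 2.0)
far_point = point_on_ray(line_of_sight, 200.0)

near_image = project(near_point)
far_image = project(far_point)

# The two image points coincide: nothing in the projection varies with distance.
print(near_image, far_image)
```

Nothing in the projected coordinates changes as the point recedes, which is the sense in which the stimulus contains no presentation of the third dimension per se.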
This version of the one-point argument does not depend, as has often been
claimed, on the assumption that distance cues are necessarily ambiguous.
Cues could be unambiguous (e.g., brightness could vary directly with dis-
tance) without affecting Berkeley’s main point here. No matter how unam-
biguously such brightness ideas corresponded to distances, they would not
themselves be ideas of distance. We cannot, therefore, acquire distance ideas,
as we acquire color ideas, on the basis of visual experience alone. A spirit with
sight but no tangible sense could not have our ordinary ideas of space (see
NTV 153–59). Talk of the voluminousness or distance properties of our visual
experience is strictly derivative, reflecting the spatial or tangible significance
we have come to assign to visual phenomena.11
But then, did not the invention of the stereoscope and experiments on reti-
nal disparity show that Berkeley and those who agreed with him were mis-
taken? Many critics have assumed that these findings overturn or severely
challenge Berkeley’s theories. Such claims, however, are particularly puzzling
when one looks at the actual developments in the scientific study of vision. As
Sully reminds us, many prominent theorists (including, to an extent, Wheat-
stone himself) took the stereoscope experiments to support Berkeley’s views.
Why the discrepancy? In order to answer this question, I think it necessary
to separate again Berkeley’s different claims about the nature of distance per-
ception [(i), (ii), (iii) above].
Perhaps the easiest misunderstanding to clear up is the idea that Wheat-
stone’s invention proved that distance perception is immediate. For a long
while it had been known that, within a limited range, objects at different dis-
tances from the viewer project noncongruent images on the retina. Only ob-
jects on the plane of focus strike corresponding points on both retinas; the
retinal projections from all other objects strike disparate points (see Figure 1.1).
What the stereoscope showed was that the disparity of the images did
indeed affect or play a role in distance perception. It did not undermine the
one-point argument; rather, it indicated that there was another cue, retinal
disparity, that vision could and did tap in trying to work out distance rela-
tions. According to most models of binocular vision, this was taken to mean
that the visual system first registers disparity information and then uses it to
derive distance. The model was a two-stage operation, and in this way not dif-
ferent from the nonimmediate processing models found in dealing with pic-
torial and kinesthetic cues to distance.
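The two-stage structure described here can be illustrated with a purely geometric sketch. It is not a model taken from the literature under discussion: it assumes a target on the midline between the eyes, a known fixation distance, and a known eye separation, and all names and values are hypothetical. The first function registers a disparity; the second derives a distance from it.

```python
import math

def vergence_angle(distance, eye_separation):
    """Angle (radians) between the two lines of sight to a midline point."""
    return 2.0 * math.atan(eye_separation / (2.0 * distance))

def register_disparity(target_distance, fixation_distance, eye_separation):
    """Stage 1: disparity of the target relative to the fixated point."""
    return (vergence_angle(target_distance, eye_separation)
            - vergence_angle(fixation_distance, eye_separation))

def derive_distance(disparity, fixation_distance, eye_separation):
    """Stage 2: recover the target distance from the registered disparity."""
    target_angle = disparity + vergence_angle(fixation_distance, eye_separation)
    return eye_separation / (2.0 * math.tan(target_angle / 2.0))

EYE_SEPARATION = 0.065   # metres, assumed value
FIXATION = 1.0           # metres, assumed value

disparity = register_disparity(2.0, FIXATION, EYE_SEPARATION)
recovered = derive_distance(disparity, FIXATION, EYE_SEPARATION)
print(disparity, recovered)
```

The disparity value is not itself a distance. Distance is obtained only in the second step, and only given further information (here the fixation distance and eye separation), which is why such models count as nonimmediate.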
Figure 1.1
Retinal disparity: the distance y–x is less than the distance y'–x'.
In fact, experiments with the stereoscope were used to argue in favor of a
two-stage solution to another problem that was most prominent in Berkeley’s
time and thereafter. This is the problem of accounting for the fact that we do
not see double even though each eye is capable of producing its own visual
experience.12 According to one account, the organic model, we are wired so
that nerve impulses from corresponding retinal points come together and
merge into a single impulse that then travels to higher brain centers, trigger-
ing but a single experience. The fact that objects not on the focal plane do
not project to corresponding retinal points, therefore, poses a challenge to
organic models of single vision. Moreover, workers like Helmholtz thought
they could demonstrate by means of stereoscope experiments that fusion
does not occur at a neural level and that we do have the distinct experiences
associated with each eye. “These experiments show . . . the content of each
separate field comes to consciousness without being fused with the other
field by means of organic mechanisms; and that, therefore, the fusion of the
two fields in one common image, when it does occur, is a psychic act.”13
If the invention of the stereoscope did not demonstrate that distance per-
ception is immediate, did it not at least deal a blow to Berkeley’s further claim
that distance is not a quality of visual experience? Anyone who has looked
through a stereoscope has experienced the difference between the volumi-
nous quality of these pictures in contrast to the flatness, or two-dimensional
quality, of ordinary pictures. So how, in light of this, could Berkeley maintain
that distance is not an attribute of our visual experience?
Berkeley, I think, would not have denied that the stereoscope scenes look
different or are experienced differently from single pictures. He was obvi-
ously aware that in ordinary vision we see distance better, and our experience
seems more voluminous, when we use two eyes. The reason is that in binocular
vision we have powerful, additional cues, for example, convergence, to aid
in assessing distance. The stereoscope showed that there is one more cue,
binocular disparity, that could help. We have noted, too, that Berkeley did not
claim that our visual field was or looked planar. He says, in fact, that we will
derivatively describe as solid, not planar, those visual experiences that we in-
terpret three-dimensionally. Thus, since disparity enhances our appreciation
of distance, it is not surprising that visual experiences that include disparity
among their cues are described, derivatively, as being more voluminous.
Still, though the stereoscope experiments did not refute Berkeley’s posi-
tion, why were they taken by many to support his ideas, in particular, his
claim that vision lacks spatial properties? Here issues are more complex, and
I can only begin to sketch out the considerations that were operative. By the
time Wheatstone invented the stereoscope, perhaps the major schism in vision
research was over the issue of innateness. On one side there were those who,
like Berkeley, claimed that our spatial ideas were derived from sense experi-
ence. On the other side were those who saw themselves as heirs to the “Kant-
ian” tradition and were convinced that we could not acquire our ideas of space
by means of sense. Our ideas of space were an innate imposition of mind. Not
only vision, as Berkeley claimed, but our senses in general were thought to be
inadequate to supply us with our spatial framework. “[T]here is a quality pro-
duced out of the inward resources of the mind, to envelop sensations which,
as given originally, are not spatial. . . . This last is the Kantian view.”14 In turn,
distance perception was not, as Berkeley and others proposed, learned.
On just about every aspect of space perception debates raged over whether
the phenomenon was innate or acquired. The stereoscope experiments, how-
ever, were taken by many prominent researchers to support the “empiricist”
approach on several counts. Two are reasonably nontechnical and worth
mentioning here. First, various experiments were thought to demonstrate the
importance of learning in distance perception, hence challenging innateness
claims. Second, locating in retinal disparity an external physical base for the
fullness, or three-dimensionality, of our visual phenomena meant that it was
that much more reasonable to explain depth perception as dependent on sen-
sory apprehension. It was that much less plausible to assume that spatiality
was a nonsensory imposition of mind. The discovery of the stereoscope
“made the dogma of an innate intuition of space—of space as an inner con-
dition of all experience—less likely than ever before.”15
This is not to say that everyone in the “non-Kantian” camp agreed with
Berkeley that visual experience itself provided no basis for our spatial frame-
work. For example, Ewald Hering, Carl Stumpf, and William James agreed
with Berkeley that our idea of space is not an a priori imposition of mind, but
they rejected the claim that visual experience could play no role in the con-
struction of our spatial ideas. Most radically, James argued that all of our sen-
sations, including odor, taste, and sound, have a voluminous quality that can
serve as a basis for building our conception of space. Still, for James, as well as
most other theorists, distance is not a simple or immediate quality of visual
sensations. James’s claim is only that we can use this sensed voluminousness,
in conjunction with the variations in experience of objects as we move about,
to construct a visual idea of metric space. Moreover, for many researchers the
implications for any theory of perception . . . [especially] how the senses work
together.”20 Berkeley’s views about the interrelations between the senses, how-
ever, are a story for another occasion. *[See chapter 5.]
Notes
This essay is excerpted from a much longer one on Berkeley’s views on distance per-
ception, which, in turn, constitutes the first chapter of my book Vision: Variations on
Some Berkeleian Themes (Oxford: Basil Blackwell, 1994). Phillip Cummins commented
on this essay at the University of Western Ontario’s conference on Berkeley’s Meta-
physics. I hope I have answered some of his questions in my book.
4. Treatise on Physiological Optics, vol. 3, ed. James Southall, (New York: Dover,
1950), p. 6.
5. Human Knowledge: Its Scope and Limits (New York: Simon & Schuster, 1964), p. 51.
7. For an account of many of these, see Julian Hochberg, “Perception, I and II,” in
Woodworth and Schlosberg’s Experimental Psychology, ed. J. Kling and L. Riggs (New York:
Holt, Rinehart & Winston, 1971), pp. 395–550.
8. “Three Kinds of Distance That Can Be Seen or How Bishop Berkeley Went Wrong,”
in Studies in Perception: Festschrift for Fabio Metelli, ed. G. Flores D’Arcais (Milan:
Martello-Giunti, 1976), p. 83. It was Gibson’s own work that did much to challenge the
paradigm and assumptions underlying the traditional claim that distance perception
is not immediate.
10. Ibid., p. 5.
11. In other sections of the New Theory, Berkeley argues that the same holds for size,
shape, direction, and orientation. His claims in these cases, however, do not depend on
the one-point argument in the way his distance thesis does.
12. Berkeley himself does not deal with this problem in NTV.
13. Physiological Optics, vol. 3, p. 499. Again, it was not assumed that the average per-
son was aware of or could report on the intermediate sensations. *[More recent studies
demonstrating stereoscopic effects with pairs of “random dot” displays are a challenge
to these sorts of theories, because the forms seen with the stereoscope are not perceived
when either member of a pair is viewed by itself.]
14. William James, The Principles of Psychology, vol. 2 (New York: Dover, 1950), p. 252.
Whether James and other perceptual psychologists who cite or appeal to Kant correctly
understood the implications of Kant’s position for empirical theories of vision is a real
question. See Gary Hatfield, The Natural and the Normative (Cambridge: MIT Press,
1990), esp. chap. 3, for the claim that many theorists misunderstood the empirical
implications of Kant’s ideas. Hatfield further argues that Kant’s empirical claims about
vision and touch are much like Berkeley’s: “[Kant] makes vision depend upon touch
for its ability to perceive objects in depth, thereby implying the standard Berkeleian
account” (p. 105).
15. James J. Gibson, The Perception of the Visual World (Boston: Houghton Mifflin,
1950), p. 21.
16. Wilhelm Wundt, Lectures on Human and Animal Psychology, trans. J. E. Creighton
and E. B. Titchener (New York: Macmillan, 1896), p. 189.
17. “The Recent Progress of the Theory of Vision,” in Helmholtz on Perception, ed. R. War-
ren and R. Warren (New York: Wiley, 1968) p. 110.
19. Lloyd Kaufman, Perception: The World Transformed (Oxford: Oxford University
Press, 1979), p. 224 ff.
Chapter 2 is excerpted from the first few pages of chapter 2 of VVBT. That
chapter discusses Berkeley’s account of size perception and his criticism of
the “taking account of distance” (TAD) model. According to this model, the
visual system computes physical size by means of a geometrical formula that
relates a measure of the magnitude of the retinal image to a measure of the per-
ceived distance to the object. L. Kaufman and I. Rock are important modern
proponents of the TAD model. In their influential paper on the moon illusion
they claim to refute Berkeley’s account. This selection contains a brief re-
sponse and defense of Berkeley.
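The kind of computation the TAD model posits can be sketched in a few lines. This is an illustration under stated assumptions, not the formulation given by any particular proponent: the function name and the distance values are hypothetical, and the only empirical figure used is that the moon subtends a visual angle of roughly half a degree.

```python
import math

def tad_size(visual_angle_deg, registered_distance):
    """Size implied by a visual angle at a given registered distance."""
    half_angle = math.radians(visual_angle_deg) / 2.0
    return 2.0 * registered_distance * math.tan(half_angle)

MOON_ANGLE_DEG = 0.5   # approximate visual angle of the moon

# Same visual angle, two hypothetical registered distances (arbitrary units).
size_at_shorter_distance = tad_size(MOON_ANGLE_DEG, 100.0)
size_at_longer_distance = tad_size(MOON_ANGLE_DEG, 150.0)

print(size_at_shorter_distance, size_at_longer_distance)
```

With the visual angle held fixed, the computed size scales with the registered distance, so any difference in the distance the system takes into account yields a difference in computed size.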
Later in VVBT’s Chapter 2, little recognized problems with the geometric
assumptions underlying the TAD account put in doubt current versions of
the model. (For issues related to this critique, see Ross, H. & Plug, C., 2002, The
Mystery of the Moon Illusion: Exploring Size Perception. Oxford: Oxford Uni-
versity Press.) In chapter 7 of this volume, comparable concerns provoke re-
thinking the proper understanding of “occlusion” as a cue to depth.
2 Size
In sections 52–87 of New Theory Berkeley considers the question of size per-
ception. “[H]ow is it,” he asks, “that we perceive by sight the magnitude of
objects?”1 Although these sections raise important issues for the theory of
vision, they have received comparatively little examination.2 In part, this is
due to the fact that many commentators assume that the significant philo-
sophical points have already been raised in Berkeley’s discussion of distance
and that nothing new is to be found in these sections. In part, it is also due to a
lack of appreciation of major aspects of Berkeley’s theoretical and empirical
claims and how they fit in with early and current work on size perception.
Some of the more recent neglect of Berkeley’s position, I think, may be traced
to a very popular paper by Lloyd Kaufman and Irvin Rock which appeared in
Scientific American.3 In this paper, Kaufman and Rock claimed to have refuted
Berkeley’s own account of the moon (size) illusion, while showing that the
taking-account-of-distance model (hereafter the TAD model) of size percep-
tion, which Berkeley opposed, is really the correct theory.4 The Kaufman and
Rock paper, however, can prove misleading on a few points. It does not take
into consideration Berkeley’s main criticism of the TAD model; nor does it
deal with one of the problems which Berkeley thought his own account could
solve better than the competing TAD theory.
What is “the” problem of size perception? The basic issue confronting the-
ories of size perception has continued to be conceptualized along much the
same lines as it was in Berkeley’s day.5 While the real, or physical, size of an ob-
ject is independent of its distance from an observer, the size of the image that
the object casts on the retina varies with the distance. Figure 2.1 sets out the
problem as it is typically presented in psychological works on size perception.
When an object of constant size h is moved further from the eye, its retinal
image decreases in size. The angle α which the object subtends, the visual
angle, is directly correlated with the image size. It is usual practice to talk
about the extent of the retinal image in terms of the size of the corresponding
visual angle.
30 Berkeleian View of Vision
Figure 2.1
The size of the visual angle, α, of an object of size h varies with the distance d of the
object from the observer. *[With some simplifying assumptions, h = α × d.]
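The relation stated in the caption to Figure 2.1 can be made concrete with a short numerical sketch. The exact formula below assumes that the object is centred on the line of sight and perpendicular to it; the small-angle version corresponds to the caption's h = α × d. Both assumptions, the function names, and the sample values are illustrative and are not taken from the text.

```python
import math

def visual_angle(h, d):
    """Visual angle in radians for an object of size h at distance d.

    Assumes (for illustration only) that the object is centred on the
    line of sight and perpendicular to it.
    """
    return 2.0 * math.atan(h / (2.0 * d))

def visual_angle_small(h, d):
    """Small-angle approximation, equivalent to h = alpha * d."""
    return h / d

# The same object (h = 10, arbitrary units) viewed at increasing distances:
# the subtended angle shrinks as the distance grows.
for d in (100.0, 200.0, 400.0):
    print(d, visual_angle(10.0, d), visual_angle_small(10.0, d))
```

For objects that are small relative to their distance, the two functions agree closely, which is presumably why the simplified relation is adequate for setting out the problem.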
The problem of size perception, then, is that of explaining our ability to
evaluate magnitude in light of the variability in the size of the visual angles
an object can subtend. Since it was widely assumed that the amount of our
sensed visual field (or, in Berkeley’s terminology, the number of minima visi-
bilia sensed) depends on the extent of the retina stimulated, our immediate
experiences of an object will vary when it is at different distances from us. A
nearby tower will occupy a large portion of our visual field, while the same
tower, viewed from half a mile away, will appear as a speck. Our everyday idea
of an object’s (constant) physical size cannot be identified with each of the
distinct visual ideas we immediately experience when viewing the object from
a variety of distances. Size perception involves a two-step mental process:
our immediate sensation, a function of the amount of the retina stimulated,
and our idea of a constant physical size that this sensation helps to trigger.
According to Berkeley, there is, moreover, no one visual experience that can
be singled out as the correct or veridical visual idea that goes with a given
spatial size.6
By what means, then, are the magnitudes of objects perceived by sight? For
Berkeley, visual extent and familiarity play a role, along with most of the
visual and oculomotor cues cited earlier in his account of the perception of
distance. We have learned to correlate these cues with “real” or tangible mag-
nitude. What is especially important about Berkeley’s model, however, is the
Size 31
way(s) in which it differs from that of the optic writers. The optic writers, too,
held that size perception was not immediate; but they championed a version
of the TAD model of size evaluation. According to this theory, we perceive size
on the basis of an initial or prior evaluation of distance. Given an apprecia-
tion of the visual angle and knowledge of the object’s distance, we can geo-
metrically compute its magnitude. *[According to the TAD model, the visual
system determines/registers the values for α and d, and on the basis of those
measures computes the size, h.]
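The computation the TAD model attributes to the visual system can be sketched in a few lines. This is only an illustration of the model as described above, using the simplified relation h = α × d; the function name and the tower's dimensions are hypothetical.

```python
def tad_size(alpha, d):
    """Size estimate under the TAD model: h = alpha * d (alpha in radians)."""
    return alpha * d

# A tower of size 30 (arbitrary units) viewed from 300 and from 600 units.
# Under the small-angle assumption its visual angle is h / d, so the angle
# halves when the distance doubles.
alpha_near = 30.0 / 300.0
alpha_far = 30.0 / 600.0

# Given the correct distance in each case, the model returns the same size
# from two different visual angles.
near_estimate = tad_size(alpha_near, 300.0)
far_estimate = tad_size(alpha_far, 600.0)
print(near_estimate, far_estimate)
```

The sketch makes plain what the model requires: a value for d must be available before h can be computed. It is this ordering of the processing steps that Berkeley disputes.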
Berkeley agrees with the optic writers that visual size perception is not im-
mediate, but he denies that it involves an initial determination of distance
and subsequent computation of magnitude based on this idea of distance.
Berkeley offers several reasons for rejecting the TAD model. First, he thinks
that introspection does not reveal the existence of processes of calculation
involving angles and distances. Allowing, however, for the vagaries of intro-
spection, this does not clinch the argument for Berkeley. Second, Berkeley
claims that the TAD model cannot account for certain empirical data as well
as his theory can. He spends a large part of sections 52–87 elaborating this
criticism. In particular, he believes that his own explanation of the moon
illusion, one of the most discussed puzzles in vision theory, is better than any-
thing the optic writers have to offer.
I mentioned earlier that Kaufman and Rock claim to have refuted both
Berkeley’s account of the moon illusion and his critique of the TAD model.
Berkeley had maintained that a primary reason for the moon illusion is the
presence of atmospheric vapor, or mist, between the observer and the moon
when the moon is on the horizon. It is the presence of these vapors, not
simply the presence of the terrain, that causes us to see the moon as larger
on the horizon.7 Kaufman and Rock claim that their experiments show that
Berkeley was wrong about the significance of mist and wrong in denying
the importance of the information that the terrain provides when looking
at the horizon moon. Two points missing from Kaufman and Rock’s article
render their remarks about Berkeley somewhat misleading. A major reason for
Berkeley emphasizing the role of mist was his concern to explain the differ-
ences in perceived size when viewing the horizon moon on separate occa-
sions. This is an issue that Kaufman and Rock do not really address. Clearly,
citing the presence of terrain cannot serve to distinguish these cases. Berke-
ley’s deeper complaint against the TAD model, though, was not over which
cues are the most prominent; rather, it was over the model’s account of the
processing that underlies size perception. Berkeley rejected the claim that
size perception depends on the prior evaluation of distance. He did not claim
that the standard “distance” cues do not play a role in the perception of mag-
nitude. On his own theory they do. What he challenged was the appropriate-
ness of labeling these cues “distance” cues, as opposed to calling them “size”
cues. According to Berkeley, the cues serve both functions, and they suggest
magnitude and distance evaluations in the same way. This is not merely a ter-
minological quibble. It marks Berkeley’s rejection of the TAD model’s pro-
posal regarding the processing steps that the visual system actually goes
through in determining size. It is to deny the “psychological reality” of a pro-
cessing stage that incorporates an explicit representation of distance and the
use of this measure to then compute magnitude.
Curiously, Kaufman and Rock point out a difficulty with their own theory
that may be seen to favor Berkeley’s approach. On their TAD account of the
moon illusion, the reason that the moon is said to look bigger on the horizon
is that it is mistakenly perceived to be further away than when it is up above.
Plugging this larger distance value into the formula we use to compute mag-
nitude yields a larger size evaluation for the horizon moon. A major problem
with this explanation, however, is that, if asked to judge the distance of the
moon, people tend to maintain that the moon is further away at its zenith
than it is on the horizon. Quite understandably, many theorists have taken
such distance evaluations to refute the TAD model of the moon illusion. Kauf-
man and Rock attempt to deal with this seeming contradiction to their the-
ory by arguing that although people do make these distance judgments,
these are not the judgments that the visual system relies on in making size
determinations. Such conscious distance judgments depend on an added bit
of “intellectual” reasoning, over and above the initial verdict that the visual
system itself supplies. Kaufman and Rock claim that our visual system really
does see the moon as further away on the horizon than when it is up above,
and that these distance evaluations are fed into the mechanisms of size per-
ception. The difference between these initial distance measures is what ac-
counts for the size illusion. Kaufman and Rock argue, however, that people
then go on to “reason” that since the moon looks bigger on the horizon, it
must be closer. It is such rationalizations that subjects report.8
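The arithmetic behind the TAD account of the illusion can be sketched as follows. The half-degree figure for the moon's visual angle is approximate, and the two registered distances are made-up values in arbitrary units, chosen only to show the direction of the effect; none of these numbers comes from Kaufman and Rock.

```python
import math

MOON_ANGLE = math.radians(0.5)  # the moon subtends roughly half a degree

def tad_size(alpha, d):
    """Size estimate under the TAD model: h = alpha * d."""
    return alpha * d

# Hypothetical registered distances (arbitrary units). On the TAD account
# the horizon moon is registered as farther away than the zenith moon.
registered_zenith = 100.0
registered_horizon = 150.0

size_zenith = tad_size(MOON_ANGLE, registered_zenith)
size_horizon = tad_size(MOON_ANGLE, registered_horizon)

# With the visual angle held fixed, the size estimates stand in the same
# ratio as the registered distances.
print(size_horizon / size_zenith)
```

On this account the illusion requires the horizon distance value to be the larger one, which is exactly what subjects' reported distance judgments contradict.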
In later works, Rock elaborates his own version of this position.9 He main-
tains that what gets used in size perception calculation is not the intellectu-
ally influenced distance value, but what he calls the “registered distance.”
Rock waffles somewhat when it comes to spelling out what registered dis-
tance amounts to. On one reading, it is an unconscious representation of a
specific distance value. Often, though, he talks as if what are registered are
only the (distance) cues themselves, and that they directly influence size. But
if it is registered cues about distance, not a distance value itself, that play a
role, it would seem that Rock has gone a long way towards accepting one of
Berkeley’s central criticisms of the TAD model.
Notes
3. Lloyd Kaufman and Irvin Rock, “The Moon Illusion,” Scientific American, 207 (1962),
pp. 120–31.
4. Ptolemy is often cited as the TAD model’s first proponent, and Helmholtz as its major
modern champion. Both these historical claims have been questioned.
6. See Irvin Rock, An Introduction to Perception (Macmillan, New York, 1975), pp. 71–3,
for some interesting remarks on this matter.
7. Berkeley also points out that posture and angle of regard play a role. Angle-of-regard
theories have been and continue to be among the more popular explanations of the il-
lusion. Berkeley also allows that we ordinarily spend most of our time looking at ob-
jects situated on the ground and in the presence of other things. This too, he says, can
explain why the moon appears differently on the horizon than on the meridian.
8. For an update on where things stand concerning the moon illusion in general, as
well as discussion of the Kaufman and Rock solution, see Maurice Hershenson (ed.),
The Moon Illusion (Lawrence Erlbaum, Hillsdale, N.J., 1989).
match are considered identical in color. In other systems, matching does not
entail phenomenal identity. For example, it often happens that the pair A, B
match and the pair B, C match, but when presented together A is phenome-
nally distinguishable from C in color. Given this intransitivity of matching
judgments, it is possible to treat A and B as different phenomenal colors, even
though when compared directly they cannot be told apart. Matching and
similarity judgements of various kinds also provide data for determining the
phenomenal place order of visibilia. The construction of both color and place
orders depends not only on subjective judgements, but on assumptions about
quality identity and the mathematical mapping conventions employed.
There are advantages and disadvantages associated with adopting these al-
ternative approaches, and the orders derived from them may differ in signifi-
cant ways.
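The intransitivity of matching can be illustrated with a toy model. The sketch assumes that stimuli vary along a single dimension and that two stimuli match whenever their difference falls below a fixed threshold; the values, the threshold, and the identity criterion in `same_quality` (two items count as the same quality only if they match exactly the same items) are assumptions introduced for illustration.

```python
def matches(x, y, threshold=1.0):
    """Two stimuli 'match' when their difference is below the threshold."""
    return abs(x - y) < threshold

def same_quality(x, y, field):
    """One possible identity criterion: x and y match all the same items."""
    return all(matches(x, z) == matches(y, z) for z in field)

# Hypothetical positions of three stimuli on one stimulus dimension.
A, B, C = 0.0, 0.6, 1.2

# A matches B and B matches C, yet A does not match C.
print(matches(A, B), matches(B, C), matches(A, C))

# A and B cannot be told apart directly, but they differ in what they
# match (C matches B and not A), so this criterion treats them as distinct.
print(same_quality(A, B, [A, B, C]))
```

Under a system that equates matching with identity, A and B would instead be counted as one color, which is one way the resulting orders can come to differ.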
On the basis of such orders, other concepts can be defined. For example,
two visibilia will be just noticeably different in color if no other color comes
between them in the order. Two visual field places will be minimally different
if there is no other place between them in the order. These orders also provide
a means for measuring likeness of colors or places. The degree of similarity
may be calculated in terms of the minimal path separating them in the order.
In systems where matching is distinguished from identity, colors or places
that match can have other colors or places lying between them in the order.
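The definitions in this paragraph can be sketched computationally. The example assumes that an order is represented by listing, for each item, the items lying immediately next to it; the four-item order and all names are hypothetical. Likeness is then measured by the number of steps on a shortest path.

```python
from collections import deque

# Hypothetical order: each item is linked to the items directly next to it.
ORDER = {
    "c1": ["c2"],
    "c2": ["c1", "c3"],
    "c3": ["c2", "c4"],
    "c4": ["c3"],
}

def path_length(order, start, goal):
    """Number of steps on a shortest path from start to goal (breadth-first)."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, steps = queue.popleft()
        if node == goal:
            return steps
        for neighbour in order[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, steps + 1))
    return None  # no path: the items do not belong to a common order

def just_noticeably_different(order, a, b):
    """True when no other item comes between a and b in the order."""
    return path_length(order, a, b) == 1

print(path_length(ORDER, "c1", "c4"))
print(just_noticeably_different(ORDER, "c1", "c2"))
```

A breadth-first search is used so that the same function would also serve for orders that are not simple lines, where more than one path may connect two items.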
Relative to decisions about identity and individuation, questions can be
raised about the number of items in a sensory order. Consider again the case
of color. The properties of surfaces or lights that go to determine their physi-
cal color vary continuously, and so the number of physical colors is often said
to be infinite. It is, however, typically assumed that only a finite number of
these physical differences will be discernable in experience. Others will fall
below the threshold detectable by means of the matching paradigm. Such
limitations on human color discrimination provide a basis for the claim that
the number of phenomenally distinct colors is finite. The situation is similar
with respect to visual field places. Although the places in the physical world
may form a continuum, the visual field places need not. On the assumption
that human sensory discrimination is limited, there may be only a finite
number of distinguishable phenomenal places.
Relative to a system of analysis, it is possible to measure visual field magni-
tudes as well. Assorted metrics can be used. One option is to take the entire vi-
sual field as the standard unit and measure phenomenal size as a percentage of
the whole. Another option is to take as the unit of measurement visual places
containing no other phenomenal place as part. These “atomic” places may be
considered the “minima visibilia” of the order. And phenomenal size can then
be specified in terms of the number of minimal places a visibile contains.2
All this is admittedly sketchy, and serious conceptual and technical prob-
lems remain. My aim, so far, has been to offer a framework for locating and
better understanding the issues Berkeley faced. In the following sections, I
will fill in more of the details.
The 30 seconds mentioned in [175] and [296] are a visual angle measure of
the image size an object projects from a specific distance and orientation.
Berkeley is best understood as agreeing that in humans “with the sharpest
eyes” a 30 seconds image may typically be the minimum needed to give rise to
visual experience. The image threshold will be larger for those with less acute
vision. As is clear from [218, 296], technically the threshold for experience is
not to be specified in terms of projected image size, but in terms of the mini-
mal extent of the retina that must be stimulated for perception to occur. The
same sized image will project to more or less of the retina depending on fo-
cusing features and the conformation of the eye. And Berkeley raises the issue
whether these may change with the distance the object is from the eye [296].
In [321] Berkeley asks why a minimum is difficult to imagine, and he an-
swers “because we are not us’d to take notice of ’em singly.” Nothing in a vi-
sual experience itself serves to delineate one MV from another within the
field. MV do not come marked with visible borders, nor are visible places ex-
perienced as having gaps between them. And in general there is no need to at-
tend to them individually, since “they not being able singly to pleasure or
hurt us thereby to deserve our regard” [321]. Berkeley’s definition of MV as ba-
sic perceptual or phenomenal elements is compatible with it being difficult,
if not empirically impossible, to have a visual experience of a single, isolated
visible place.3
MV, then, are best thought of as units of measure, developed for the pur-
poses of describing and ordering sensory phenomena in the visual domain.
As Berkeley claims throughout the Philosophical Commentaries [343, 346, 438–
439, 462–464, 510], they are indivisible. This is not an empirical discovery;
rather it is built into the way the notion of a MV is specified in his system. By
definition, MV are the simplest place elements; they have no constituent
parts. A visual field extent composed of more than one phenomenal place is
not a MV. For Berkeley, too, our sensory systems are finite. Since there are lim-
its to the number of phenomenal places it is possible to distinguish in experi-
ence, there must be only a finite number of visible places.
MV are to be contrasted with the mathematical points found in geometry
[253, 344–345]. A phenomenal line is not infinitely divisible. As opposed to
the points on a mathematical line, there are only a finite number of MV on a
phenomenal line. Luce remarks (p. 140) that this aspect of Berkeley’s doctrine
“conflicts seriously . . . with the traditional geometry.” The claim that the
42 Berkeleian View of Vision
compounds of more basic color elements—a view that had some currency
before and after he wrote. In the case he cites, in order to experience green
it might be necessary for there to be a mix of yellow and blue MV. Together
the MV would appear green, but no single MV could be green or be experi-
enced as green on its own [502].12 Thus Berkeley’s color compounding model
leaves room to ponder whether a MV may not have the color it appears to
have. (See also [242].)
Moreover, related considerations may have given Berkeley reason to think
there could be a need to accommodate the idea that single MV would be phe-
nomenally colorless. Berkeley notes [664] “Colours are not devoid of all sort
of Composition. tho it must be granted they are not made up of distinguish-
able Ideas. . . . Men are wont to call those things compounded in which we do
not actually discover the compound ingredients. Bodies are said to be com-
pounded of Chymical Principles whch. nevertheless come not into view till
after the dissolution of the Bodies. & whc. were not could not be discerned in
the bodies whilst remaining entire.” Experiments might establish that expe-
rienced compound colors require the contribution of more than a single phe-
nomenal place. Although all MV are experienced as colored, it might be best
to think of the individual MV that constitute an experienced compound as
not actually having the elementary composing colors and thus having no
color at all. Indeed, if as Berkeley suggests, all colors are actually compounds,
it might be necessary to assume that a single MV could not be experienced to
have a color independent of the contributions of neighboring MV. Since MV
may not be singularly experienced, however, this claim is consistent with the
idea that no MV can be perceived uncolored. Likewise, it would not prevent es-
tablishing a place order, since construction of a sensory order does not rest on
comparisons and judgements of MV isolated in experience.13
Q: Could sight be enlarged by diminishing the point [175]?
A: Earlier it was mentioned that Berkeley agrees that a retinal image of 30 seconds
may be the minimal size needed to trigger a visibile. The 30 seconds are pre-
sumably the threshold for those with the sharpest eyes. His treatment seems
to allow, though, that if it took less retinal area to trigger a MV, the visual field
could contain more MV. This is what may distinguish acute and dull sight,
not a difference in the size of the MV itself as “others are apt to think” [250].
Note that this does not mean that use of a microscope diminishes the size
of MV or enlarges the visual field. A microscope alters the size of the image
projected and permits seeing smaller things. It does not change the retinal
threshold for triggering a MV. Nor does a microscope make the one and same
item appear physically bigger, since such size estimates depend on more than
visual field magnitude. (See chapter 2.) As we approach a tower, for example,
the visual image grows, yet the tower is perceived as being of a constant phys-
ical size. In a way a microscope exposes us to a different world. We may see
things we did not see before, tiny mites or gaps in a line. Could the visual field
be larger, though, if the retinal threshold for MV were less? The answer here is
yes [219]. In terms of total MV magnitude, the visual field could contain more
minimally discernable points. At the same time, the visual field will not take
in a wider span of physical space. It will only reveal the space in finer detail.
It is important to keep track of these distinctions when considering Berke-
ley’s discussions of comparative size differences of MV. By definition, MV are
least discernable places in a phenomenal order. As the basic units of measure,
all MV have measure one. So every creature’s MV are of the same phenomenal
magnitude [272, 277]. “The visible point of he who has microscopical eyes
will not be greater or less than mine” [116]. Visual systems may differ, nonethe-
less, in the extent of the physical world they can take in at a glance, in the min-
imal area of retinal stimulus capable of triggering a MV, and in the amount
of the retinal surface a visual image of a given size will occupy (with different
conformations of eye [296]).
All claims about phenomenal magnitudes and visual field sizes have to be
understood relative to the conventions of the system of measurement em-
ployed, and as previously noted Berkeley seems sensitive to the issue. Mea-
suring phenomenal magnitude, not by MV, but as a proportion of the entire
visual field, yields different answers to the same questions. If the whole visual
field serves as the metric, then, by definition, visual fields do not differ in mag-
nitude. All visual fields will have the same unit size. In such a system, too, MV
need not be assigned identical magnitudes. The MV of fields composed of dif-
ferent total numbers of MV will occupy different proportions of the entire
field. Also using this metric, loss of retinal function will not diminish visual
field size. It will instead increase the proportional phenomenal size of the
least discernable places.
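The way the two metrics give different answers to the same questions can be shown with a small sketch. The totals assigned to the two visual fields are invented for illustration, as are the function names.

```python
def size_in_mv(n_mv_in_item):
    """MV metric: every MV has measure one, so size is simply a count."""
    return n_mv_in_item

def size_as_fraction(n_mv_in_item, n_mv_in_field):
    """Whole-field metric: the entire visual field has size 1."""
    return n_mv_in_item / n_mv_in_field

# Two hypothetical visual fields differing in their total number of MV.
acute_field = 10_000
dull_field = 2_500

# MV metric: the fields differ in magnitude; a single MV never does.
print(size_in_mv(acute_field), size_in_mv(dull_field), size_in_mv(1))

# Whole-field metric: the fields are equal by definition, while a single
# MV occupies a larger proportion of the field that contains fewer MV.
print(size_as_fraction(acute_field, acute_field),
      size_as_fraction(dull_field, dull_field))
print(size_as_fraction(1, acute_field), size_as_fraction(1, dull_field))
```

Neither set of answers is privileged. Which one is correct depends on the convention adopted for the unit of measure.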
Paradox arises when these and related distinctions are not kept in focus. We
are lured, for example, into thinking there is a real fact of the matter as to
whether the MV of a person and that of a mite have the same phenomenal
magnitude in some more absolute sense. We picture superimposing a MV
from each and then seeing if one appears to extend beyond the other. But
Berkeley argues [272], strictly speaking, this situation is not really imagi-
Making Maximum Sense of “Minimum Sensibile” 47
can be distinguished. First, the spirit’s visual field could be larger or more in-
clusive in the sense that it might take in a greater span of the environment in
one view. Second, the spirit’s visive faculty could be more sensitive, respond-
ing to images of smaller size. Both of these “perfections” might result in the
spirit seeing more MV than we do. Of course, the whole idea of the making of
such measures with spirits is a murky business, and understandably Berkeley
does little more than speculate. And as he remarks in [410], “God knows how
far our knowledge of Intellectual beings may be enlarg’d from the principle.”
In any case, the possibility of these perfections of the visive faculty should
not be confused with the expansion of vision a microscope provides. The lat-
ter does not increase the number of MV experienced.
of mountain than those of the sill.15 And it is the phenomenal visual place or-
der that is claimed to be two-dimensional. This does not mean, however, that
the visual field is physically flat. The visual field has no physical spatial di-
mensions; it is not “an orb, any more than a plain” [204]. (See also NTV 158.)
That we talk of length and breadth in both visible and tangible domains
does not mean they are commensurable in these properties. As Berkeley
indicates, we talk of length in the auditory domain when measuring temporal
spread. And talk of distance whether it be between two points in a line or
as he says in [447], “between a slave & an Emperour, between a Peasant &
Philosopher, between a drachm & a pound, a farthing & a Crown . . .” always
“signifies the number of intermediate ideas” in an order. There is, of course,
an important difference that Berkeley recognizes. Among the sensibilia, visible
and tangible alone have extension and can be ordered with respect to
phenomenal locations. So it is possible to measure place distances and magnitudes
in these orders. Nevertheless, the visible and tangible units of measure
are qualitatively distinct and cannot be combined [70, 295]. Hence
Berkeley maintains there is no inconsistency with his heterogeneity thesis.
(See chapter 4.)
What’s more, Berkeley’s assigning of location to visual places with respect
to height and horizontal direction in the field was common in his day and has
not been abandoned by many of those who seek to describe phenomenal
place orders. The hypothesis, though, is not a priori. It depends, as all such
sensory measures do, on the nature of the stimuli, the workings of the sensory
system, and the individuation and mapping conventions employed. And for
many now, as for Berkeley then, a two-dimensional ordering of phenomenal
place has seemed most plausible, given the foundational empirical claim that
“distance being a line directed endwise to the eye, it projects only one point
in the fund of the eye, which point remains invariably the same, whether the
distance be longer or shorter” (NTV 2).
Notes
* I wish to thank Margaret Atherton and Peter Ross for discussion and comments.
4. See Jesseph 1993 for Berkeley’s problems dealing with these matters.
5. It is important to keep in mind throughout that questions about the structure and
organization of the phenomenal visual field are to be distinguished from questions
about the geometrical properties of the visual world (in other words, the physical envi-
ronment as revealed by vision). See especially the discussion of two-dimensional ver-
sus three-dimensional place orders in the last section of this paper.
6. For alternative readings, see Bracken 1974, Raynor 1980, and Jesseph 1993.
7. Hume, too, uses simultaneity to argue that extension is a property of sight and
touch and only them.
9. See Armstrong 1960, Gray 1978, and Jesseph 1993 for such concerns.
10. Also see [365]. Similarly, note that a mathematical point does not have a specifiable
geometric shape within the system, although two or more points have/determine a
shape. (See Goodman 1977, p. 252.) I am not claiming that Berkeley offered this an-
swer to the shape problem or even considered it. See chapter 4 of this book for a discus-
sion of phenomenal visual shape.
12. One might think here of an analogy with the color dots that constitute a television
display. Although we see a gamut of colors, the actual screen pixels are of just three
hues. None of the compound hues is to be found or seen in any single pixel.
14. Luce (1989) and Falkenstein (1994) offer readings of this quote that differ from
mine. I find theirs less satisfactory because I do not think they can explain either why
the first type of perception is confused or why the second type is a kind of visual exten-
sion, measured in terms of MV rather than MT.
15. Such discrepancies occur whenever we perceive a physical edge. At the visual edge
there are no places between the edge and that which is on the other side of the edge. In
physical space there are physical spaces between them.
References
Berkeley, G. 1989. Philosophical Commentaries. G. Thomas (ed.). New York: Garland Press.
Berkeley, G. 1948. An Essay Towards a New Theory of Vision, in The Works of George Berkeley,
Volume 1. A.A. Luce and T.E. Jessop (eds.). Edinburgh: Thomas Nelson.
Carnap, R. 1928. Der logische Aufbau der Welt. Berlin: Weltkreis Verlag.
Gray, R. 1978. “Berkeley’s Theory of Space.” Journal of the History of Ideas 16, 415–434.
Hatfield, G. & Epstein, W. 1979. “The Sensory Core in the Medieval Foundations of
Early Modern Perceptual Theory.” Isis 70, 363–84.
Hume, D. 2000. A Treatise of Human Nature. D. Norton and M. Norton (eds.). Oxford:
Oxford University Press.
Moked, G. 1988. Particles and Ideas: Bishop Berkeley’s Corpuscularian Philosophy. Oxford:
Clarendon Press.
O’Shaughnessy, B. 1980. The Will: A Dual Aspect Theory, Vol. 1. Cambridge: Cambridge
University Press.
Raynor, D. 1980. “Minima Sensibilia in Berkeley and Hume.” Dialogue 19, 196–200.
This selection offers a new twist on Berkeley’s views concerning common sen-
sibles and the heterogeneity of the senses. An understandable complaint
about this interpretation is that it is not one Berkeley would find palatable. I
am not convinced this is so. I think the analysis makes better sense of his over-
all commitments and theories than more standard readings. I do not doubt,
however, that one can find passages in Berkeley at odds with points in my ac-
count. On the other hand, I believe his heterogeneity arguments are more
consistent and compelling in the interpretation proposed. The aim of this se-
lection, though, is to explain Berkeley’s position, not defend it.
4 Heterogeneity and the Senses*
Sensory Minima
the doctrine arise, and they arise primarily with respect to spatial properties.
The experienced color of a lemon may not be comparable to the experienced
resistance of its surface, but it is maintained that the situation is quite different
when it comes to properties like shape. The visual and tactual experiences
of the lemon, for instance, resemble each other or share the property of having
an ovoid shape. Shape, then, seems to be a clear case of a common sensible.
So, it is argued, neither Berkeley’s claim that the senses have no ideas in common
nor his negative answer to Molyneux is justified.
Number
A more careful examination of NTV, I think, shows that this refutation of the
heterogeneity doctrine moves too quickly. Consideration of Berkeley’s treat-
ment of “number” can help explain why. Berkeley insists that enumeration or
assigning cardinality to things always presupposes a sortal. “We call a win-
dow one, a chimney one, and yet a house, in which there are many windows
and many chimneys, hath an equal right to be called one, and many houses
go to the making of one city” [109]. It makes no sense simply to ask “How
many?” or to compare cardinality without specifying how many of what is
being counted. Moreover, arithmetic operations on numbers are only well
defined where a common unit is assumed. One house plus ten windows does
not sum to eleven.
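The point that counting and arithmetic presuppose a common unit can be modeled by attaching a sortal to every count. The class below is a hypothetical illustration, not a reconstruction of anything in Berkeley; it simply refuses to add counts whose sortals differ.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Count:
    """A cardinality together with the sortal under which it was counted."""
    n: int
    sortal: str

    def __add__(self, other):
        if not isinstance(other, Count):
            return NotImplemented
        if self.sortal != other.sortal:
            # No common unit has been fixed, so the sum is undefined.
            raise TypeError(
                f"cannot add {self.sortal!r} to {other.sortal!r}: no common unit"
            )
        return Count(self.n + other.n, self.sortal)

# Adding windows to windows is well defined.
windows = Count(10, "window") + Count(2, "window")
print(windows)

# Adding one house to ten windows is not.
try:
    Count(1, "house") + Count(10, "window")
except TypeError as err:
    print(err)
```

A sum across sortals becomes available only once a shared unit is chosen, for example by recounting both the house and the windows as "building parts". That choice, as Berkeley stresses, is made by the mind for convenience.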
Now Berkeley is perfectly willing to assign cardinalities to sets of sensory
items. He has no problem with reports that someone experienced two visible
color patches of yellow, two audible sounds of C sharp, and two distinct pres-
sure sensations, at the same or different times. At first blush, then, it might
seem that Berkeley’s heterogeneity thesis faces a difficulty. Experiences from
different senses can share a property, and an abstract property at that. In the
case above, sensory arrays from distinct modalities share cardinality or have
in common the property “two-ness.”
But no amount of blushing will make it plausible that Berkeley would find
such an observation a serious challenge. The fact that experienced ideas in
separate sensory realms share number does not mean that they are qualitatively
alike or resemble each other. As noted, everyone agrees that experiences
of a single yellow patch, a single citrus odor, a single tart taste, and a
single area of felt pressure are qualitatively distinct sorts of sensory experience,
even though each is an instance of “one-ness.” Such similarity of cardinality
is not taken to imply that the idea of one-ness is a common sensible.
There seems little reason to think, however, that the situation changes sig-
nificantly when the sensations in each modality come in pairs or share higher
cardinalities. Number is not the kind of property that characterizes a consti-
tutive attribute of sensations. It is not a dimension along which sensations are
compared and ordered when characterizing and mapping their experiential
qualities within a sense realm.
As Berkeley says, number depends on the mind making a “perfectly arbi-
trary” choice of the units of enumeration, and this choice is constrained only
by considerations of what is “most convenient” for the task at hand [109]. Al-
though experiences of sight and touch can match in number, the sensations
constitute different sensory domains. In an ordering of color (such as the color
sphere) there are no pressure sensations any more than there are sounds. It
does not make sense to add two color patches to two pain sensations or to two
C sharps, since the sortals are different. Arithmetic operations can only be
employed when a common unit is set. Still, certain comparisons of number
within or across sense realms may have uses. Faced with the need to count the
number of items in one sense domain by means of markers in another, it
would be (psychologically) natural to correlate two color patches with two
sounds, three color patches with three sounds, and so forth. Thus it may be
said that two C sharps are fitter to represent two yellow patches than one, three,
or some other number of C sharps. But there is no necessary connection be-
tween seeing yellow patches and hearing C sharps, and there is no way reason
alone can deduce cardinality assignments in one realm from those in another.
Perhaps number is special, since it is not strictly speaking a sensory prop-
erty. The fact that the argument against the heterogeneity doctrine does not
go through with number does not preclude there being other common prop-
erties that are relevant. Indeed, challenges to heterogeneity usually focus on
spatial properties, such as distance, size, and shape.
marked out by the number of interjacent visible points: If they are tangible,
the distance between them is a line consisting of tangible points; but if they
are one tangible and the other visible, the distance between them doth neither
consist of points perceivable by sight nor by touch, i.e. it is utterly
inconceivable” [112]. Berkeley’s idea of distance is unambiguous and generic.
Distance is the number of points or places between two points in a phenom-
enal order. So defined, the predicate ‘distance’ can be applied to both sight
and touch. Ideas of sight and ideas of touch, nonetheless, are heterogeneous
and incommensurable. Adding minima visibilia (MV) to minima tangibilia
(MT) is inconceivable. There is no way to sum or apply mathematical opera-
tions to different unit measures.
Berkeley, in fact, has a quite sophisticated conception of order and measure
in sensory realms. His idea of phenomenal distance is not specifically limited
to sight and touch. Were a new sense modality to turn up and have its own
phenomenal extension, his definition of distance would be applicable. In ad-
dition, for Berkeley the concept of distance applies not only between places
in extension but to other phenomenal orders. Two colors, for example, can be
measured for the distance between them in an ordering of colors (such as the
standard color sphere mapping of color experience). Berkeley also notes that
his abstract idea of distance can be applied to non-sensory orders. He says in
Philosophical Commentaries [447], “A line in abstract or distance is the number
of points between two points. There is also distance between a Slave & an Em-
perour, between a Peasant & Philosopher, between a drachm & a pound, a
farthing & a Crown etc in all which distance signifies the number of interme-
diate points.”
It should be apparent, though, that distance in phenomenal extension is
just a particular case of magnitude measurement. It is the size between two
points in an array. (See Schwartz 1994 for implications.) Since magnitude is
the more general concept, it will simplify discussion to focus on it. Suppose, as
Berkeley proposes, visual size and tangible size can in principle be measured,
employing as units the minima sensibilia (MS) characteristic of each sense.
Parts and wholes of visual arrays are measured in terms of the number of minima
visibilia. This array may contain 200 MV and that one 400 MV, and the total
combined visual size is 600 MV. Similarly, the size of tactile arrays can be
tallied in terms of the number of minima tangibilia that compose them. It is
most important to keep in mind throughout this discussion that we are talk-
ing about phenomenal size, a magnitude measure of sensory experiences.
These size measures are not properties of the physical objects that may be
Heterogeneity and the Senses 61
their source.5 Berkeley takes pains, for example, to remind us that a physical
inch has no single visual size [61]. Up close it may occupy the entire visual
field. As it moves away, its visual size diminishes. Eventually the inch-long ob-
ject can no longer be seen. It has no presence in visual experience. No visual
field places are occupied.6 By contrast, viewed under a microscope a small seg-
ment of an inch may occupy the entire visual field.
Although the magnitude of visual arrays and tactual arrays can both be
measured in terms of their respective minima sensibilia (that is, MV and MT),
so enumerated their sizes are incommensurable. They are not amenable to
arithmetic operations: 200 MV plus 200 MT do not sum. Most significantly,
it is incorrect to assume that a visible size of 200 MV is equal or equivalent to
a tangible size of 200 MT. It is meaningless to assign a phenomenal visual size
measured in MV to a tangible array measured in MT, and vice versa. There is
no common experiential field or area of phenomenal place that both 200 MV
and 200 MT can coherently be said to occupy to the same extent.
Berkeley makes his views about the connection between units and hetero-
geneity quite clear in [131]. It is “an axiom universally received that quanti-
ties of the same kind may be added together and make one intire sum. . . .
kinds of quantity being thought incapable of any such mutual addition, and
consequently of being compared together in the several ways of proportion,
are . . . esteemed intirely disparate and heterogeneous. . . . Now let anyone
try in his thoughts to add a visible line or surface to a tangible line or surface
so as to conceive them making one continued sum or whole. He that can do
this may think them homogeneous: but he that cannot, must by the forego-
ing axiom, think them heterogeneous.”
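The role units play in this argument can be illustrated with a small sketch. The Python fragment below is only an illustration under stated assumptions, not anything found in Berkeley's text: the class name Magnitude, the helper commensurable, and the string labels "MV" and "MT" are all hypothetical names introduced here. It models a phenomenal size as a count tagged with its unit and permits addition only between quantities of the same kind.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Magnitude:
    """A phenomenal size: a count of minima tagged with its unit (hypothetical model)."""
    count: int
    unit: str  # "MV" for minima visibilia, "MT" for minima tangibilia

    def __add__(self, other: "Magnitude") -> "Magnitude":
        # Quantities of the same kind may be added together into one sum.
        if not isinstance(other, Magnitude):
            return NotImplemented
        if self.unit != other.unit:
            # Heterogeneous kinds admit no mutual addition.
            raise TypeError(f"cannot add {self.unit} to {other.unit}")
        return Magnitude(self.count + other.count, self.unit)


def commensurable(a: Magnitude, b: Magnitude) -> bool:
    """True only when both magnitudes are measured in the same unit."""
    return a.unit == b.unit


# Two visual arrays combine into a single visual sum of 600 MV.
total = Magnitude(200, "MV") + Magnitude(400, "MV")

# A visual size and a tangible size do not make "one continued sum or whole".
try:
    Magnitude(200, "MV") + Magnitude(200, "MT")
    mixed_sum_defined = True
except TypeError:
    mixed_sum_defined = False
```

One simplification should be noted. The sketch reports 200 MV and 200 MT as unequal, whereas the claim in the text is stronger: the comparison itself has no phenomenal meaning.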
When the unit used to measure both sense realms is the more general sortal,
minimum sensible, the situation is different. One may add 200 MS (visibilia)
and 200 MS (tangibilia), but the 400 MS total does not characterize size
in either sensory modality. In fact, 400 MS is not the measure of the experi-
enced magnitude of an array in any sensory order. It is not “one continued
sum or whole.” Thus it is questionable what use there would be for such a
tally. Of course, it is possible in principle to compare the number of MS in a
visibile to the number of MS in a tangibile and conclude that the arrays contain
the same or a different number of MS. And were there a need to keep tabs
on the size of tangible arrays using items from the visual field, it would
undoubtedly be more convenient to have larger arrays of MV represent larger
arrays of MT. It would be simpler, and perhaps even more useful, if ratio prop-
erties of the orderings are also preserved. For example, if one tangible array
62 Berkeleian View of Vision
is twice the size of a second, the size of the visible arrays representing them
should also be in a two to one ratio. These schemes would be pragmatically
fitter than arbitrary correlations. But “fitter” is not meant to imply that 200
MV resemble, match up better, or have a necessary connection to 200 MT. Like-
wise, preservation of ratio relations between these magnitude measures does
not show that arrays of sight and those of touch phenomenally resemble one
another. The two extensions are incommensurable.
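The sense in which a ratio-preserving scheme is "fitter" without being uniquely correct can also be sketched. The following Python fragment is an assumption-laden illustration: the function names represent and ratios, the parameter mv_per_mt, and the sample sizes are hypothetical and chosen only for the example. It shows two different conventions for marking tangible sizes with visible arrays, each of which preserves the ratios among the sizes.

```python
from fractions import Fraction


def represent(tangible_sizes, mv_per_mt):
    """Assign each tangible size (in MT) a visible marker size (in MV).

    mv_per_mt is a conventional scale chosen for convenience; nothing in
    either sense realm fixes its value.
    """
    return [Fraction(size) * mv_per_mt for size in tangible_sizes]


def ratios(sizes):
    """Ratio of each size to the first, the structure the scheme preserves."""
    return [Fraction(s) / Fraction(sizes[0]) for s in sizes]


tangible = [100, 200, 400]                      # sizes in MT
scheme_a = represent(tangible, Fraction(1))     # markers of 100, 200, 400 MV
scheme_b = represent(tangible, Fraction(3, 2))  # markers of 150, 300, 600 MV

# Both conventions keep the two-to-one ratios intact.
same_ratios = ratios(tangible) == ratios(scheme_a) == ratios(scheme_b)
```

Since both schemes preserve the ratio structure equally well, nothing in the sketch privileges pairing 200 MT with 200 MV rather than with 300 MV. The choice between them is one of convenience.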
In turn, the length or size of a physical object can not be determined by
summing the MS from both sense realms. Adding the MV of the experience of
the right half of a physical rod to the MT experience of its left half does not
give a coherent measure of its length [131]. Nor is there any reason to assume
that the number of MV will be the same as the number of MT. Although the
number of MT of a physical inch may be fixed, there is no unique number of
MV that can be assigned to that physical length. Depending on the viewing
distance and angle, the visually experienced inch may occupy the whole vi-
sual field, a single point, or any number of MV in between these extremes.
Shape
Berkeley’s denial that shape is common to vision and touch and his negative
answer to the Molyneux question are usually thought to be the least tenable
strands of his heterogeneity doctrine. Although comparisons of number,
either as measures of cardinality or size, may not be relevant dimensions
along which to evaluate the thesis, properties of shape seem to be another
matter. From Berkeley’s perspective, however, the difference between shape
properties and size properties is not one to challenge his heterogeneity doc-
trine. For Berkeley correctly points out that “figure is the termination of mag-
nitude” [105, 124].
In principle, shapes can be defined according to the distribution of relative
sizes fixing figure boundaries. For example, a phenomenal array that is both
closed and bounded by three straight lines is triangular. And if the array is visual,
it is then a visual triangle. A visual shape property, of course, can not be equated
with any single set of visual magnitudes. Shape is a structural feature of an array.
Visual arrays having different overall magnitudes can share shape. The same is
true of physical shapes; they too are structural properties and come in all sizes.7
Experienced visual shape does alter with tilt of the physical object off
the fronto-parallel plane or with changes in the observer’s angle of regard.
A Puzzle
whether sounds, tastes, and smells are inseparable from extension, since the
experiences in these domains simply have no phenomenal place dimensions.
Berkeley does maintain, and insistently so, that we cannot make sense of an
abstract idea of extension. We can not experience or imagine a visible or tan-
gible extension as it is, bereft of all other sense qualities. There are no such
items as property-less places for the term “extension” to denote. Extension
understood so as to apply to places having no sensible qualities is unintelli-
gible and unimaginable, just as applying the concept “triangle” to a figure
that is not scalene, isosceles, or of any other determinate triangular shape is
incomprehensible. The generic ideas “extension” and “triangle” can be prop-
erly used to describe and denote actual experiences of both sight and touch.
It is when these ideas are employed too abstractly that the terms “extension”
and “triangle” are devoid of empirical content and cognitive meaning.
Conclusion
Berkeley, like most others, assumes that the experiences of each modality are
qualitatively distinct. Sensations of sight, sound, touch, smell, and taste are
not at all like one another phenomenally. Hence, cross-modal linkages can
not be explained in terms of similarity or resemblance of qualities. He also
thinks it implausible that the appropriate connections could be established
by reason. No amount of thinking about the smell of an item will enable you
to determine in advance what the phenomenal experiences of color, taste,
sound, and resistance it affords will be like. Sight and touch, though, both
have place qualities, and each can be ordered with respect to its own place
locations. So it may seem, and did seem, obvious to many that nothing should
prevent shape ideas from being common sensibles. According to Berkeley, the
problem with this suggestion is that experienced visual extension itself is not
phenomenally like experienced tangible extension. Although both are un-
ambiguously called extensions, the extensions are incommensurable. They
cannot be combined, and arithmetic or geometric operations that attempt
to do so make no sense. There is no common unit of pure extension that can
serve to measure, compare, or unite visible and tangible extensions. Indeed,
the very idea of extension as it is, devoid of any of its accompanying visual
or tangible qualities, is incoherent. It presumes the very kind of abstraction
Berkeley claims is unimaginable.
Notes
* All bracketed section references are to Berkeley’s New Theory of Vision. I wish to thank
Margaret Atherton and Laura Berchielli for comments on an earlier draft. I have also
benefitted from reading some unpublished work of Martha Bolton on these issues.
1. See M. Wilson, “The Issue of Common Sensibles in Berkeley’s New Theory of Vision”
in Ideas and Mechanisms, pp. 257–75. Princeton: Princeton University Press, 1999, and
L. Falkenstein “Intuition and Constructivism in Berkeley’s Account of Visual Space.”
Journal of the History of Ideas 32, 1994, pp. 63–84.
2. See G. Evans, “Molyneux’s Question” in Collected Papers, pp. 364–99, Oxford: Oxford
University Press, 1985.
3. Elsewhere in statements of his thesis, Berkeley replaces the term “idea” with the
expression “sensory idea.”
4. For many theorists, including Berkeley, differences in their qualities are the basis
for individuating sensory modalities. This topic is explored in a number of papers
in Perception.
5. I leave at present the issue whether physical objects and properties are to be identi-
fied with tangible experiences as Berkeley tends to do in the NTV or whether the notion
of a “physical object” is better understood as a composite of experiential material from
all sense domains as Berkeley seems to hold in his later, more explicitly Idealist, works.
6. Note that physical places on the retina may be occupied, but the stimuli may not be
of sufficient size or strength to trigger visual sensations.
7. I avoid the further complications that arise in the case of shapes that cannot be spec-
ified by a single structural analysis. For example, structurally different arrays may all
fall under the concept of “the letter A.”
10. After all, we can talk separately and meaningfully of brightness, saturation, and
hue although no color can be experienced without all three.
11. This sort of conflation is surely one reason people so readily assume, as mentioned
earlier, that the ovoid shape of a lemon must be a common sensible.
12. Chapter 5 spells out the implications this essay has for understanding Berkeley’s
answer to the Molyneux question and “man born blind” thought experiments.
Prescript 5
NTV. Prior to his discussion of shape, Berkeley considers the perception of dis-
tance, magnitude, and orientation, and he appeals to MBB tests in each. In
these cases, the cautionary “with certainty” does not qualify his predictions.
He says in [41] that the MBB’s inability to perceive distance on gaining sight
“is manifest.” In [79], he asserts that “we may safely deduce” that a MBB will
initially fail in his attempts to judge the magnitude of objects placed before
him. And in his account of orientation, Berkeley claims “it plainly follows”
that the MBB “would not at first sight think that anything he saw was high or
low, erect or inverted” [95].
Exploring Berkeley’s treatment of these other MBB thought experiments, I
believe, provides important context for understanding his No answer to the
question Molyneux poses. For it is most unlikely that interpretations and
criticisms of Berkeley peculiar to his treatment of shape perception can get to
the heart of his views. Berkeley’s account of figure is part and parcel of his
overall theory of spatial perception and must find a place within it. Paying at-
tention to the full range of MBB thought experiments in NTV can also help
explain why Berkeley is guarded in his answer to Molyneux.
Throughout the NTV, Berkeley tends to take it for granted that if a connection
between ideas is not necessary, it must be learned, and vice versa. Without this
assumption, the probative value of empirical evidence resulting from MBB
experiments is dubious. Yet both opponents and supporters of Berkeley’s the-
ory of vision have held that this critical assumption is not correct.
Many agree with Leibniz, who argues against Locke that ideas may be neces-
sarily connected without reason being aware or immediately able to appreciate
that they are. It can take some thought to figure things out. Mach challenges
the significance of negative Molyneux findings along different lines.1 He
points out that both humans and animals often are unable to recognize two
presentations of a shape as the same if the figure is experienced in different ori-
entations. For instance, people are frequently unaware that the diamond shape
they perceive is a square rotated 45 degrees. Hence, Mach argues, mere failure
to appreciate shape identity does not support a strong heterogeneity doctrine.
Alternatively, Mill defends Berkeley from critics who say that his theory of
vision is refuted by empirical evidence concerning animals, and perhaps ac-
tual MBB experiments. Mill argues that it is not damaging to Berkeley’s over-
What Berkeley Sees in the Man Born Blind 73
all thesis that the newly sighted may be able to navigate the environment
without prior experience. After all, a sound might be innately set to trigger an
experience of fear, although the experiences of sound and fear are not alike
and have no necessary connection. Correlations of very distinct ideas can be
wired in at birth, and Mill suggests that the proper explanation of evidence
conflicting with Berkeley’s predictions could be that the correlations are in-
nate. That aspects of sight and touch are correlated at birth does not show
that spatial ideas of the two senses are similar or related by necessity.2
It is not surprising Berkeley did not contemplate the evolutionary possibil-
ity that the experience and fate of past generations can alter the capacities of
their descendants. On the other hand, Berkeley is not in a position to rule out,
a priori, the possibility of these sorts of innate linkages. God could have set
things up so that the language of nature is not only uniform in all environ-
ments, but is given to everyone as a birthright.3 As the history of MBB exper-
iments indicates, though, Berkeley was not the only one to run together issues
of innateness with claims of heterogeneity.
Initial Experience
In [130] Berkeley says “in a strict sense, I see nothing but light and colours
with their several shades and variations.” He says similar things in other
places, sometimes substituting “immediately see” for “in a strict sense see.”
These statements can encourage the view that Berkeley held that, at least initially,
the visual field is without internal organization or that the structure it does
have can not be appreciated. On these assumptions, it would be impossible
for the MBB to judge or navigate his environment on first gaining sight, thus
explaining Berkeley’s negative answers to all the MBB thought experiments.
There are a number of reasons why I do not think this is the correct under-
standing of Berkeley’s position: (1) Berkeley never explicitly says that visual
extension is unorganized or its organizational features inaccessible at any
stage of development, and I do not believe quotes like [130] indicate that he
endorses such positions. (2) The assumption that the visual field of the MBB
(or a newborn) on gaining sight is unorganized or its ordering of no useful import
does not accord well with Berkeley’s and other visual theorists’ characterization
of the problems of spatial perception. Nor would such an explanation of
the MBB’s failure help Berkeley support his own account of these issues. (3) It
does not explain why Berkeley is more reticent in the case of shape than in his
answers to the MBB thought experiments for distance, magnitude, and ori-
entation. If at test time the MBB’s visual field is without discernable structure,
why should Berkeley be more cautious about figure than he is with other spatial
properties? I assume objection (3) needs no defense; (1) and (2) do, and I
will address each in turn.4
Immediate Perception
Although Berkeley does say in several places that all we immediately see is
light and color, in other passages he is not so limiting in his characterization
of immediate perception. For example, in Theory of Vision Vindicated [TTV,
44] he maintains, “The proper immediate object of vision is light, in all its
modes and variations, various colours in kind, in degree, in quantity; some
lively, others faint; more of some and less of others; various in their bounds or
limits; various in their order and situation.” (Emphasis added.) Later he explains,
“These immediate objects [of sight] are the pictures. These pictures are some
more lively, others more faint. Some are higher, others are lower in their own
order or peculiar location . . .” [TTV 54, emphasis added]. What’s more, there is
a perfectly good interpretation of statements like [130] that does not have the
implication that the visual field is initially, or for that matter ever is, without
appreciable phenomenal order.
In discussing the nature and function of sensory systems it was quite cus-
tomary (and to some extent remains so) to individuate modalities in terms of
the qualities they present. Strictly speaking, the phenomenal product or object
of our auditory system is sound in all its variations (loudness and timbre);
that of the palate is taste, that of olfaction is smell, that of touch is pressure,
and that of sight is light and color. Theorists from ancient times on, includ-
ing those committed to common sensibles, were quite willing to characterize
the immediate objects of perception in just this way. Light and color are the
experiential objects or qualities that constitute and differentiate the sensory
domain of vision.5
There is nothing, however, in this standard specification of the proper ob-
jects of the modalities that precludes the products of sense from having an ex-
perienced internal phenomenal ordering. In particular, it does not mean that
visual extension and tangible extension, of either the MBB or infants, are orig-
inally without useful structure. Indeed, Berkeley would be especially hard put
to get his motor theory of vision off the ground if the fact that felt pressure is
the proper object of touch means that tangible experiences are initially unor-
ganized and bear no place relations to each other. Berkeley does maintain that
vision and touch are special in having phenomenal place orders. Other sense
organs may be employed to evaluate spatial relations indirectly, but these
modalities, unlike sight and touch, do not have extensions of their own.
I believe an ambiguity in the notion of “strictly see” or “immediately see”
is a source of some of the confusion in discussions of this issue. Presented with
a stimulus that triggers a circular yellow visual array, people who do not have
the concept “circular” will not judge or describe the array as circular, and they
may have no reason to segregate the circular array from adjacent parts of the
visual field. Nonetheless, if all points on the perimeter of a solid array of
yellow are phenomenally equidistant from a point in the center, the yellow
patch has a circular shape in visual extension.6 We see a circular array, al-
though we do not see it as being circular and may have no reason to separate
or discriminate the figure from its phenomenal surroundings. Failure of the
subject to conceptualize the array as a circle does not prevent figure/surround
type descriptions from being applied to the visual field.7 In addition, if asked
or tested, a subject may have no difficulty distinguishing the yellow colored
array from, say, the black array that borders it.
Spatial Perception
pothesis. Although the moon looks bigger on the horizon than at its zenith,
Berkeley insists that what is immediately seen is the same size in both loca-
tions. The sensations that prompt the illusion do not change in magnitude,
because the size of the retinal image the moon projects remains constant. The
moon illusion is a perception. We read through the constant sensation to an
illusory perception.
However other theorists conceive the MBB’s initial visual experiences, Berke-
ley assumes they bear a proportional relation to retinal image stimuli. The
MBB’s task is not conceived to be a practical impossibility, as it would were the
MBB unable to tap any structural features of visual extension. Berkeley’s argument
is on a more theoretical plane. In the Molyneux experiment the question
put to the MBB is not “Do you discern any pattern at all in your visual
experience?” Instead, he is asked whether he can see “which is the globe,
which the cube” [132]. (The challenge for the MBB is to determine which array
in his visual field is of the tangible globe and which of the tangible cube.) The
question seems to suggest that the MBB gives some content to the demonstra-
tive elements embodied in the asking. True, Berkeley does say that on gaining
sight the MBB is likely to be somewhat baffled. He attributes this to two fea-
tures of his test situation. First, Berkeley believes that initially the MBB would
not perceive anything as being anywhere but in his own mind. Second, the
MBB will not have any good reason to separate or draw figure/surround bound-
aries one way or another. Nothing in principle, though, prevents their being
salient. Berkeley does not base his MBB predictions on these factors, and re-
moving such sources of confusion will not ensure passing the test.
More significantly, should the MBB’s failures be due to either confusion or
a lack of order in his visual field, the MBB thought experiments would be of
less use to Berkeley. Both Berkeley and his critics agree that the MBB will ac-
quire the visual ability to discern physical figures and will adopt the standard
spatial vocabulary to describe them. The difference is that one party to the de-
bate attributes these accomplishments to resemblances or necessary connec-
tions. Their Berkeleian opponents reject phenomenal similarity or necessary
connections as the explanation.
Berkeley realizes people have strong intuitions that visual figures can and
do resemble their tangible figure counterparts, and he understands the rea-
sons for their view. Acceptance of the constancy hypothesis promotes the at-
titude, as does the fact that we automatically read through visual sensations
to their tangible meanings. The use of the same terms to describe properties
in both sense realms also has a major influence. And a penchant for con-
fusing the visual perception of tactual exploration of space with the tan-
gible sensations experienced during tactile exploration is another source of
the conflation.
Berkeley sees the need to address these mistaken views. His goal is to show
that visual and tangible shape experiences are distinct in spite of the fact that
arrays of visible extension and tangible extension have discernable figures.
Once the visual and tangible realms become correlated, however, it is more
difficult to appreciate that they are related neither by resemblance nor by reason.
The MBB thought experiments are meant to help overcome these prejudices
that come along with the acquisition of visual skill and linguistic sophistica-
tion. But it is important to keep in mind that Berkeley’s ultimate goal is to
prove that, despite indications and intuitions to the contrary, sight and touch
are always heterogeneous. They remain distinct after, as well as before, an infant
or MBB coordinates visual and tangible extensions and acquires visual
skill. The MBB experiments are germane to this overarching goal only on the
assumption that what the MBB immediately sees is essentially the same as
what the sighted strictly see.11
Learned Organization
Of course, this conception of the problem of spatial perception does not, by it-
self, rule out the possibility that visual extension initially has no (appreciable)
structure. The MBB (or newborn) may first have to put visual extension into a
usable form. Only after this has been accomplished can learning of sight and
touch correlations take place.12 Although this developmental scenario is a
possibility, it is not one that Berkeley could readily accept. Berkeley’s and his
opponents’ descriptions of the thought experiments require that the MBB make
his judgments on first gaining sight. There is no time available for the postu-
lated internal organizational process to occur. Moreover, were this objection
finessed, another puzzle arises. The only resource that seems available for the
MBB to use in putting his visual field in order is correlating it with touch. But
if this is the story, it then becomes questionable whether the various feats of
associative learning Berkeley says the MBB must undertake would be needed.
Work that Berkeley says lies in the MBB’s future would be accomplished as a
result of bringing this initial structure to his visual field. This last point may
be more transparent in the following discussion of perceptual orientation.
For centuries, attempts to determine the physical optics of vision were stymied
because the retinal image is inverted. This was assumed untenable, since the
world does not visually appear upside down. Once Kepler convinced the sci-
entific community that retinal inversion is actually the correct account of the
optics, theorists felt an urgent need to explain how it is, then, that we see
things upright. Vision scientists devoted much time and effort attempting to
find the answer.
Berkeley examines the inverted image puzzle in the sections of the NTV de-
voted to the perception of orientation. He says that understanding his views
on this topic is key to understanding his theory of spatial perception in gen-
eral. Berkeley’s celebrated proposal for dealing with the inverted image puzzle
is to claim that it is bogus. The assumption that the retinal image must some-
how be re-inverted is misguided. It is another case where a conflation of visual
and tangible extensions hampers appreciation of the actual situation. With
proper attention to these matters, the inversion puzzle cannot get off the
ground. The retinal image, being a physical display, is inverted with respect to
our physical body. So Berkeley claims it makes no sense to compare the direc-
tion of the tangible retinal image with arrays in the phenomenal visual field.
Therefore, there is nothing to reconcile.13
Visual extension and visual arrays do not have any location or orientation
in environmental space, neither at the start nor later in life. It is simply a con-
fusion to imagine that the extensions of the two sensory realms are continu-
ous, contiguous, or can share a phenomenal space. It is impossible to combine,
superimpose, or align visual and tangible arrays and compare their relative
orientation. The visual field does not sit atop a background of physical space
that either can determine or provide a fixed point to set its direction. Visual
field arrays have no physical orientation whatsoever. Visual legs are next to
visual earth, but this nextness ordering can not be characterized in terms of
the physical properties of right, left, up, or down.
Explaining how the visible and tangible realms become coordinated re-
mains a genuine problem. It is a problem, however, that arises independently
of the optical inversion of the image on the retina. Berkeley himself has a
story to tell about how vision and touch become coordinated. The correla-
tions are learned.14 Neither a newborn nor the MBB could at first judge the
environmental orientation of what they initially see.
Berkeley has no qualms accepting the idea that on gaining sight, the MBB
immediately sees what those with developed visual skills in a strict sense see.
He never says otherwise, and his talk of relations among visual legs, heads,
earth, and sky assumes this is so. Once again, Berkeley’s argument is that in
spite of having their own directional orderings, visual and tangible extensions
are incommensurable. Berkeley’s position is sometimes obscured by his claim
that the MBB could not use number information to aid his cause—for ex-
ample, that two visible legs go with two tangible legs. His point here is that
cardinality measures presuppose a unit of counting. The question “How
many?” cannot stand alone. As he says in NTV, a window, a chimney, a house,
and a city may each be called one, and a picture surface may feel like a single
uniform surface, yet contain many painted shapes in many colors. (See chap-
ter 4.) The MBB, however, has no basis for segregating leg-shaped visual arrays
from the rest of his visual field and no inclination to use “a visual leg-shaped
figure” as a unit of measure.15
If the MBB is assumed to confront the orientation test with an organized
visual field on hand, there could be only two explanations for this initial or-
ganization. Berkeley’s choice among them is clear. The order of visual exten-
sion, like other inherent orderings of sensations, is fixed by the nature of the
sense organs. The alternative account—that useful structure is acquired—is
not a viable option for Berkeley. The MBB has no time to accomplish the task
prior to his gaining sight. And if this objection is skirted, a puzzle still re-
mains. The initial ordering of visual extension would have to be achieved via
correlations with motion and touch. But once these visual and tangible con-
nections are on hand, central aspects of physical directionality would be too.
Thus the MBB would have already acquired directional skills that Berkeley
says he still needs to acquire.16
A Non-Berkeleian Resolution
Gareth Evans’s essay “Molyneux’s Question” is one of the most discussed ar-
ticles on the topic.17 Evans’s paper provides an excellent overview and com-
mentary on assorted versions of the problem and attempts to solve them. He
separates Berkeley’s heterogeneity thesis from claims of innateness and he as-
sumes, with Berkeley, that the blind can have an idea of space as a simultane-
ous whole. He also assumes that at the time of testing the MBB can experience
visual figure, and that figure/surround difficulties, if present, are not the central
issue. According to Evans, the position of his representative Berkeleian,
What Berkeley Sees in the Man Born Blind 81
“B,” is that in spite of the MBB being able to appreciate shapes in visual exten-
sion, he will fail.
After this ground-clearing, Evans goes on to argue that the best way to
bring Berkeley’s real concerns into focus is to reformulate Molyneux’s ques-
tion along the following line: Could a person master shape concepts in the
tangible domain, yet fail to be able to apply them to shapes found in visual ex-
perience?18 According to Evans, if the answer is yes, Berkeley’s position is sus-
tained. If it is no, Berkeley’s negative response to Molyneux is a mistake.
Evans’s argument, in the end, is to challenge the coherence of the claim
that the MBB can be said to see visual figure, yet cannot apply tangible shape
ideas to certain figure-relevant features of his visual arrays. There is, Evans ar-
gues, a conceptual connection between the ability to orient in physical space
and the mastery of visual shape concepts. In particular, upon gaining sight
the MBB cannot be said to appreciate visual figure, unless his new sight expe-
riences are coordinated with appropriate behavioral dispositions or informa-
tion about direction in his immediate physical environment. Without such
visual and behavioral correlations, Evans maintains, the idea that the MBB
has experiences of visual figure is otiose. Evans’s answer to his own version of
the Molyneux question is that Berkeley is not entitled to assume that the MBB
has experiences of visual figure without also admitting that the MBB can as-
sign tangible spatial direction to the visual shape boundaries. This suppos-
edly raises a problem, because Evans is convinced Berkeley does assume that
on gaining sight the MBB experiences visual figure.
Evans, however, does not challenge Berkeley’s full-blown theory of spatial
perception. He allows that it is not necessary for specific distances or depth re-
lations to be in place in order to attribute concepts of visual figure. The newly
sighted MBB may, as Berkeley claims, lack the ability to judge spatial distance
or depth by sight. So the MBB may not actually be aware that the boundary
points of an experienced visual figure lie on a single plane in physical space.
To experience visual shape in Evans’s minimal way, it is only necessary to be
able to assign visual arrays appropriate egocentric direction. The perception
of visual shape requires encoding or representing the egocentric direction of
boundary points in the visual field. Such appreciation of direction in behav-
ioral space, he maintains, is constitutive of the very notion of having visual
shape experience.
Evans offers an analogy. Consider, he says, what it would mean to attribute
mastery of auditory concepts of spatial properties. The test would be whether
the person can employ experiences of sound to guide behavior. The person
A Berkeleian Response
the following proposal for coping with the inverted image problem Kepler ex-
posed. The initial visual experience of infants or the MBB has things looking
upside down, and spatial behavior is ill-suited to the environment. Subse-
quent experience establishes visual/tangible correlations that provide the
wherewithal both to invert the way things look and navigate space success-
fully. The visual field has structure from the start—visual legs on the ground,
visual head skyward; nonetheless, initially behavioral responses will be mis-
guided. Studies of people wearing glasses that invert the visual image on the
retina do indicate that something like this is what happens when they are
first put on.
In contrast, behavior may be appropriate to egocentric space, although
visual experience does not jibe with the physical layout. For example, have
someone move her hand up and down the edge of a door. While doing this,
have her don glasses that curve the image on the retina (a straight line proj-
ects a C shape on the retina). Often a subject can continue to move her hand
according to instructions, keeping in touch with the straight door edge, yet
she will report that the door edge looks visually curved. Moreover, sight tends
to dominate touch, and subjects report that their hand tangibly feels like it is
moving along a curved path.
An examination of the literature on perceptual adaptation reveals a host
of fascinating phenomena that are hard to describe, let alone explain. Might
such mismatches between behavior and visual phenomena cause problems
for Evans’s conceptual connection claim? I am not sure. Evans is aware of such
psychological studies of perceptual adaptation and the empirical and theoretical
puzzles they raise.19 Evans acknowledges, too, that the issues need more
study. Lacking a fuller statement of Evans’s position on adaptation, I am re-
luctant to push the argument further.
Finally, it is worth noting that Evans’s solution to the Molyneux problem
does not dispute Berkeley’s claims about seeing distance and size. Experiencing
figure in Evans’s sense does not require getting these spatial properties right.
So questions arise whether Evans’s account of figure can be applied to other
aspects of spatial perception and to the other MBB cases Berkeley discusses.
Why is it, though, that Berkeley is so willing to believe that the MBB does
experience figured visual arrays? I think the answer lies in Berkeley’s accept-
ance of a version of the constancy hypothesis. Everyday experience and sci-
entific study seem to reveal that there is a proportionality between features of
the retinal image and features of visual experience. Give or take a little, if the
tangible image projected on the retina is straight, the visual array experienced
is a straight line in visual extension. If the retinal image is curved, the visual
array shape changes accordingly. These properties of sensations, Berkeley as-
sumes, are fixed by the sensory system. So if the MBB’s visual system is at the
start in normal working order, Berkeley does not feel it necessary to defend
the claim that the MBB can immediately experience a figured phenomenal
visual field.
My Picture
I also think Evans’s reformulated version of the Molyneux question does not
capture what is primarily at stake for Berkeley. Recall, in my interpretation,
Berkeley can and should accommodate the possibility that the MBB, prior
to being tested, may have generic figure ideas that apply to sight and touch.
In principle, then, the MBB on first gaining sight might be able to apply shape
terms to arrays in both modalities. Nevertheless, figures in the two senses are
experienced as phenomenally distinct sensory ideas. Conflation of visual
experience with tangible experience often misleads. It is very easy to fall into
the trap of taking the comparison of two visual experiences for a comparison
between a visual and tangible experience. We fail to distinguish properly the
visual experience of tangible movement from the tangible sensory content
of the movement itself. For example, we observe someone, perhaps ourself,
running a hand around the perimeter of a dinner plate. We notice the path
the hand takes is circular and conclude that the tangible and visible experi-
ences are qualitatively alike. But this is a conflation. We are not actually com-
paring visual experience to tangible experience. We are comparing visual
experience of a circular object with the visual experience of a hand tracing the
object’s perimeter.
Should the MBB possess generic ideas of phenomenal shape, as argued
above that he may, his passing the Molyneux test can not be ruled out with
certainty. His judgments, though, will depend on considerations of fitness,
not resemblance or necessary connections. “Square,” “circle,” and other ideas
of figure can be given generic definitions that make them conceptually appli-
cable to sight and touch. If the MBB pays attention to these abstract ideas,
they can influence his psychological intuitions of fitness. Two arrays that fall
under the same label may seem more suited to one another than arrays that
Berkeley’s Reticence
I have indicated why Berkeley has reason to be somewhat reticent in his an-
swer to Molyneux. But why is Berkeley not similarly cautious in his other
MBB predictions? I think an explanation of the difference can be found in a
distinction between shape concepts and concepts of distance and magni-
tude, mentioned in chapter 4. Figure is a structural property. Distance and
magnitude, per se, are not. Structural properties of arrays, though, may aid in
cross-modal tasks.
Berkeley says that a visible square may be fitter than a visible circle to rep-
resent a tangible square. It is fitter, because the generic definitions of “square”
and “circle” apply to arrays in both domains, and relations among their parts
are structurally akin. Nevertheless, phenomenal square experiences of vision
(color and light) and phenomenal square experiences of touch (pressure)
neither resemble nor are necessarily connected. They are incommensu-
rable. Square visual arrays can not be moved next to square tangible arrays
and compared for shape. We have no idea in either thought or imagina-
tion what it would be to experience a unified figure combining them both. It
is inconceivable.
On the other hand, distance and magnitude in visual and tangible arrays
are not structural properties. The “one point argument” [2] entails that a dis-
tance in visual extension can be a reflection of any distance in physical space.
Similarly, there is no fixed correlation between visual and tangible magni-
tudes. An inch-long object can be experienced as a single minimum visible or
as occupying the whole visual field. This is the problem faced in going from
the flux of sensations to stable perception. Absolute size measures in the vi-
sual array do not support or favor any judgment of physical or tangible mag-
nitude and vice versa.
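The point about magnitude can be illustrated with the standard visual-angle relation of geometrical optics. What follows is a minimal sketch, not anything stated in NTV or in this chapter: it assumes that a frontal object of physical size s at distance d subtends an angle of 2·arctan(s/2d), and the function name and sample distances are hypothetical.

```python
import math

def visual_angle_deg(size, distance):
    """Angle in degrees subtended by a frontal object.

    Assumed relation (not from the text): 2 * atan(size / (2 * distance)).
    """
    return math.degrees(2 * math.atan(size / (2 * distance)))

# One inch-long object at several illustrative distances (in inches).
for distance in (0.5, 12, 120, 12000):
    print(f"{distance:>8} in away: {visual_angle_deg(1.0, distance):.4f} degrees")

# Objects of very different physical sizes can subtend the same angle.
near_small = visual_angle_deg(1.0, 12)
far_large = visual_angle_deg(100.0, 1200)
print(math.isclose(near_small, far_large))
```

On these assumptions the same inch can span anything from a quarter turn of the visual field to a vanishing fraction of a degree, so angular extent alone fixes no physical magnitude.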
The situation is different if the task involves relative size estimates. Al-
though arrays of MV and arrays of MT are incommensurable, relational con-
siderations may favor certain cross-modal associations. Confronted with a
pair of objects differing in physical size, it is fitter (psychologically simpler) to
have the tangibly bigger array represented by the larger of two visual arrays.
An appreciation of this fitness can influence the MBB’s decision. The MBB’s
judgment, of course, is not certain. There is no qualitative resemblance or
necessary connection to ensure or underwrite his decision.
Herein, I think, lies the reason Berkeley is more guarded in the case of fig-
ure than he is with other features of spatial perception. When discussing dis-
tance and magnitude in NTV, Berkeley is not concerned with comparative
judgments, where relational facts may influence judgments of fitness. In the
case of shape, structural considerations can not be set aside. Armed with a
generic concept of shape, the MBB might intellectually come to appreciate
that the visible square and the tangible square are structurally similar. This
may bias the MBB’s answer to the Molyneux question in a manner that does
not apply to MBB thought experiments that do not depend on internal rela-
tional properties.
Still, all claims that these structural relations can help with cross-modal
tasks depend on the assumption that the physical items presented are at the
same distance and slant from the perceiver. Altering the distance or spatial
orientation of a physical object will affect its magnitude and figure in visual
extension. Depending on the angle of regard, the visual array of a physical
circle may be elliptical or even a straight line. A tangible square may appear as
a range of visual polygons, as well as a straight line array. And if removed far
enough away a circle or square may trigger no visual experience or visual
experiences that are phenomenally indistinguishable, say two or three MV
each. In discussions of the Molyneux problem it is usually assumed that the
circle and square are both on the same fronto-parallel plane and reasonably
close to the subject.20 Any advantage “fitness” considerations offer depends
on making such assumptions about the location and orientation of the phys-
ical objects being observed. Clearly, there are no conceptual connections that
can a priori assure the MBB of these facts about the environmental layout.
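The dependence of visual figure on slant can be made concrete with a small calculation. This is only a sketch under a simplifying assumption the text does not make, namely parallel (orthographic) projection, in which a flat circle slanted about a vertical axis keeps its height while its width shrinks by the cosine of the slant angle. The function name and the sample angles are hypothetical.

```python
import math

def projected_axes(diameter, slant_deg):
    """Width and height of a flat circle's outline when slanted about a vertical axis.

    Assumes parallel (orthographic) projection: only the width is
    foreshortened, by the cosine of the slant angle.
    """
    width = diameter * abs(math.cos(math.radians(slant_deg)))
    return width, diameter

# A circle of unit diameter seen at increasing slants.
for slant in (0, 30, 60, 85, 90):
    width, height = projected_axes(1.0, slant)
    print(f"slant {slant:>2} deg: width {width:.3f}, height {height:.3f}")
```

At zero slant the outline is circular, at intermediate slants it is elliptical, and edge-on its width is effectively nil, which corresponds to the straight line array mentioned above.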
Conclusion
The goal of this paper has been to explicate Berkeley’s views, not defend
them. I do not deny that his heterogeneity doctrine faces difficulties. Set-
ting Berkeley’s work in the context of both historical and contemporary is-
sues in the theory of vision can shed light on points of contention found in
commentaries on his position. What I hope to do in subsequent work is
show how the interpretations presented in chapter 4 and elaborated here
comport with Berkeley’s Idealism and related epistemological and metaphys-
ical theses.21
Notes
* Unless otherwise noted the numbers in brackets are to the sections in Berkeley’s New
Theory of Vision.
1. E. Mach, The Analysis of Sensations. New York: Dover, 1959, pp. 135–7.
5. See, for example, J. Mueller’s classic statement of the position (excerpted in Percep-
tion). Mueller takes it for granted that the defining qualities of vision are sensations of
color, light, and darkness, although he also maintains that extension is perceivable by
all the senses.
7. I use the term “figure/surround,” not the more common “figure/ground,” in order to
avoid the concerns about three-dimensionality the latter raises. Although consideration
of figure/ground issues does play a role in many accounts of the Molyneux problem,
I do not think it crucial to understanding Berkeley’s own views about the MBB. For fur-
ther discussion of the role of conceptualization in early discussions of these matters, see
M. Bolton, “The Real Molyneux Question and the Basis of Locke’s Answer,” in Locke’s
Philosophy. G. A. J. Rogers (ed.). Oxford: Oxford University Press, 1994, pp. 75–99.
9. See G. Hatfield and W. Epstein, “The Sensory Core and the Medieval Foundations
of Early Modern Perceptual Theory.” Isis 70, 1979, pp. 363–84 and R. Schwartz, Vision,
Oxford: Blackwell, 1994.
10. These assumptions were eventually challenged by Gestalt psychologists and then
J. J. Gibson.
12. R. Lotze’s theory of local signs is often read to be an account of the process by which
an ordering is acquired through experience.
13. See Atherton op. cit. For a critique of Berkeley’s position, see L. Falkenstein, “Reid’s
Critique of Berkeley’s Position on the Inverted Image.” Reid Studies 4, 2000, pp. 35–51.
14. Again, questions of innateness and necessary connections are run together. Test-
ing Berkeley’s claims about orientation and learning was a major spur for experimen-
tation with lenses that invert or distort the image.
15. Notice, too, that were visual extension to have no appreciable order, there would be
nothing special about the inversion of the retinal image. An un-inverted retinal image,
like images with other orientations on the retina, would pose the same problem.
18. Evans, correctly I think, denies that Berkeley’s negative answer to the Molyneux
question depends crucially on the fact that the original task involves distinguishing a
globe from a cube, rather than a circle from a square. In his analysis, Evans sticks to two-
dimensional shapes.
19. He cites I. Rock’s The Nature of Perceptual Adaptation. New York: Basic Books, 1966,
which provides a penetrating analysis of these issues.
20. The situation is somewhat different with a sphere and cube, since a sphere will
project the same visual array from all orientations. This difference plays a role in vari-
ous accounts of the Molyneux problem, but I do not think it is a major consideration
of Berkeley’s.
21. This requires a treatment of issues removed from those of specific concern to the-
ories of vision.
II Inference
Prescript 6
The question whether perception depends on inference is a very old one that
simply will not go away. I think that a major reason for the persistence of this
controversy lies in the fact that the notion of inference has so evolved in the
study of vision that there is no single idea or empirical position associated
with the claim that perception is inferential in nature. I cannot, today, review
the tangled history that has led us to this stage; rather, I would like to sketch
out five broad theses that have come to be equated with the claim that per-
ception depends on inference.
The alternatives that I have in mind are the following:
still, it is maintained that we can really see things in the environment, but
the class of items said to be seeable in this way differs widely on the various
accounts. *[See chapters 8 and 15.]
visual inference. For one way to look at the differences among the five criteria
outlined above is in terms of what each takes as given.
According to the sensation/perception criterion, what is given are sensa-
tions. On criterion 2, what is given are those visual phenomena that show no
influence of learning. On the third criterion, the given is identified with some
particular characterization of the stimulus or the information contained in
the stimulus. On the mental operations criterion of inference, the given is the
first state in the process that is deemed to be psychological, as opposed to
being simply physical or physiological in nature. With the epistemological
criterion the given is what can be “really” seen. Each criterion, then, distin-
guishes between something given and that which goes beyond or is inferred.
The accounts differ over where to draw the line as to what counts as the data
to the visual system, but they each assume there is a unique line to be drawn.
I, however, see no principled way to make such a distinction, no way, that
is, to draw a principled distinction between what is given to us and what is our
contribution, a result of our supplementation. For the notion of our supplementation,
like the notion of the given, is neither firm nor fixed. Indeed, each
of the inference criteria we considered can be seen as spelling out a different
understanding of what constitutes our supplementation. On the first crite-
rion, there is supplementation when one idea triggers or otherwise leads to
another. On criterion two, supplementation occurs when the perceptual phe-
nomenon is the result of learning. With criterion three, supplementation is
what we provide over and above what is contained in the impoverished stim-
ulus. According to the fourth criterion, supplementation is a matter of opera-
tions on mental states or representations. Finally, the epistemological criterion
considers any perceptual judgment or experience to involve supplementation
whenever it does not come up to the theorist’s particular standards of
epistemological purity.
The ideas of the given and supplementation march in tandem. What is given
is that which does not require our supplementation, and what is supple-
mented is that which we are not given. The problem is there is no one correct
way to draw these boundaries. In different contexts, for different purposes,
and to highlight different contrasts, it may be useful to settle on one inter-
pretation rather than some other. From the standpoint of the empirical study
of vision, however, we can make no general, non-arbitrary sense of the idea of
the input or the data of vision.
What does it mean for vision to involve operations that are distinctively
mental? In early works on vision this notion was often cashed out either in
terms of the manipulation of conscious ideas (such as sensations leading to
perceptual states) or in terms of learning. In more recent times, especially
with the rise of cognitive psychology and the development of computers and
computer models of cognition, the push to identify the mental with con-
sciousness or learning has largely diminished. But willingness to widen the
concept of the “mental” has only led to further complications in character-
izing the notion of “visual inference.” For as vague as these earlier ideas
may have been, nothing as circumscribed as consciousness or learning has
emerged to take their place as marks of the mental.2 What is more, if inference
is equated with mental operations in general, rather than with some specific
type of mental processing, then each widening of the notion of the “mental”
automatically generates an additional construal of “visual inference.”
Less obvious, but perhaps more significant, once the notion of the “men-
tal” is freed from its anchor in consciousness and learning, the very sorts of
intuitions that originally led many theorists to equate inference with mental
operations tend to be undermined. For the important point that these theo-
rists wished to make (or reject) was that vision involved higher-level, thought-
like states and processes, or that vision was affected by past experiences and
memory traces in the very way in which thought was supposed to be in-
fluenced. Vision, that is, involved the mind and mind-like intentional or
experiential states. The problem is that the extended characterizations of psy-
chological processing that have grown out of work in cognitive and com-
puter science often do not match up readily with these older conceptions of
what mental participation is taken to involve.
The issue emerges clearly in Shimon Ullman’s influential paper “Against
Direct Perception.”3 In this paper Ullman argues that we should consider
perception direct or immediate (and hence not inferentially mediated) if the
processes that transform stimuli into percepts can only be elaborated or ex-
plained in physiological terms. “If the extraction of visual information can be
expounded in terms of psychologically meaningful processes and structures,
then it can not be considered immediate.”4 Now although he gives no precise
specification of what constitutes decomposition of an operation into psy-
chological, as opposed to physical, constructs (other than that the character-
ization uses concepts found in psychology, not physiology), he is clear that
Notes
This paper is based on a much larger work on perceptual inference. In order to fit within
the time allotted, I am going to have to skip many of the details and much of the sup-
porting arguments. What I present here are just the main themes of that longer work.
*[VVBT.]
The Role of Inference in Vision 105
1. Although I tend to use the terms “mental” and “psychological” interchangeably, the
concepts are not equivalent for all theorists.
2. Various of my subsequent points about the lack of fixity of the notion of “visual
inference” are related to the current discussion regarding consciousness and “the”
time and place of conscious events (see Daniel Dennett, Consciousness Explained. Little,
Brown: Boston, 1991). Tracing these connections would take us far afield from the
present study.
3. Shimon Ullman, “Against Direct Perception,” Behavioral and Brain Sciences, 3 (1980),
pp. 373–415.
4. Ibid., p. 374.
6. Ibid., p. 380. Ullman’s suggestion (ibid., p. 374) that the distinction between what
can and cannot be decomposed may be “relative to the system under investigation”
and “expresses a point of view” about “one’s domain of interest” would seem to fit with
views I develop concerning the optionality of the inference/non-inference dichotomy.
7. See my article “The Problems of Representation,” Social Research, 51 (1984), pp. 1047–
64. The issue has become even more otiose with the development of connectionist
models of cognition and debates over whether these models appeal to “real” represen-
tations. See Paul Smolensky, “On the Proper Treatment of Connectionism,” Behavioral
and Brain Sciences, 11 (1988), pp. 1–74, and the subsequent criticisms, countermoves,
and counter-countermoves.
Prescript 7
Near objects may partially obscure far objects; the converse is never true. Hence the
mind ‘seizes’ upon the interruption of one object at the boundaries of another as a criterion
of the relative distance of the two objects. The interrupted object is farther away.
The circumstances attending the discovery of this principle are lost in antiquity.
Boring (1942, p. 264)
Interposition—the cutting off of part of the view of one object by another—is an ex-
traordinarily potent cue to relative distance. The partially occluded object is always
seen as behind the nearer object.
Kaufman (1974, p. 230)
When one object partly occludes another, the occluding object is perceived as closer
and the occluded object as further.
Palmer (1999, p. 236)
If an opaque body intercepts a line of sight, it prevents light rays from any-
thing behind it reaching a viewer’s eyes. Given minimal assumptions about
light taking a straight path, it follows that any item so occluded must be far-
ther from the viewer than the interposed opaque body itself. Thus occlusion
(also referred to as interposition, superposition, or overlap) seems to carry im-
portant and unequivocal information about the spatial layout. Moreover, it
seems to provide this unambiguous depth information in any direction and
over any distance in which visual perception functions (Cutting and Vishton
1995). Whether near or far, straight ahead or off to a side, it is always the case
that if the occluding object (O) actually occludes an object (A) from a subject’s
(S) view, A is farther from S than O. It is not surprising, then, that occlusion
has long been taken to be a major cue for depth perception. Nor is it surpris-
ing that occlusion has been thought to be one of the artist’s most effective
The Dilemma
Consider the most trivial case, where A is small enough and so located that O
occludes it completely. In this circumstance, there will be nothing of A for S
to see, and no O/A contour information for S to register and use in reaching
an occlusion judgment. So unless there is some other source of information
to indicate A’s presence, O’s occluding A will prevent S from seeing or being
visually aware of A. Total occlusion is obviously more a hindrance than an aid
to relative depth perception.
Next, consider an effect interposition may have when O occludes only part
of A, leaving the rest visible. As figure 7.1a shows, occlusion of A by O may
lead to A’s being perceived further from S than when it is not occluded. But as
figure 7.1b shows, occlusion may cause A to be seen nearer to S than before.
These bidirectional effects on the perception of A’s distance need not be considered
a problem, of course, since occlusion is only claimed to furnish ordinal
depth information. Phenomena like those figures 7.1a and 7.1b exhibit do
not challenge the idea that the occluding object itself is always nearer than
the object occluded.
But is it true that the occluding object is always nearer than the object oc-
cluded? The apparent a priori status of this claim trades on an ambiguity
[Figure 7.1: panels (a) and (b), showing objects O and A]
Making Occlusion More Transparent 113
Changes in an observer’s angle of regard with respect to O and A can also af-
fect relative depth and its perception.5 Standing squarely in front of a paint-
ing hung on a wall, both the section of the wall the painting occludes and the
observable sections of the wall on either side of the painting are further from
S than the painting. If S moves enough to one side, however, the wall on that
side may be closer to S than the painting, and can be veridically perceived as
such. Or consider a knife stuck in an opaque object. The tip of the knife is oc-
cluded by the embedding surface. Depending on S’s angle of regard, the vis-
ible knife handle may be and will usually be seen by S to be closer than the
occluding surface. More generally, surfaces of attachment provide constant
obvious examples where the visible parts of A are, and are perceived to be,
closer to S than O. Viewed from in front, Corin’s house occludes portions of
the ground immediately behind it. The ground surface lying immediately in
front of the house, nevertheless, is perceived as nearer than the occluding
edge of the house.
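The painting example can be checked with a little plane geometry. The sketch below uses a top-down plan view with made-up coordinates: the wall is assumed to lie along the line y = 3, the painting to hang just in front of it, and S to stand first squarely in front and then off to one side. None of these numbers come from the text, and the names are hypothetical.

```python
import math

# Top-down plan view, arbitrary units. All coordinates are illustrative assumptions.
PAINTING_CENTER = (0.0, 2.9)   # hangs just in front of the wall
WALL_TO_THE_SIDE = (2.0, 3.0)  # a visible wall point beside the painting (wall along y = 3)

def distances_from(observer):
    """Return (distance to painting, distance to the visible wall point)."""
    return math.dist(observer, PAINTING_CENTER), math.dist(observer, WALL_TO_THE_SIDE)

positions = (("squarely in front", (0.0, 0.0)), ("moved to one side", (2.0, 0.0)))
for label, observer in positions:
    to_painting, to_wall = distances_from(observer)
    nearer = "painting" if to_painting < to_wall else "wall"
    print(f"{label}: painting {to_painting:.2f}, wall {to_wall:.2f} -> nearer: {nearer}")
```

With these assumed positions the ordering of distances reverses once S steps aside, although the painting continues to occlude the stretch of wall directly behind it.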
Whatever limitations edges or contour boundaries have in supplying
depth information about the visible part of A, it may seem safe to assume that
they provide definitive depth information about the part of A lying within the
occluding border. Obviously, this claim, too, must be tempered. The infor-
mation occlusion borders make available is entirely local to the boundary. The
most such contours entail is that if A continues on, A is behind O at that very
point of superposition. Beyond that, occlusion at an edge does not imply any-
thing about the location of the remaining parts of A, within the boundaries of
O. They may, and may be seen to, emerge at any place through, above, or below
O. In summary, environmental layouts where the visible parts of A are
closer than O is to S are ubiquitous, and people tend to have no trouble seeing
the relative depth relations correctly. Alternatively stated, the visible part of
the incomplete, non-continuous, irregularly outlined A is often closer to S than
the complete, continuous, and regularly outlined O and will be so seen.
Responses
Another idea a reader floated is to claim that occlusion can and does pro-
vide useful depth information, but only when (1) A is at a significant distance
behind O at the occluding border and (2) the information occlusion affords is
limited to those visible parts of A not far from that border. Now the optics and
geometry of (1) and (2) do ensure that this claim is correct or at least proba-
bilistically correct—cases of transparency, discontinuous objects, and non-
generic alignments are the exceptions. The problem with solutions such as
this is that they involve a circularity similar to the one Gibson warned of.
What evidence can S have for assuming that A is a significant distance be-
hind O at the occluding border? By definition, the occluded part of A is out of
sight. So it cannot be a source of information that A is far behind O at this
point. It is the visible part of A that must play the role. To serve its purpose,
S has to determine visible A’s depth with respect to O. But then S will have
already discerned the depth relations in question (the relative depth of O
and the expanse of A that can be seen) independent of information gleaned
from occlusion.
Analysis
The reason interposition effects on depth perception are varied need not be a
mystery. Placing O in a position to occlude A has a range of consequences. It
alters the availability and interpretation of information coming from other
stimulus variables (for example, height in field, texture gradients, attachments,
slant indicators, etc.) that are relevant to perceiving depth. (See figure 7.1.) In
turn, the effects occlusion has on the perception of spatial relations will
be neither uniform nor unidirectional.
The physical occlusion of parts of one object by another is to be found
everywhere we look. Indeed, every three-dimensional opaque object hides all
but its own facing surface. Therefore, the visual system constantly interpo-
lates, a-modally completes, rounds out, and fills in its visual world. Contour
boundary information is one significant goad or stimulus for such supple-
mentation. It is misleading, however, to think that simply distinguishing the
occluder from the occluded provides a unidirectional indicator or source of
information about the relative depth relations of their visible parts.
It goes without saying that visually supplemented content must be placed
or situated somewhere. When contour information prods the visual system
to supplement the scene, the relative depth of the a-modally completed part
of A to the occluding O cannot be left undetermined. It comes along for the
ride. Perceptual construction must assign it a location. Hence, it is tautologi-
cal that a supplemented occluded item is perceived to be behind its “occlud-
ing” O. Where else could it be?
Supplementation, though, can have the opposite effect on perceived depth
relations. When what is supplemented is seen and opaque, it will be an oc-
cluder and not the occluded. For instance, in cases of apparent motion or sub-
jective contours (figure 7.2), the seen interpolated perceptual content often
does the occluding. That a supplemented visible surface is not itself occluded
goes without saying. This claim, too, is tautological.
Considerations such as these would seem to indicate that it is better to think
of edges and contour boundaries as stimuli for supplementation rather than as
providing independent information for judging depth. The depth relations,
after all, are of necessity determined by the nature of the supplementation. In
suggesting this coupling of depth, supplementation, and occlusion, I do not
wish to suggest that there is a causal order among them or that they are separate
phenomena. The phenomena are two sides of the same coin. Figure 7.3 pro-
vides an illustration of what I have in mind. If line (a) is perceived as lying on
the frontmost plane, it occludes (b) and (b) is a-modally completed at the point
of intersection. If the perception switches and line (b) is seen on the frontmost
plane, it occludes (a) and (a) is supplemented at the place where they cross.
These perceptual reversals, though, each occur as a package deal. When the perceived
depth relations change, so do the experiences of supplementation and
occlusion. Or one might equally hold, when occlusion and supplementation
relations change the depth relations perforce change with them.6 As Gibson
says, “The visual superposition or overlapping of surfaces . . . is an important
type of depth perception, not a cue for depth perception” (p. 228).
Figure 7.2
Figure 7.3
Conclusion
Nowhere have I argued that edge, contour boundaries, and other informa-
tion resulting from optical occlusion have no role to play in depth percep-
tion. The point is that its effects are complex and not unidirectional. As
Boring correctly remarked, the intuition that occlusion is a strong cue to
depth relations traces its history back to antiquity. Nevertheless, its empirical
and theoretical significance remain to be seen.
Notes
* I wish to thank James Cutting, Heiko Hecht, Larry Mahoney, and Tim Shipley for
comments.
1. The analysis in this paper is limited to occlusion in static scene perception. Related
issues concerning accretion or deletion phenomena that occur with movement are not
discussed. I believe the analysis does have implications for these dynamic cases, but it
would unduly complicate matters to deal with them here. Note, too, that motion-based
accretion and deletion, per se, have no part to play in picture perception (see readings in
section III).
2. For the use of ordinal information to derive more metric information see Shep-
ard (1980).
3. See also Ratoosh (1949) for an earlier indication of similar misgivings and Landy
et al. (1995) for more recent qualms.
4. Gibson sees his analysis of occlusion as part and parcel of his overall project of show-
ing that perception is direct. Gregory (1990), on the other hand, claims that occlusion
and related phenomena show that perception is indirect. My own view (see chapters 6
and 8), is that there is nothing much to be gained by entering into this controversy.
5. The problems here are quite similar to those explored in my account of size percep-
tion (Schwartz 1994) once slant is factored in.
References
Cutting, J. and P. Vishton, (1995). “Perceiving Layout and Knowing Distances: The in-
tegration, relative potency and contextual use of different information about depth.”
In Perception of Space and Motion, W. Epstein and S. Rogers (eds.). San Diego: Academic
Press, pp. 69–117.
Gibson, J. J. (1950). The Perception of the Visual World. Boston: Houghton Mifflin.
Kaufman, L. (1974). Sight and Mind. New York: Oxford University Press.
Landy, M., L. Maloney, E. Johnston, and M. Young, (1995). “Measurement and Modeling
of Depth Cue Combination: In Defense of Weak Fusion.” Vision Research 35,
pp. 389–412.
Levine, M. and J. Shefner, (1991). Fundamentals of Sensation and Perception (second edi-
tion), New York: Pacific Grove: Brooks-Cole.
Ratoosh, P. (1949). “On Interposition as a Cue for the Perception of Distance.” Proceedings
of the National Academy of Sciences 35, pp. 257–259.
J. J. Gibson’s theory of direct perception sets the stage for most current dis-
cussions of perceptual inference. Gibsonians deny the need to appeal to in-
ferential processes in each of the guises spelled out in chapter 6. (For their
particular conception of the processes of learning, see J. J. Gibson and E. J.
Gibson, “Perceptual Learning: Differentiation or Enrichment,” Psychological
Review 62 (1955), pp. 32–41.)
James Cutting is especially sensitive to the ambiguities and unclarities with
the notion of “inference” encountered in the writings of both direct and in-
direct theorists. In a series of papers, Cutting tries to sharpen the terms of the
debate, in order to give it more empirical content. He proposes as well his own
model, one that he labels “directed perception.” Chapter 8 examines Cut-
ting’s analysis of the problem of inference and the contribution his directed
model can make to settle it. In spite of the interesting empirical and theoreti-
cal features of Cutting’s account, doubt remains that his proposal can give
substance to most ongoing disputes over perceptual inference.
8 Directed Perception
Background
Perhaps the most debated topic in the theory of vision has been, and continues to
be, the question whether perception is direct or indirect. Although the issue
has a long history in both the philosophical and psychological literature,
it took on new dimensions and significance with the pioneering work of
James J. Gibson. Beginning with his book, The Perception of the Visual World
(1950), Gibson argued that progress in the theory of vision had been and
was being hampered by an impoverished, atomistic conception of the
stimulus. The central problem of perception was taken to be that of ex-
plaining how we come to see the world on the basis of the limited infor-
mation contained in the point values of light striking the retina. Gibson
demonstrated that if this elementaristic view of the stimulus is abandoned
and attention paid to higher-order properties of the retinal image, espe-
cially ratios and invariants in the light array resulting from movement, the
information available for perception is greatly expanded. In turn, Gibson
maintained that this richness of information made it possible to see the envi-
ronment directly. Contrary to received opinion, there is no need for a subjec-
tive mental contribution by the perceiver to mediate and hence stand in the
way of our access to reality. We can simply see the objects and properties in
the environment.
Nowadays, Gibson’s ideas concerning the importance of higher-order
properties of the stimulus to the study of vision are not in doubt. What has re-
mained most controversial and most contentious is Gibson’s further claim
that an expansion and reconception of the available information shows that
perception is direct.
A third alternative
questions not only about the interpretations of (i) and (ii) but of the actual
relevance of such theses to the study of perception.
In describing his own position Cutting allies himself with the Gibsonians,
rejecting the idea that perception involves a mental contribution and thus is
indirect. Cutting’s grounds for this initially seem stronger than Gibson’s. Gib-
son argued that there is no need for the perceiver to “go beyond the given” be-
cause there is sufficient information in the stimulus to specify the layout.
Cutting adds that, in many situations, there is not only sufficient informa-
tion, there is an overabundance of it.
As the continuing controversy indicates, these Gibsonian-inspired claims
that the stimulus is adequate for specifying the layout and that this adequacy
means that perception is not indirect, have not proven compelling. Elsewhere
I have argued that such failures to settle the dispute are only to be expected,
since, as commonly conceived, the very distinction between direct and indi-
rect perception has no clear content or empirical import (Schwartz 1994).
Attempts to give the distinction real bite depend and flounder on vague
intuitions about the nature of the mental or intentional, dubious assump-
tions about consciousness, and inadequately-motivated characterizations
of notions such as “the given,” “stimulus impoverishment,” “transducers,”
and the like.
Cutting is sensitive to many of these issues. He appreciates the need to for-
mulate the idea of “stimulus adequacy” in more precise terms (see next sec-
tion). And in sharp contrast to most writers, he recognizes that an appeal to
the notion of “inference” cannot by itself serve to separate direct from indi-
rect approaches. With little or no alteration, all of the competing theories,
his own included, can be (re)described as inference models (Cutting 1991a).1
Nonetheless, Cutting believes there is a significant difference between direct
or directed theories and indirect theories. A theory is indirect, he says, if it
holds that cognition plays a role in perception. For Cutting, though, the char-
acterization of perceptual tasks and accomplishments in inferential terms
does not show that they are cognitive. More is required. Cognition is impli-
cated only if the premises involved in the inference are “in the mind.”
But what does it mean for a premise to be “in the mind”? Traditionally, the
idea of something’s being “in the mind” was understood to mean accessible to
Stimulus adequacy
inferential processes and that these steps are necessary in order to go beyond
what is given. It is a mistake, nevertheless, to assume that this claim is equiv-
alent to or entails that the stimulus is insufficient to specify the layout. To see
this, consider the situation with so-called “taking-account” models of size,
shape, or brightness perception. These models are usually considered para-
digm cases of indirect perception. (See Epstein 1973; Rock 1983.) Yet in these
cases the information relied on can be sufficient for veridical perception.
For example, the taking-account-of-distance model of size perception de-
pends on the fact that the size of the retinal image varies with the distance of
the object from the observer. This relationship is specified by the formula: im-
age size = object size/object distance.3 Proponents of the model maintain that
size perception results from a calculation (or inference) according to the reciprocal
psychological formula: perceived size = image size × perceived distance.
Information about image size and distance is assumed available in
the retinal image and from other cues, such as the convergence of our eyes
in fixating the object. On this model, perception of size is not “direct” in the
sense that it depends on the prior registration and taking-account of non-
size information. *[See chapter 2.]
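The two formulas just quoted can be checked with a small numerical sketch. The Python below is only an illustration under stated assumptions: the function names, the sample object, and the arbitrary units are hypothetical and are not drawn from the taking-account literature. It shows that when distance is registered correctly the psychological formula returns the object's actual size, and that a misregistered distance yields a proportionally mistaken size.

```python
def image_size(object_size: float, object_distance: float) -> float:
    """Optical relation quoted in the text: image size = object size / object distance."""
    if object_distance <= 0:
        raise ValueError("object_distance must be positive")
    return object_size / object_distance


def perceived_size(image_size_value: float, perceived_distance: float) -> float:
    """Reciprocal psychological formula: perceived size = image size x perceived distance."""
    return image_size_value * perceived_distance


# Hypothetical object of size 2.0 at distance 8.0 (arbitrary units, assumed for illustration).
img = image_size(2.0, 8.0)

# Distance registered correctly: the calculation recovers the actual size.
veridical = perceived_size(img, 8.0)

# Distance underestimated by half: the computed size is halved as well.
misjudged = perceived_size(img, 4.0)
```

On this sketch the size result depends on the prior registration of image size and distance, which is the sense in which the model counts as indirect, even though nothing in the inputs is insufficient for a correct answer.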
At the same time, given image size and distance information the psycho-
logical equation provides for a unique veridical evaluation of size. So the re-
lationship between layout and information is one-one. What is more, this
information, like the higher-order ratios and invariant properties cited by di-
rect theorists, can be characterized in terms of causal or lawlike connections.
The relationships among convergence angles or distance and object size,
angle of regard, and image size are subsumable under optical laws. The situa-
tion is much the same with the taking-account models of shape and bright-
ness. For that matter, similar points about causal or lawlike connections could
be made with respect to various of the pictorial and kinesthetic cues ordinar-
ily associated with theories of indirect perception.4
ting’s more expansive Gibsonian theory allows that there may be several such
invariants each of which completely specifies the very same environmental
property. The existence of one or multiple invariants would, of course, be cru-
cial to the debate over indirect perception if reliance on higher-order features
of the array, as opposed to lower-order features, implied that no processing
took place or that the processing that did occur was purely “non-mental” (e.g.
Runeson 1977). But neither of these claims follows.
Determining density gradients, cross ratios, horizon ratios, etc. (i.e. higher-
order stimulus properties) may require, and theorists like Cutting permit,
complex computations. (See also Sedgwick 1980.) Furthermore, the claim
that the stimulus information is sufficient or over-sufficient for determining
the layout does not show that perceivers need not “process” these richer
sources of data. Finally, Cutting and his co-workers are willing to describe this
processing in terms of inference, computation, and selection. But these are
just the sorts of notions many theorists claim mark out the domain of “cog-
nitive” processing. (See Ullman 1980.)
In contrast to indirect theorists, Cutting’s more orthodox Gibsonian critics
reject the directed model primarily on the grounds that the stimulus features
Cutting cites should not be thought of as “information” in Gibson’s sense.
The real information in the stimulus is a still higher-order property shared
by all of the features Cutting isolates (Burton and Turvey 1990; Stoffregen
1990; Pittenger 1990; Cutting replies in 1991b). By identifying the available
information with this single property, and not individually with Cutting’s
assortment of invariants, they are able to hold onto their claim of one-one
correspondence between the information and the layout.
This conception of “information” is supposed to be applicable even when
more than one perceptual system or modality is involved. For example, per-
ceiving time-to-contact of a projectile may depend on acoustical as well as op-
tical invariants, but the information for such perception is to be understood
as a single higher-order pattern of them both. In other cases it is held that the
“informational” invariant is not to be identified with any external stimulus
but with a single invariant stimulus to tissue or neural structures that lie be-
yond the initial receptors.
Now there are some serious difficulties involved in finding plausible singu-
lar stimulus properties of the kind required to accomplish such reductive
analyses. But this is not the central reason for questioning Cutting’s critics’
mandate for a unitary specification of the stimulus information. The major
A quite different kind of challenge to the directed model denies what until
now has been allowed, namely that the experimental findings actually show
that perception depends on combining redundant information in the way
the model proposes. The recent theories and work of Gilden (1991), Gilden
and Proffitt (1989), Massaro (1987, 1988), Massaro and Cohen (1993), Runeson
(Runeson and Vedeler 1993) and Cutting (Cutting et al., 1992) all speak to as-
pects of this issue. Gilden claims that although the sorts of lawlike kinematic
information Cutting and other Gibsonians isolate is available, perceivers do
not use this data. They employ instead heuristics that rely on less systematic,
less dependable cues to the layout. Vision is more of a hit or miss operation
with the visual system taking advantage of whatever features of the situation
it assumes salient to the problem at hand. Gilden likens his view to Ramachandran’s
(1990) anti-Gibson, anti-Marr, “bag of tricks” approach to perception.
Massaro’s dispute with Cutting is different. He does not object to Cutting’s
claim that perceivers make use of overly rich geometrical or kinematic infor-
Metaphysics
Conclusion
Acknowledgments
I wish to thank James Cutting for discussing these issues with me. I also wish
to thank the journal referees, John Heil and Edward Reed, for their comments.
Notes
1. See Fodor and Pylyshyn (1981) for a widely-cited version of the more standard op-
posing view.
2. There are, of course, those, like Searle (1992), who insist on identifying the mental
with actual or potential conscious awareness. One of Searle’s major complaints with
current cognitive science is its failure to adopt this criterion of the mental.
3. Technically this equation holds strictly only for cases where the object lies on a
plane perpendicular to the perceiver’s line of sight. There is no need to go into these and
other complications here.
4. Veridical perception of metric, or what are sometimes called “absolute,” spatial properties
would depend on assumptions of a scaling factor. But this is also true when the
information relied on consists of higher-order Gibsonian stimulus properties.
5. Runeson’s views are even more closely associated with those of G. Johansson. The
differences between Gibson and Johansson need not concern us here. See Runeson
(1977) and Gibson (1977).
References
Bruno, N. and J. Cutting, (1988). Minimodularity and the perception of layout. Journal
of Experimental Psychology: General 117, 161–170.
Burton, G. and M. T. Turvey, (1990). Perceiving the length of rods that are held but not
wielded. Ecological Psychology 2, 295–324.
Cutting, J. E. (1986). Perception with an eye for motion. Cambridge, MA: MIT Press.
———. (1991a). Why our stimuli look as they do. In G. R. Lockhead and J. R. Pomerantz
(eds), The perception of structure (pp. 41–52). Washington, DC: American Psychological
Association.
———. (1991b). Four ways to reject directed perception. Ecological Psychology 3, 25–34.
———. (1993). Perceptual artifacts and phenomena: Gibson’s role in the 20th century.
In S. C. Masin (ed.), Foundations of perceptual theory (pp. 231–260). New York: Elsevier
Science.
Cutting, J. E., N. Bruno, N. P. Brady and C. Moore, (1992). Selectivity, scope and sim-
plicity of models: a lesson from fitting judgements of perceived depth. Journal of Exper-
imental Psychology: General 121, 364–381.
Fodor, J. and Z. Pylyshyn, (1981). How direct is perception? Some reflections on Gib-
son’s ‘Ecological Approach.’ Cognition 9, 139–96.
Gibson, J. J. (1950). The perception of the visual world. Boston: Houghton Mifflin.
———. (1977). On the analysis of change in the optic array. Scandinavian Journal of Psy-
chology 18, 161–163.
Helmholtz, H. (1968). The origin of the correct interpretation of our sensory impressions.
In R. Warren and R. Warren (eds.), Helmholtz on perception: its physiology and development
(pp. 249–260). New York: Wiley.
Massaro, D. W. (1987). Speech perception by ear and eye: a paradigm for psychological in-
quiry. Hillsdale, NJ: Erlbaum.
Massaro, D. W. and M. M. Cohen, (1993). The paradigm and the fuzzy logical model of
perception are alive and well. Journal of Experimental Psychology: General 122, 115–124.
Pittenger, J. B. (1990). The demise of the good old days: Consequences of Stoffregen’s
concept of information. ISEP Newsletter 4, 8–10.
Searle, J. (1992). The rediscovery of the mind. Cambridge, MA: MIT Press.
Stoffregen, T. (1990). Multiple sources of information: for what? ISEP Newsletter 4, 5–8.
Ullman, S. (1980). Against direct perception. Behavioral and Brain Sciences 3, 373–415.
III Picture Perception
Prescript 9
An old and ingrained tradition has it that what makes a picture a representa-
tion is resemblance between the picture and what it represents. A picture of
Nelson Rockefeller represents Rockefeller and not John Lindsay because it
resembles the former and not the latter. The trouble with this traditional
view is that it is difficult to interpret it in a way that makes it both true and
informative.
Obviously, resemblance is not a sufficient condition for representation. Two
pictures of Rockefeller may resemble each other more than they resemble
Rockefeller, yet it’s the man they represent. Similarly, one of Rockefeller’s
brothers may look more like him than any portrait does, but his brother doesn’t
represent him. Representation requires that one object refer to (stand for,
be about, be a symbol for) the other, and this “semantic” relationship is not
guaranteed by resemblance.1
If resemblance is not a sufficient condition for representation, still the idea
lingers that it must be necessary. For isn’t it resemblance that distinguishes
pictorial reference from mere denotative reference? Isn’t what distinguishes a
picture of Rockefeller from the name, “Rockefeller,” or the description, “the
governor of New York in 1972,” the fact that only the first symbol resembles
him? The view that resemblance, while not sufficient, is a necessary condi-
tion for a picture to represent does have its appeal, but it also has its short-
comings. The problem is that in any of its more interesting applications the
resemblance relation marks no simple or fixed relationship among objects.
X may resemble Y with respect to property P1 and not property P2 and Z with
respect to P2 and not P1. And no advance is made in claiming that two things
resemble each other, if or to the degrees that they share properties, since any
two things have the same number of properties in common. Attempts to give
independent criteria for resemblance in terms of geometrical or topological
resent, it is thought that the relationship between pictures and their referents
must be arbitrary, like that between words and their denotata. That “cat” de-
notes cats is an arbitrary decision, and the language would not in any way be
seriously altered if “cat” were used to denote tables and “table” to denote cats.
Since what each word denotes is a matter of convention, we must learn each
individually. Presented with some new word, we will not know what it de-
notes unless we are taught its use. But surely, it is felt, such arbitrariness is not
a feature of pictorial systems. We couldn’t just as well decide to let a picture of
Rockefeller denote Lindsay without seriously altering the kind of symbol
system at hand. Furthermore, we needn’t be taught what each new picture
represents as we must have explained to us what each new word means.
Therefore, the relationship between pictures and their referents could not be
conventional, like that between words and their denotata. The referential or
descriptive significance of pictures must after all be due to resemblance.
But then theorizing about pictorial representation is stalemated. The psy-
chologist feels that unless he appeals to resemblance, certain psychologically
important distinctions between pictures and words are obscured. Yet, the no-
tion of resemblance is itself so problematic, that it cannot serve to get an ad-
equate explanation off the ground. The situation calls for a re-examination.
What is needed is a way to relieve the pressure of the dilemma that does not
itself require an uncritical appeal to resemblance.
As I have sketched it, the dilemma is based on two assumptions. The first is
that if pictures do not resemble their referents, then the connection between
the two must be arbitrary, in the way the connection between a word and its
denotation is. The second assumption is that the attribution of arbitrariness
conflicts with the fact that we can understand new pictures and not new
words. But it takes little examination of other types of symbol systems, and
how we might go about mastering them, to see that the assumptions under-
lying the dilemma are unfounded. For consider a system like standard West-
ern music notation. Given only a suitable sampling of written notes (symbols)
and taught to correlate them with sounds (referents), we might very well
learn how the system works, how to go on. Getting the idea of how the sys-
tem works enables us to handle new symbols in the system not included
among the teaching samples. I am not talking here about new combinations
of previously learned notes, but of understanding new, hitherto unheard in-
dividual notes. And such learning can occur, it would seem, without our ever
receiving explicit instruction concerning the structure of music notation.
Yet, there is no reason to suppose that the written notes look like or resemble
the sounds they denote. Or, similarly, consider a gauge that correlates bright-
ness of display light (symbol) with temperature of object (referent). Presented
with enough instances of these correlations, we may learn how the system
works. And once we know how the system works, we can interpret an un-
bounded set of new symbols. Again, resemblance between symbol and refer-
ent would seem to play little role. *[Inductive learnability is comparable to
the notion of “systematicity” as it is discussed in theories of language and
thought. The claim that inductive learnability implies compositionality is
much less plausible in the case of pictorial representation.]
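The gauge case can be sketched in a few lines. The Python below assumes, purely for illustration, that the gauge is linear; the text does not say this, and the sample readings and function names are hypothetical. A learner fits the regularity from a handful of brightness and temperature pairings and then interprets a brightness never presented before. No resemblance between light and temperature enters the computation.

```python
def fit_linear_gauge(samples):
    """Least-squares line through (brightness, temperature) teaching pairs.

    Linearity is an assumption of this sketch, not a claim of the text.
    """
    n = len(samples)
    mean_b = sum(b for b, _ in samples) / n
    mean_t = sum(t for _, t in samples) / n
    sxx = sum((b - mean_b) ** 2 for b, _ in samples)
    sxy = sum((b - mean_b) * (t - mean_t) for b, t in samples)
    slope = sxy / sxx
    intercept = mean_t - slope * mean_b
    return slope, intercept


def interpret(brightness, slope, intercept):
    """Project the learned regularity onto a new symbol (a brightness level)."""
    return intercept + slope * brightness


# Hypothetical teaching samples: (brightness of display light, temperature of object).
teaching_samples = [(1.0, 23.0), (2.0, 26.0), (4.0, 32.0), (5.0, 35.0)]
slope, intercept = fit_linear_gauge(teaching_samples)

# A brightness outside the teaching set is interpreted from the learned pattern alone.
new_reading = interpret(7.0, slope, intercept)
```

What the learner picks up here is a pattern of usage across the symbols. The same procedure would serve equally well had the gauge been built with a different but still regular correlation.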
Indeed, we can find examples of this sort of inductive semantic learning in
natural language, too. Indicator terms, metaphor, and number vocabulary
provide three different areas where a relationship exists among the symbols
so that learning the reference of some words enables us to project the seman-
tics of the others correctly. Although tokens of the indicator word “here” dif-
fer vastly in their denotata, we learn to understand new tokens on the basis of
our experience with the old. The same is true of our ability to understand
brand new metaphors. Our habits associated with the literal use of the word
put sufficient constraints on metaphorical use, so that we can frequently intuit
the semantic import of the metaphor the first time around, without being
taught it specifically. Finally, it would seem that ordinary number vocabularies
also have this learnability feature. We might learn to use the cardinal
numbers properly by being given enough examples until we get the idea of
how numerals are concatenated so as to measure the cardinality of a set.
In none of these cases does systematic correlation of the set of symbols with
their referents depend in any obvious way on resemblance. Nor does it de-
pend on being able to define or specify the semantics for the new symbols
within the resources of that part of the system already mastered, or, for that
matter, within the resources of the entire system. And if lack of resemblance
entails that the system is conventional, then all these systems are conven-
tional. Still, the symbols within a given system may not be arbitrary with
respect to the other symbols in the system, for, there may be sufficient regu-
larity among the symbols, regularity in how they denote or describe, so that
learning to use some provides adequate evidence for interpreting other mem-
bers of the set. The difference between the set of words “cat,” “table,” “ink,”
etc., and “1,” “2,” “3,” . . . “10,” “11,” “12,” etc., or music notation is not that
members of the first set fail to resemble their referents, while members of
the latter two sets resemble theirs. Nor is the difference that the first set is
conventional and the other two are not. Rather the difference is that “cat,”
“table,” “ink,” etc., are arbitrary relative to each other, while with the number
vocabulary or music notation there is some systematic regularity among the
symbols affecting the way their interpretations are assigned. This regularity,
of course, is not a priori or non-conventional. “21” could have been used to
denote 99-membered sets rather than 21-membered sets and “[note symbol]” might
have been chosen to denote C♯ rather than G. The point is that, given the way
the system does work, with the correlations that have been established and
do exist, we can learn the semantic force of some members of the system from
learning the semantics for others. Arbitrariness is not a question of conven-
tionality, but more a question of induction and learning. We see the assign-
ment of symbol to referent as arbitrary when we can discover no pattern that
enables us to project the semantic import of the symbol from knowledge of
other symbols in the system.
A symbol can be arbitrary then in the sense that it is a matter of convention
or choice or not a priori that it denotes what it does, but this differs from say-
ing that its interpretation is arbitrary with respect to the other symbols in the
system. It does not in any way follow that if symbols do not resemble their ref-
erents, the symbols need be arbitrary with respect to each other in the way
“cat,” “shoe,” and “ink” are. That we can understand what a new picture rep-
resents, therefore, does not entail that the picture bears some absolute or fixed
resemblance relationship to what it represents. All that is required is that there
be a discernable pattern of usage within the pictorial system, so that learning
what some pictures in the system represent provides the appropriate experi-
ence for learning what new pictures in the system represent. If this is so,
much of the pressure forcing us back to the traditional view is relieved.4
Another obstacle remains, however, to thwart attempts at overturning the
traditional view. Our account of the ability to understand new symbols sug-
gests that we learn directly some correlations of symbol to object, and this
enables us to know how to deal with other symbols whose semantics have not
been directly given. But many theorists maintain that the ability to compre-
hend pictures requires no learning, at least not any that can be viewed as in-
struction or practice in interpreting pictures. So, it is thought an important
psychological difference remains between these “learnable” systems and pic-
torial systems. Pictorial systems require no learning, and the only way to ex-
plain this is to allow that pictorial systems are based on resemblance. This
push toward the traditional view has force, however, only if we grant both
that we do not have to learn how to understand pictures, and that resem-
blance could provide an account of this fact.
But, theoretical considerations cast doubt on the initial no-learning claim,
as well as on the idea that resemblance, reasonably construed, could explain
it. For earlier, we noted that resemblance is not a sufficient condition for
representation. So, even if resemblances were not relative to skills, interests,
theory, perceptual abilities, etc. and discerning resemblances required no
learning, some instruction would be needed to determine when and how
things function as representations—for example, that Rockefeller’s picture
under normal circumstances represents him, and that his brother does not
represent him. Even if we discount this problem of how we acquire the ability
to attach symbolic significance at all to pictures, other features of the situation
make it very unlikely that we can completely rule out some form of symbol
learning. Perhaps the simplest feature we could point to is that while pictures
in standard Western pictorial systems are by and large two-dimensional, we
interpret their referents most usually as three-dimensional objects. So al-
though a picture of Rockefeller will resemble his frontal surface at least as much
as it resembles him in entirety, it is a representation of a three-dimensional
man and not a picture of a cross-sectioned man. Similarly, a profile picture of
Rockefeller will show but one eye, yet it does not represent him as half-headed
or one-eyed.5 However, if untutored resemblance is all we had to go on, it
would seem that the profile will resemble a half-headed being seen from the
side just as much as it does the full-blown Rockefeller seen from the same po-
sition. And it is difficult to see how our adjusting to these features of standard
pictorial representation could be accomplished without some sort of learning.
Examination of the empirical evidence available does not force the no-
learning claim upon us either. Indeed, most of the data concerning this issue
is anecdotal and highly equivocal. On the one hand, there is some anthropo-
logical evidence that people belonging to tribes unfamiliar with Western rep-
resentations do not understand photographs when first presented with them,
and experiments by Hudson and more recently Deregowski seem to indicate
that people inexperienced with Western art are initially confused about depth
relationships characterized by drawings in standard perspective.6 On the
other hand, there are some reports of immediate recognition of photographs,
and there is at least one experiment indicating that an untutored child can un-
derstand pictures the first time around.7 In this latter case, the experimenters
Representation and Resemblance 149
clear sense in which it can be said that all the clues and cues themselves re-
semble their objects. Many of the arguments outlined at the beginning of this
paper would seem to apply equally well to claims that shadows resemble their
objects or that the foot under the cover resembles the pattern of blanket folds
that indicate its presence.
Of course, to suggest the importance of transfer learning is not to pro-
vide argument or evidence for it as an account of pictorial skill. However, my
point is that if effects of transfer are considered, the significance of evidence
brought forth to support the no-learning claim is further obscured. And if the
no-learning claim is weakened, one more pull toward the traditional resem-
blance account of pictorial skill is also weakened.9
In challenging the fruitfulness of resemblance theories, I have not attempted
to offer an alternative account of pictorial competence. Nonetheless, if the ar-
guments presented above are correct, a somewhat different emphasis in ap-
proach would seem indicated. Instead of our concentrating exclusively on
the relationship between picture and object, more attention should be paid
to the relationship among symbols within the given system, to see how and if
learning some of the symbols plays a role in enabling us to comprehend the
significance of other new symbols in the system. Similarly, we might explore
how competence in one style of pictorial representation influences or pro-
vides the basis for understanding another style. For example, in what way,
if any, does understanding caricature depend on mastering normal pictorial
systems? More stress too should be placed on discovering the possible facili-
tating effects of skills and principles developed in our use of other non-
linguistic symbol systems such as gestures, imitations, imagery, sensori-motor
or enactive schemes, etc. Would damage to or inability to master these systems
be reflected in difficulty with pictorial systems? And, perhaps most impor-
tantly, we should look for ways in which particular pictorial systems may take
advantage of our ordinary habits of perception, cue detection, pattern recog-
nition, etc. How, for example, may our normal skills at distinguishing figure
and ground be used to parcel out portions of a picture into figure and ground?10
Perhaps deeper understanding of these issues will, in turn, shed light on
the perennial puzzle of realism in art. What is it that makes a picture realistic?
One argument has been that realism is to be accounted for in terms of the
identity of the bundle of light rays reflected from a realistic picture and those
rays reflected from the object it represents. Now, no one need deny the optics
of the situation—that some pictures viewed under certain very stringent
conditions will reflect the same bundle of light rays as their objects viewed
under specified conditions. However, as Goodman, Pirenne, and others have
noted, the identity of light rays thesis can have little to do with ordinary pic-
ture perception. For the identity position requires that we view the picture
and object one-eyed, through a peephole, with the eye stationary, and these
surely are not the usual conditions under which we look at pictures and make
judgments about their realism.
An alternative account, put forth by Goodman in Languages of Art, is that
once we give up the idea that resemblance is a necessary or sufficient condi-
tion for representation, we can come to see that realism is more a matter of
habituation and familiarity. “Realism is relative, determined by the system
of representation standard for a given culture or person at a given time.”11 On
this account, realism is a matter of ease of interpretation. What makes a Rem-
brandt portrait more realistic than a Picasso Cubist painting is that the Rem-
brandt is in a system whose principles of interpretation are ingrained, the
principles are second nature. But in order to interpret the Picasso, “we have to
discover rules of interpretation and apply them deliberately.”12 It is most fre-
quently felt, however, that this analysis of realism distorts certain important
features of perception. For it is claimed that no matter how familiar we are
with the particular Picasso painting, or how second nature interpreting cubist
pictures becomes, such pictures will not seem realistic (or at least nowhere
near as realistic as a Rembrandt). Our judgments of realism are just not as flex-
ible as the familiarity view would appear to require.
Now I believe that there is something to this criticism of the familiarity ap-
proach to the problem of realism, but that a consideration of some of our
points about learning may supplement the position and make it more palat-
able. This supplementation, however, is not intended to provide a definition
of realism. Nor is it meant to provide criteria for making fine distinctions
among pictorial styles or for constructing a precise ordering of degrees of re-
alism. The rough principles to be offered are perhaps necessary conditions for
realism but are clearly not sufficient. They are suggested only as a way to over-
come the “anything goes” conclusion—the claim that with familiarity any
picture could be as realistic a picture of X as any other—that is seen to follow
from a pure familiarity account.
I would suggest that one characteristic of systems of representation usually
taken as standards of realism is that they are inductively learnable or more
easily so than other systems. Having been taught to interpret several cubist
pictures, we are less able to project to the correct interpretation of new cubist
pictures than we are if given examples of impressionist paintings, and then
required to interpret a new impressionist picture. With very abstract styles
such projection would be even harder than the cubist case, whereas the tran-
sition from one photo-realist painting to another might be even easier than
in the impressionist case. So, among pictorial systems, degree or ease of learn-
ability may correlate with our intuitions of realism. While related, learnabil-
ity, in our sense, may be separated from ease of interpretation. For example,
the set of numerals 1–1000 may be more learnable than a set of one thousand
arbitrary words like “cat,” “ink,” “table,” etc., although, once having mastered
both sets, it is as easy to understand or interpret “ink” as it is the number “97.”13
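The contrast between learnability and ease of interpretation can be put in a small sketch. The Python below is only an illustration under stated assumptions: the function names, the digit table, and the three-word lexicon are hypothetical and are not drawn from any study cited here. It shows that a rule-governed system lets a learner project to a numeral never met before, whereas an arbitrary vocabulary offers no such projection, even though interpreting an already mastered item is equally quick in both.

```python
# Hypothetical illustration: a rule-governed system versus an arbitrary one.

# What a learner of numerals must memorize: ten digit values plus one place-value rule.
DIGIT_VALUES = {str(d): d for d in range(10)}


def interpret_numeral(numeral):
    """Interpret a numeral, even one never seen before, by applying the learned rule."""
    value = 0
    for digit in numeral:
        value = value * 10 + DIGIT_VALUES[digit]
    return value


# What a learner of arbitrary words must memorize: every pairing, one at a time.
LEXICON = {
    "cat": "a feline",
    "ink": "a writing fluid",
    "table": "a piece of furniture",
}


def interpret_word(word):
    """Look the word up; an unlearned word yields None, since no rule projects to it."""
    return LEXICON.get(word)
```

Here `interpret_numeral("97")` succeeds without "97" ever having been taught, while `interpret_word` can only return what was individually learned.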
It seems plausible that another characteristic of realistic systems of picto-
rial representation is that they make better use of habits and processes of per-
ception that we have developed for dealing with ordinary objects. Thus, as
indicated above, Hochberg has been examining the possible relationships be-
tween the processes involved in scanning edges and those involved in per-
ceiving realistic line drawings. Similarly, certain means of rendering distance
on a two-dimensional surface may readily tap perceptual processes under-
lying ordinary distance perception. For example, it is known that superposi-
tion or overlapping serves as a cue to distance; when one object hides another
the object hidden is judged to be further away. A system of representation that
likewise hides or blocks out the more distant object might thus be able to make
use of one of our well-ingrained habits of three-dimensional distance percep-
tion. *[But see chapter 7.] Modes of representing brightness are another case
in point. It is well known that it is impossible to have the absolute brightness
of a picture viewed under gallery conditions equal that of, say, the sunny field
of which it is a study. But it has also been established that brightness percep-
tion is affected by other stimuli and cues than absolute brightness. In partic-
ular, the ratio of the object’s brightness to that of other nearby objects seems
to have an overwhelming effect. Representational systems that take into ac-
count the importance of relative brightness to brightness constancy might
thus be better able to exploit our existing visual habits and skills than just any
old system of correlating pigment with brightness. And while there is noth-
ing in principle to preclude a system of representation in which a color repre-
sents its complementary or in which a color is correlated with size, such
systems need not significantly tap the processes of cue detection, scanning,
constancies, etc., that we employ in determining the color or size of the ob-
jects we observe around us.
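The point about relative brightness lends itself to a short worked example. The sketch below is an illustration only: the luminance figures are hypothetical, in arbitrary units, and the function name is an assumption made for the example. It shows how a picture far dimmer than the scene it depicts can still preserve the ratio between an object and its surround.

```python
def brightness_ratio(object_luminance, surround_luminance):
    """Return the luminance of an object relative to that of its surround."""
    if surround_luminance <= 0:
        raise ValueError("surround luminance must be positive")
    return object_luminance / surround_luminance


# Hypothetical values in arbitrary units: a sunlit field and a canvas in a gallery.
field_ratio = brightness_ratio(8000.0, 2000.0)   # bright patch against darker grass
canvas_ratio = brightness_ratio(80.0, 20.0)      # the same two regions as painted
```

The absolute values differ a hundredfold, yet both ratios come out the same, which is the relation a system of representation could exploit.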
Notice, however, that to argue for such transfer of skills is not to return to
the identity of light rays thesis; nor is it to claim that there are no differences
between the processes involved in perceiving objects and those needed to in-
terpret pictures. All that is required is that certain two-dimensional systems
of cues and ways of rendering space, shape, color, size, and light take better
advantage of our ordinary perceptual skills than other systems. If this is so,
then given the processes by which we do see objects in the world, systems that
can tap these existing skills and habits will be considered relatively realistic.
Those systems that require new and separate skills of interpretation, where
there is little transfer from ordinary perception, or where there is interference
with these habits, will be considered less realistic.
These suggestions are not meant so much as a challenge to the familiarity
account of realism as they are a supplementation. The learnability and trans-
fer features could be offered as partial explanations why interpreting some
systems seems second nature, and why in dealing with other systems we have
to apply rules of interpretation more deliberately. Also, this supplementation
would provide some basis for explaining why our judgments of realism are
not as flexible as a pure familiarity, “anything goes,” view might require. For
no matter how familiar or at ease we are with a particular picture or system,
its principles of interpretation may be at odds with our normal processes of
object perception. To the degree that this is so, we will not find pictures in the
system realistic. It should be noted, however, that we do not really know
how physiologically fixed or flexible all these perceptual processes are them-
selves.14 Nor do we know if or to what extent experience looking at pictures
may influence our more usual processes of object perception. *[See R. Schwartz,
“The Power of Pictures,” Journal of Philosophy, LXXXII (1985), 711–20.] And, of
course, the more relative and flexible our visual system is, the more relative
and flexible will be our standards of realism.
Perhaps the essential difference between the pure familiarity view and my
supplementation is best seen as one of emphasis. The familiarity advocate, in
his account of realism, stresses the importance of our experience with the
most common or prevalent kinds of representations around us. The habits of
perception acquired in learning to comprehend these systems set the standard
for realism. The more a system requires new skills of perception and interpre-
tation that differ from or interfere with the processes underlying our ability
to comprehend familiar systems of representation, the less realistic it will be
judged. On my account, the emphasis is shifted. Throughout the day most
of us spend our time viewing not pictures, but a world of three-dimensional
objects. My suggestion is that the habits, processes, and skills underlying our
perception of these more ordinary objects serve as a touchstone for assessing
realism in pictures. The deliberateness, lack of second-nature, etc. associated
with non-realistic systems may be traced, in part, to the fact that they require
skills of interpretation differing from those involved in the use of our visual
system to perceive our everyday environment.
Finally, the tentativeness of all these suggestions about learning, transfer,
interference, etc. must be stressed again. Just how ordinary perceptual expe-
rience might facilitate pictorial understanding, which sorts of systems might
be aided and which hindered, why some tribes unfamiliar with Western rep-
resentation seem to have initial difficulty with photographs and drawings in
standard perspective are only some of the open questions requiring system-
atic study and experimentation of the sort not presently available.
Notes
* A version of this paper was read at the University of Pennsylvania; Annette Barnes
commented on my talk, and I benefited much from her remarks. I should also like to
thank Margaret Atherton, Joan Ganz, and Nelson Goodman for their comments.
1. For more on this issue see N. Goodman, Languages of Art (Indianapolis, Bobbs-Merrill
Co., 1968), Chap. 1 and M. Black “How Do Pictures Represent?,” in Art, Perception, and
Reality, ed. M. Mandelbaum (Baltimore, Johns Hopkins University Press, 1972). An ad-
equate account of pictorial reference, however, is not at hand, and any such treatment
would be much more complicated than this paper might seem to indicate. While I rec-
ognize that some of my remarks (e.g. about the reference of portraits) need patching up
to avoid error, I believe my main psychological points can be made without a more
subtle and refined treatment of these matters. *[The issues parallel those in the philos-
ophy of language concerning the relationship of names to descriptions.]
3. For a discussion of this issue see N. Goodman, Languages of Art, pp. 225–232.
4. The distinction between systems having patterns in their interpretive schemes that
allow for inductive learning and those that do not may itself be a relative matter de-
pending on what other skills, discriminative powers, categories of classification, and
symbolic competencies are available. So learnability too may be more a matter of de-
gree than a fixed property of systems. In any case, it should be obvious that the distinc-
tion between “learnable” and “arbitrary” systems I have been proposing is not meant
to distinguish pictorial from non-pictorial symbol systems. Music notation and num-
ber vocabularies, I have suggested, both have this learnability feature, and, I take it, nei-
ther are representational systems.
5. For further consideration of this issue see P. Ziff, “On What a Painting Represents,”
Journal of Philosophy, 1960, Vol. 57, pp. 647–654.
8. See, for example, J. Piaget, The Origins of Intelligence in Children, (New York, Interna-
tional Universities Press, 1952) and numerous other of his publications; J. Bruner et al.
Studies in Cognitive Growth (New York, John Wiley and Sons, 1966).
9. Hochberg and Brooks themselves adopt a similarly cautious view toward their data.
For example, they suggest that part of pictorial competence may develop as a result of
the more general process of learning to perceive space.
10. See pp. 69–73 of J. Hochberg’s recent paper, “The Representation of Things and
People,” in Art, Perception, and Reality, where he speculates about how experience with
the world of objects, particularly the scanning of edges, might provide occasion for
developing skills appropriate for dealing with line drawings.
13. Again, I am claiming that a comparatively high degree of learnability may be nec-
essary for the realism of systems; I am not maintaining that it is sufficient or that other
characteristics may not weigh more heavily.
14. For example, the extent to which various constancies are physiologically deter-
mined as opposed to being learned or the extent to which they might be changeable
once an initial learning period has taken place are not settled matters.
Prescript 10
Introduction*
When psychologists who study vision turn their attention to picture percep-
tion, they find themselves entangled in a web of puzzles. There is, moreover,
no consensus and much confusion on how to resolve these matters experi-
mentally. As a result, research on picture perception is in an uneasy state.
When these same vision theorists turn their attention to Nelson Goodman’s
(1968) work on pictorial representation, they are highly critical. They are con-
vinced his ideas are at odds with well-established facts. I think there is a con-
nection between these two phenomena.
In brief, I believe Goodman and the vision theorists adopt strikingly differ-
ent paradigms concerning the nature of pictorial understanding. Their dis-
agreements, in the end, are less over the empirical data and more over the
appropriate interpretation of the facts. At the same time, I believe the para-
digm vision theorists do adopt is responsible for many of the puzzles they
encounter. In what follows, I will use “symbolic paradigm” to refer to the ap-
proach of Goodman and his followers, and “projective paradigm” will serve
to label the dominant paradigm of perceptual psychologists.
Grouping vision theorists in this way all under one rubric is, of course, a
simplification. There are dissenters in the field who favor the symbolic model
and other researchers who find neither model acceptable. In addition, there
are significant differences among projectivists in the accounts of picture per-
ception they champion. I think, however, these latter differences are mainly
due to differences in their models of perception in general. The differences do
not indicate rejection of the projective paradigm’s core conception of the na-
ture of picture perception.
The basic idea of the projective paradigm is that seeing pictures involves the
same psychological processes and mechanisms as seeing anything else in the
world. In a sense this claim is trivial, since pictures are themselves physical
objects in the world. The central projectivist claim goes further. Projectivists
maintain that in an important psychological sense, seeing a representation of
an object is like seeing the object itself.
Now in the case of seeing objects in the environment, the problem of per-
ception may be, and often is, conceived as one of “inverse optics.” Optics
determines the projection of light rays from objects to the retina. In order to
perceive the layout correctly, the perceiver must reverse the process. The per-
ceiver somehow projects back from the retinal image, or the information con-
tained therein, to the object from whence it came.
Vision theorists differ widely on how to explain this process. There is no
agreement on the proper description of the stimuli, on the information avail-
able in the retinal image, on whether or what calculations are involved in re-
covering the scene from the image, and on much else. These are the sorts of
differences, alluded to above, separating theorists who, nonetheless, adhere
to the projective paradigm of picture perception. Where the paradigm’s pro-
ponents agree is in assuming the propriety of adopting their favorite model of
inverse optics to picture perception itself.
The guiding principle of the paradigm can be presented with the aid of Al-
berti’s Window, a method for constructing realistic pictures. As illustrated in
numerous treatises on art and perception, the method requires placing a
window between the artist and the scene to be depicted. The artist’s task is to
produce a picture that will duplicate the light rays at the point where they in-
tersect the window on their way to the artist’s eye. If a picture so constructed
is then substituted for the window, it will project the same bundle of light rays
to an observer’s eye as the original object—as long, that is, as the observer re-
mains at the artist’s original location, the so-called “station point.” All this is
simply a matter of optics. *[See chapter 11, figure 11.3.]
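The optics of Alberti's Window can be set out in a few lines. The Python below is a minimal sketch under simplifying assumptions: the eye is treated as a single point (the station point), the window is the plane z = window_z, and the function name and coordinates are hypothetical choices made for illustration. It computes where the ray from a scene point to the eye crosses the window.

```python
def project_to_window(point, eye=(0.0, 0.0, 0.0), window_z=1.0):
    """Return the (x, y) spot where the ray from `point` to `eye` crosses the plane z = window_z."""
    px, py, pz = point
    ex, ey, ez = eye
    if pz == ez:
        raise ValueError("point is at the same depth as the eye; its ray never reaches the window")
    # Fraction of the way from the eye to the scene point at which the window is met.
    t = (window_z - ez) / (pz - ez)
    return (ex + t * (px - ex), ey + t * (py - ey))


# Two different scene points lying on one ray from the eye.
near_point = project_to_window((2.0, 1.0, 10.0), window_z=2.0)
far_point = project_to_window((4.0, 2.0, 20.0), window_z=2.0)
```

Both calls yield the same spot on the window. A mark placed there sends the eye the same ray as either scene point, which is why projecting back from image to scene is underdetermined.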
According to the projective model, as the artist sees through Alberti’s win-
dow to the object, so the viewer of pictures “sees through” the picture surface
and locates the represented scene in space. There is a continuity, so to speak,
of the virtual space depicted and the environmental space perceived. “Seeing
Pictures, Puzzles, and Paradigms 161
through” is like “seeing” the real scene except the source of the stimulus is
not direct.
Implications
Puzzles
conflict between the two-dimensional cues of the picture’s own surface and
the three-dimensional pictorial cues. In some way the visual system must re-
solve such cue conflicts in order to perceive pictures. But how is this done?
On this matter there is little agreement. Various theorists propose models
in which the perceiver suppresses or ignores the two-dimensional informa-
tion. Others favor models which combine the two- and three-dimensional
cues forming a compromise perception of the represented space. Another
approach is to assume PURE picture perception is exhibited when, or to the
extent that, the two-dimensional cues are eliminated or unavailable. As with
the physicist’s “frictionless surfaces” or “isolated systems,” only in appropri-
ately idealized set-ups is it possible to get at the real processes underlying the
mechanisms at work. I think the enormous experimental literature on pic-
ture perception involving monocular vision and other reduced viewing con-
ditions, or in trompe l’oeil situations where the two-dimensional cues are
ineffective, attests to the influence of these ideas.
Of course, things get much worse once more realistic viewing conditions
are considered. For it is not simply the presence of two-dimensional cues that
raises a problem. In most everyday situations, people are not located at the
station point when viewing pictures. Unfortunately inverse optics applied to
the retinal images a picture makes available from these other viewpoints does
not project to the same scene or layout it does from the station point. Off the
unique station point the stimulus array a picture affords is said to be distorted.
This, though, raises deep questions about how perception can work when the
stimuli are abnormal and hence misleading.
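The geometry behind this claim can be illustrated with a short sketch. The Python below is an assumption-laden illustration, not a model of what the visual system does: the eye is a point, the picture lies in the plane z = 2, the depth of the recovered point is simply stipulated, and all names and coordinates are hypothetical. It traces the ray from an eye position through a fixed mark on the picture and reports where that ray arrives at a given depth.

```python
def back_project(picture_point, eye, depth_z):
    """Follow the ray from `eye` through `picture_point` out to the plane z = depth_z."""
    ux, uy, uz = picture_point
    ex, ey, ez = eye
    if uz == ez:
        raise ValueError("picture point is at the same depth as the eye")
    # How far along the eye-to-picture ray the target depth lies.
    t = (depth_z - ez) / (uz - ez)
    return (ex + t * (ux - ex), ey + t * (uy - ey), depth_z)


mark = (0.4, 0.2, 2.0)  # one fixed mark on the picture surface

from_station_point = back_project(mark, eye=(0.0, 0.0, 0.0), depth_z=10.0)
from_the_side = back_project(mark, eye=(1.0, 0.0, 0.0), depth_z=10.0)
```

From the station point the mark projects back to roughly (2, 1, 10); from one unit to the side, the same mark projects back to roughly (-2, 1, 10). Strict inverse optics thus assigns the unchanged picture a different layout for each viewing position.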
Such distortions would pose less of a problem if perception were itself dis-
torted in the way inverse optics predicts. And as Gombrich (1972) has pointed
out, many theorists have adopted this “curious myth.” A myth, Gombrich
notes, because it flies in the face of ordinary experience. Pictures do not look
terribly distorted when we move off the station point.
These days, few theorists maintain a very strong distortion thesis. It is gen-
erally admitted, for example, that a picture of the Cologne Cathedral is per-
ceived, by and large, as representing the same view and shape of the building
whether the picture is looked at from the station point or from a side. This
fact, the resistance of perception to distortion, is attributed to, and referred to as,
the “robustness” of perspective.
Robustness, while perhaps welcomed by the painter or photographer, is
quite bothersome to the projectivist. For how can perception be robust when
Symbolic Paradigm
pictorial cues. The point is obvious in the context of other forms of symbol-
ization. The sentence “Cologne is on the Rhine” makes a claim about the
environment, and in this sense has three-dimensional significance. We do
not, however, think the cues informing us of the sentence’s status as a two-
dimensional written symbol in any way conflict with the three-dimensional
interpretation of its content. The symbolic paradigm suggests a similar ac-
count may be offered for perceiving pictures. We perceive a two-dimensional
pictorial symbol as having three-dimensional significance.
Along similar lines, the symbolic approach may offer help with the distor-
tion/robustness problem. Consider a sign bearing the sentence, “The Cologne
Cathedral is just ahead.” The sentence is about the Cathedral and offers in-
formation about its location. There is nothing perplexing, though, about how this
sign can be taken to represent these spatial relations when the sign is viewed
from the side instead of straight-on. The stimuli and visual experiences of the
written sentence may change somewhat as we move about, but within limits
we perceive the shapes of the letters correctly. Veridical perception of the writ-
ten sentence, the representation, is all that is required to assess its content or
meaning properly.
The symbolic paradigm suggests a similar approach to picture perception.
A picture of the Cologne Cathedral may depict it as at a particular distance
and having a particular size and shape. It makes no difference to this repre-
sentational content whether the picture itself is viewed straight-on or from
off its station point. True, the stimuli the picture affords change as we move
about, and the perceptual experiences of the picture may differ to an extent.
Yet, within limits, it is possible to perceive the shapes and relationships of the
picture pretty much as they are. And that is what it takes to comprehend the
picture’s representational content.
The evolutionary dilemma projectivists confront is also given a new twist
on the symbolic model. The locus of the problem is shifted, along with pos-
sible approaches to its solution. The paradigm suggests treating the issue not
in isolation but in the context of other forms of symbolization. There is, for
example, much controversy about the correct evolutionary account of the
human language capacity. Yet no one supposes our ability to understand the
meaning of written sentences is a deep problem for an evolutionary account
of vision. Language comprehension depends on mastering the interpretive
principles of the system. The failure of written words to replicate projectively
what they represent does not stand in the way. Our ability to understand
Given all the help the symbolic paradigm seems to offer the perceptual psy-
chologist, why the reluctance to accept it?
I think this is primarily due both to a misreading of what the symbolic par-
adigm claims and to a prevalent assumption about the nature of vision. I will
look at these each in turn.
Projectivists believe that, because the symbolic paradigm claims pictures func-
tion like languages, the model must, and does, claim pictures are languages.
Projectivists, however, are convinced empirical evidence shows the mecha-
nisms involved in reading pictures, and the routes leading to the develop-
ment of this skill, are not the same as those underlying the ability to read
linguistic texts. Thus they find the symbolic paradigm untenable. (Such com-
plaints are repeated over and over in criticism of Languages of Art.) These
complaints, though, rest on a misconception. The symbolist admits, indeed
insists, depictional and linguistic systems differ in syntactic and semantic
principles. Reading pictures, therefore, is not identical with reading words.
But symbolists find here no basis for abandoning their paradigm. After all, as
the above discussion makes clear, perceiving pictures typically is “not exactly
the same” as perceiving the real three-dimensional environment. What’s
more, the simple dichotomy of symbol systems into pictures and languages
is much too blunt. It leaves no obvious place for a range of other symbolic
forms, maps, models, diagrams, music notation, and a whole lot more. The
The sway of this idea is reflected in the importance attached to claims that
young children, or adults from distant cultures, comprehend perspective
pictures without instruction.
This attempt to underwrite the core intuition also runs into difficulties.
First, there is much dispute over the proper interpretation of the data on un-
tutored picture perception. Second, evidence for untutored comprehension
of perspective pictures must be understood in light of evidence showing
comprehension of cartoons, caricatures, and other kinds of non-realistic de-
piction may likewise not require explicit training. Third, in contemporary
theories of vision the learned/innate distinction does not pair up with the
cognitive/non-cognitive dichotomy supposedly underlying the core intu-
ition (Schwartz 1994).
Finally, contrary to prevalent assumptions, I do not think the focus on
learning truly gets at the heart of the projectivist’s intuition. For suppose
Latin were innate and required no learning to understand. The projectivist
would still want to maintain Latin should be grouped with languages and not
pictures. And the rationale would remain as before. Language comprehension
is a two-stage process, seeing the words and then mentally interpreting them.
Perceiving pictures is supposedly different. It is a one-stage process not requir-
ing interpretation. We simply “see through” pictures to the worlds they repre-
sent. There is no need for a second stage of interpretation.
Visual theory may explain seeing words, but surely it is no part of visual the-
ory to account for how we determine what words represent. In contrast, it is the
job of vision, not mind, to perceive what pictures represent. Which pictures?
Well, only perspective pictures; the rest are to be lumped with languages.
The above account of the competing paradigms, I believe, sheds light on the
uneasy state of research in picture perception. Usually in work on vision the
symbolic framework is disregarded, for the problems it raises are thought to
lie outside the scope of perception. If understanding a picture is like under-
standing a sentence, it is not a job for the visual scientist to investigate. At the
same time, the highly circumscribed set of issues and domain the projectivist
countenances make for a dubious research program. The projectivist studies
only perspective pictures and only up to the point where vision ends and
cognition begins. This puts the visual theorist in a bind.
To treat a flat painted surface as a picture requires more than seeing it as a col-
ored object of a particular size, at a certain distance and direction. It must be
perceived not simply as an object in the world but as a representation. Here
commitment to the projective paradigm gets in the way. Inverse optics does
not readily accommodate many of the important aspects of picture percep-
tion highlighted by the symbolic paradigm. And this I believe is a major rea-
son for the uneasy state of research in picture perception. For stripped of
“interpretation,” of “reading,” of the accretions of experience and all else that
constitutes or contributes to referential and representational significance, a
picture cannot function to guide behavior, inform cognition, or enhance aes-
thetic experience. Or in Goodman’s terms, the projective paradigm has trouble
accounting for the role pictures play in making and remaking our worlds.
Note
* This paper is based on ideas further explored in “Two paradigms of picture percep-
tion: The uneasy state of research on picture perception,” Report de Forschungsgruppe:
Perception and the role of internal regularities of the physical world am Zentrum fuer
interdisziplinaere Forschung der Universitaet Bielefeld, 1997.
References
Gombrich, E. H. (1972). “The ‘What’ and the ‘How’: Perspective Representation and
the Phenomenal World,” in R. Rudner and I. Scheffler (eds.), Logic and Art, Indianapo-
lis: Bobbs-Merrill, 129–149.
Goodman, Nelson. (1960). “The way the world is,” Review of Metaphysics 14, 48–56.
Schwartz, Robert. (1994). Vision: Variations on Some Berkeleian Themes, Oxford: Black-
well Publishers.
Prescript 11
In recent papers (1997, 2002) I have explored how two seemingly conflicting
paradigms inform the conception and study of picture perception. The dom-
inant paradigm, one especially favored by vision theorists, claims that seeing
a pictorial representation of an object is, with qualifications, like seeing the
object itself. The picture, being a geometrically sanctioned projection of its ob-
ject, resembles it, or otherwise serves as a mimetic surrogate, “re-presenting”
what it depicts (Danto 1982). Accordingly, pictorial representation is at its
best when, as in trompe l’oeil paintings, viewers cannot tell the picture, the
stand in or substitute, from the real thing.1 An alternative paradigm, the
symbolic model, championed most forcefully by Nelson Goodman (1968),
focuses attention on syntactic and semantic features of pictures. On this
account, pictures are importantly allied with other forms of representation,
including languages, maps, and music notation, and picture perception is to
be understood in this context.
In my earlier work, I attempted to show how adopting the symbolic ap-
proach could provide a framework for explaining several persistent problems
in the study of picture perception—a topic I will return to later. I also main-
tained that vision theorists’ reluctance to embrace this approach often rests
on a misunderstanding. Although the symbolic paradigm does stress that
pictures, as representations, function like languages, it does not claim they
are linguistic symbols. The model, in fact, insists there are significant syntac-
tic and semantic distinctions between linguistic and pictorial systems.
Critics of the symbolic paradigm, nevertheless, remain skeptical, tending
to resist efforts at a rapprochement. The symbolic model, they say, fails to cap-
ture a core intuition about pictorial representation, its “visuality.” Picture
perception is a matter of vision, whereas comprehending languages and other
symbol systems depends on cognition. Being stand ins or re-presentations,
174 Picture Perception
Figure 11.1
Figure 11.2
as a graph, the thickness of the line, its color, and background have no significance.
Interpreted as a picture, all these properties go to constitute the display’s
representational force. Notice something phenomenal, akin to a Gestalt
switch or aspect change, occurs when shifting between the two readings. And
experience of the line takes on another much different character when read
in the context of figure 11.2.
Simply making a graph more replete, however, will not turn it into a pic-
ture. Nor will assigning representational significance to the background do
the trick. For if these additional features of the graph stand for measures of
temperature, mass, and electrical charge, the display will still lack the “visu-
ality” associated with pictures. What more is needed?
Perhaps if a display is to function as a representational picture, the entire
surface of the display must have spatial significance. Each point, whether
marked or blank, is to be understood as mapping onto a spatial place.4 On this
account, the more replete graph just described will not count as a picture. The
thickness of the line now has significance, but it represents degrees of heat
not spatial locations. When, instead, the line is read as a picture of a moun-
tain range, the dimensions of the line, as well as the surface points above and
below, even the blank ones, take on spatial meaning. A list of numerical triplets
denoting spatial coordinates, though, is not a picture. It has the right kind of
significance for picturing, but lacks the appropriate analogue density and
repleteness to function pictorially.5
Requiring each picture point to have spatial significance does not mean
that spatial layout is the main or most important information pictures con-
vey. Pictures can and do represent much more, including non-spatial proper-
ties. My claim is only that the seeming visuality, or in Wollheim’s (1974) terms
seeing in, aspect of picture perception lies in the extent to which a picture sur-
face is given a spatial reading.6 It is in this way that seeing pictures is like or re-
sembles the everyday perception of real objects and scenes. Normally when
we look about the environment, the points comprising our visual field are
each given spatial location. Although, here too, assignment of spatial loca-
tion does not exhaust the information vision provides. And as with picture
perception, the placements in space may be imprecise, relative not absolute,
and even indeterminate.
Does this added requirement of a spatial reading collapse the distinction
between the surrogate and symbolic paradigms? I think not. Important dif-
ferences remain. In order to function as a representational picture, the sym-
bolic model now does require assigning or “projecting” spatial significance to
the display. But it does not require that a picture be a projection from any ob-
ject to the picture surface. This offers technical advantages in accounting for
the referential or denotive features of pictures—problems Goodman and many
philosophers find of considerable concern. For example, it enables the sym-
bolic model to sidestep difficulties in dealing with fictional representation
(e.g. unicorn pictures) and general representations (e.g. pictures accompany-
ing dictionary entries), cases where there are no actual objects from which
the pictures are projected.
The symbolic paradigm can also handle cases of misrepresentation in a
more natural manner than the surrogate model. Just as a sentence describing
Bill Gates may be inaccurate, so a picture of Gates may incorrectly character-
ize him. The picture, faulty as it may be, refers to and depicts Gates. It repre-
sents Gates; it does not represent some other person the picture might better
copy or resemble. Alternatively, multiple prints of a woodcut of Gates repre-
sent the man, not the other strikings. This in spite of the simple identity pro-
jection from one print to the next.
Multiple representations and misrepresentations, along with fictive and
general representations, make up a large part of the pictures we encounter.
And as Goodman has argued, satisfactory treatment of these cases is impor-
tant if we are to understand the role pictures play in informing the mind and
guiding behavior. This would seem to require attention to the referential fea-
tures of picturing, features anti-interpretivist approaches tend to ignore.
More significantly for present concerns, the surrogate paradigm faces vari-
ous perceptual problems the symbolic approach can more readily avoid. The
symbolic paradigm has no need to maintain that only one or a small circum-
scribed group of optically sanctioned projection schemes is required for pic-
torial representation. Nor need the model presume there is a singular, visually
correct way to depict space. Surrogatists, however, seem committed to the
idea that what distinguishes pictures from other forms of representation is
that pictures resemble, copy, or otherwise serve as visual stand ins for what
they represent. At the same time, surrogatists hold that only certain kinds of
projective displays, primarily those constructed according to the rules of lin-
ear perspective, render space mimetically. Only these renderings depict the
world as it is seen, non-conventionally describing what they re-present.
But then, Egyptian, Haitian, and Cubist renderings are problematic, as are
cartoons and caricatures. They, along with much else found in museums and
magazines, are in some sense not “genuine” pictorial representations, since
they do not look like, re-present, or provide the same cues or stimulus infor-
mation as the objects they depict. Such renderings are not full-fledged stand
ins. They, perhaps, are better understood along the lines of languages, maps,
and graphs as arbitrary, conventional representations. For they do require in-
terpretation to be understood, and in so doing likely cross the visual/cognitive
border. Accordingly, such pictures are not appropriate to take the place of real
objects or layouts in psychological experiments. Photographs and depictions
done in perspective pretty much make up the domain (Schwartz 2002).
Perspective pictures form a “natural kind” among representations and thereby
constitute a natural kind for vision science.
The symbolic paradigm is under no similar pressure to relegate Egyptian,
Haitian, Cubist, cartoon, and caricature representations to linguistic or quasi-
pictorial status. The syntactic and semantic properties of these representa-
tions serve to group them with other sorts of pictures, as well as distinguish
them from linguistic systems of representation. And the added requirement
of a spatial reading attempts to account for the particular “visuality” repre-
sentational pictures possess.
In contrast, surrogatist theorists’ grounds for distinguishing pictures from
non-pictorial representations and for placing Egyptian, Haitian, Cubist,
Figure 11.3 (diagram with a labeled station point)
Vision and Cognition in Picture Perception 179
Figure 11.4
(a) (b)
Figure 11.5
These days a converse station point phenomenon has also been receiving a
lot of attention. Although perspective renderings are by and large robust, the
perception of certain features of some perspective pictures does not remain constant
when a viewer moves laterally with respect to the picture surface. It has
long been known that the eye gaze of a depicted person will often appear to
follow a viewer as he or she moves left and right. The Mona Lisa is a classic ex-
ample of this phenomenon. And the famous World War I “I Want You” poster,
in which Uncle Sam’s finger appears to point directly at viewers no matter
where they are standing, is another prime example. Perceptual experiences of
real faces and fingers, however, do not alter in these ways in response to ob-
server movements. Real things do not follow you about. Instead, different
portions of the object or scene come into view. So again there is a discrepancy
between picture perception and ordinary perception that needs to be squared.
Surrogate theorists also must acknowledge and explain why some pictures
drawn in correct perspective, nevertheless, do not look right to viewers. For
example, representations of spheres toward the periphery of a scene are found
more acceptable, appear less distorted, if they are drawn as circles not ellipses.
Yet, real spheres, so located with respect to an observer, project elliptical not
round images on the retina (Pirenne 1970). In addition, people frequently fail
to notice anything amiss with pictures that violate the canons of linear per-
spective. Most viewers sense nothing strange or distorted, nor find it difficult
to understand engineering drawings done according to a scheme of isometric
projection, a system in which parallels perpendicular to the picture plane do
not converge. And it takes time and often instruction for many viewers to
appreciate the “distortions” in Cezanne’s or Van Gogh’s renderings of space.
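The geometric point about peripheral spheres can be checked numerically. What follows is only a minimal sketch, not Pirenne's own analysis. It assumes a pinhole eye and a flat projection plane one unit in front of it, and the function name, the parameters, and the default radius and distance are illustrative choices. The code traces the cone of sight lines tangent to a sphere and measures the outline that cone cuts on the plane.

```python
import math


def sphere_outline_aspect(theta_deg, radius=1.0, distance=10.0, samples=3600):
    """Return width/height of a sphere's outline on the flat plane z = 1.

    theta_deg is the angle between the line of sight (the z axis) and the
    direction from the eye to the sphere's center, measured in the x-z plane.
    All names and default values are illustrative assumptions.
    """
    theta = math.radians(theta_deg)
    alpha = math.asin(radius / distance)  # angular radius of the sphere
    xs, ys = [], []
    for i in range(samples):
        phi = 2 * math.pi * i / samples
        # A sight line tangent to the sphere: it makes angle alpha with the
        # direction to the sphere's center.
        x = (math.sin(theta) * math.cos(alpha)
             + math.sin(alpha) * math.cos(phi) * math.cos(theta))
        y = math.sin(alpha) * math.sin(phi)
        z = (math.cos(theta) * math.cos(alpha)
             - math.sin(alpha) * math.cos(phi) * math.sin(theta))
        # Central (linear perspective) projection onto the plane z = 1.
        xs.append(x / z)
        ys.append(y / z)
    return (max(xs) - min(xs)) / (max(ys) - min(ys))


if __name__ == "__main__":
    for angle in (0, 20, 40):
        print(angle, round(sphere_outline_aspect(angle), 3))
```

On these assumptions the outline is circular when the sphere lies on the line of sight and becomes wider than it is tall as the sphere moves toward the periphery. That is the elliptical projection viewers tend to find less acceptable than a drawn circle.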
These assorted phenomena of picture perception not only pose empirical
challenges to the surrogate paradigm, they go some way in undermining its
rationale. They make it harder to sustain a very strong claim that perception
of pictures does not tap resources beyond those employed in seeing the ordi-
nary physical environment. Some additional help must be recruited. Fur-
thermore, to remain within the spirit of the surrogate paradigm, this help
should be “visual” not “cognitive.”
While these station point and related phenomena do require explanations,
they do not pose the same or as pressing a problem to the symbolic approach.
If pictures are understood as allied with other forms of symbolization, their ro-
bustness in response to alterations in viewpoint might be expected. No one is
surprised that words maintain their significance when looked at from varying
angles. There is no difficulty, as long as the letters are not thereby distorted
and incorrectly perceived. And talk of a canonical, non-distorting, or correct
point to view words seems strained. Similarly, it may be held that a suitable
reading of a picture depends on seeing the picture itself, the representation,
correctly. If, as with letter recognition, this can be done with reasonable accu-
racy from different locations there is no reason why the relevant assignment
of spatial meaning to points on the picture surface should be compromised.
An account of the effects of motion might follow the same line. The expan-
sion rate of the retinal image associated with moving toward a printed word
does not alter how the word is understood. The resulting changes in the
stimuli or look of the word are accorded no representational significance. The
story might be the same with pictures. The fact that near and far depicted
items expand at the same rate does not importantly alter perception of the
picture surface. So it does not alter or distort our appreciation of the sizes and
distances represented.
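The contrast in expansion rates can be made concrete with a little arithmetic. The sketch below assumes a simple pinhole model in which projected size is object size divided by distance. The function names and the sample distances are illustrative, not drawn from any experiment.

```python
def image_size(object_size, distance):
    """Projected size on a plane one unit in front of the eye (pinhole model)."""
    return object_size / distance


def expansion_factor(distance, step):
    """Factor by which projected size grows when the viewer advances by `step`."""
    return image_size(1.0, distance - step) / image_size(1.0, distance)


def real_scene_factors(near, far, step):
    """Expansion of a near and a far object in an actual three-dimensional scene."""
    return expansion_factor(near, step), expansion_factor(far, step)


def picture_factors(picture_distance, step):
    """Expansion of 'near' and 'far' depicted items on one flat picture surface.

    Both items lie on the same surface, so they share one physical distance
    from the viewer and hence one expansion factor.
    """
    factor = expansion_factor(picture_distance, step)
    return factor, factor


if __name__ == "__main__":
    # Viewer advances 1 unit toward objects 2 and 10 units away,
    # or toward a picture hanging 2 units away.
    print(real_scene_factors(2.0, 10.0, 1.0))
    print(picture_factors(2.0, 1.0))
```

In the real scene the near object's image grows much faster than the far object's, whereas on the picture surface the two depicted items grow in lockstep. The symbolic approach treats that difference as no more significant than the changing retinal size of a printed word.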
This approach to station point problems gains support from and fits in
nicely with the more inclusive conception of pictorial representation the sym-
bolic paradigm promotes. In focusing on the robustness of perspective ren-
derings, there is a tendency in vision studies to overlook the fact that viewing
angle and distance also have little effect on perceiving pictures that violate
surrogatist criteria. The perception and understanding of cartoons and cari-
catures, as well as Egyptian, Haitian, and Cubist paintings are robust. Their
representational significance, like that of pictures done in linear perspective,
remains constant with changes in viewing angle and motion. Yet the very
idea of a station point may be as otiose with many types of non-mimetic pictures
as it is with words. Then again, I think the phenomenon of perspective robustness
would not itself appear so puzzling if less significance were accorded
the view and geometry associated with this singular point.
Admittedly, perception of the picture surface is not always constant. The
experienced shifts in direction of depicted eye gaze, fingers, and other objects
that accompany movement are quite noticeable. Comparable movement-produced
changes in perceived orientation do not usually accompany the appearance
of written text.8 But allowing for these sorts of perceptual differences
between pictures and language need not undercut the symbolic model.
Remember, the model does not claim that pictures are just like words. It insists
that the syntactic and semantic properties of these systems are quite different,
and there is no reason why such differences should not have perceptual reper-
Notes
* Thanks to Carl Zuckerman for detailed criticism. Versions of this paper were pre-
sented at a memorial conference for Nelson Goodman at Harvard University and at the
Center for Interdisciplinary Research, Bielefeld, Germany.
2. Variations of this criticism are leveled against the model not only by vision theorists,
but by art historians and philosophers who balk at what they take to be the symbolic
paradigm’s conventionalist implications.
3. Much is often made of evidence suggesting that infants (Hochberg and Brooks 1962)
and people from non-Western cultures (Deregowski 1989) can understand pictures
without training. My point here is not to challenge these empirical findings but to call
attention to the need to separate issues of learning and innateness from claims about
the form and conventionality of symbols.
4. For ease of exposition I talk of full spatial readings. It would be more accurate to say
that the visuality of pictorial representation is a function of the degree to which a dis-
play is so interpreted. Mappings typically are not to locations in the ambient space of
the physical picture, nor necessarily to any real world locations.
5. Similar considerations may help distinguish haptic pictures from braille linguistic
symbols.
6. The spatial reading requirement is meant only to capture various intuitions about
the visuality of representational pictures. It is surely not sufficient for distinguishing
“realistic” from “non-realistic” pictures (See Schwartz 1974).
7. I use the Ames chair to highlight problems about the “appropriate” rendering of
scene (b) and the station point assumptions on which it depends. I am not here ques-
tioning the “generic viewpoint” constraint, thought to resolve some cases of ambigu-
ity in ordinary perception (Hoffman, 1998).
8. It might, however, if a pointing finger like Uncle Sam’s were a letter or word in some
language.
Bibliography
Kennedy, J. (1993). Drawing and the Blind. New Haven: Yale University Press.
Rogers, S. (1995). “Perceiving Pictorial Space.” In W. Epstein and S. Rogers (eds.) Percep-
tion of Space and Motion. Boston: Academic Press, 119–63.
Schwartz, R. (1994). Vision: Variations on Some Berkeleian Themes. Oxford: Blackwell Publishers.
Wollheim, R. (1974). On Art and the Mind. Cambridge: Harvard University Press.
IV Missing the Real Point
Prescript 12
Studies of object perception, its origin, and its onset are the focus of much at-
tention in perception and developmental psychology. Impressive experi-
ments have been run and interesting evidence amassed that is thought to
speak to these concerns. Much of the work, however, makes little effort to be
precise or justify the “object” concept employed. Chapter 12 explores per-
plexities that arise from this laxity.
W. V. Quine has a widely cited, formally clear criterion for determining the
ontological commitment of discourse. His notion of an ontological object,
though, is not what perception theorists and developmental psychologists
mean by the term “object.” Quine’s criterion is purposefully unrestricted and
indiscriminate; it countenances everything that is or exists. This “object” notion
is far too inclusive for the perceptual issues being studied. For such purposes
it seems necessary to distinguish real objects from the ontologically
possible, but psychologically spurious ones. Absent this distinction there is
no well-defined subject matter for research to confront. Narrowing down the
domain of objects to the “real” has its difficulties. The options examined in
this essay face two serious problems: (1) they do not exclude as real objects
things that are thought to be spurious and (2) central claims said to be im-
portant and peculiar to object perception also hold in perceiving spurious
items. When these problems are confronted, recent criticisms of Piaget’s
views on object perception are no longer as telling as they are frequently
thought to be.
Those working in the field of object perception will undoubtedly feel I have
missed the “real” point. The editors of the volume who commissioned the
article responded so to the first draft of this paper. I rewrote it in an effort to
allay their fears and concerns. The quote below, from the editors’ introduc-
tion, is an indication of where matters were left.
This chapter generated by far the most debate among the author and editors. . . . Who
wants to be told their focus of study may not be coherent? Like Justice Stewart’s crite-
rion for recognizing obscenity, we all think we know what an object is when we see one.
Yet, as Schwartz suggests, the core notions are not at all simple or settled.
P. Kellman and T. Shipley
12 The Concept of an “Object” in Perception and Cognition
Object recognition . . . is often taken as the primary goal of a visual system. Surpris-
ingly, a significant obstacle in the path of understanding object recognition is that we
lack a precise definition of what constitutes an object. Without such a definition, how
can we possibly know where we are headed? Furthermore, any computational theory
of object recognition becomes impossible, for what is to be computed?1
Whitman Richards
In the theory of vision, object recognition has long been a topic of interest.
For today’s computational theorists it is a core area of study. As Richards indi-
cates, this computational work has brought with it considerable pressure to
find a precise definition of the notion of an “object.” The last number of years
has also witnessed an explosive growth of research in developmental psy-
chology concerning the perception and conception of objects. Beautiful
experiments have been conducted on ever younger infants attempting to de-
termine their earliest awareness or appreciation of objects. The current trend
has been to set the date closer and closer to birth. Many developmentalists
assume, in fact, that the only way to account for the phenomena they have
discovered is to assume the concept of an “object” is innate. But these devel-
opmental claims, like those of vision theorists, would seem to presuppose
some acceptable characterization of objecthood.
What is it then to be an object? In turn, what is it for an organism to perceive
or conceive of something to be one? Finding a precise, computationally sat-
isfactory specification of objecthood has proven to be an elusive task for vi-
sion theorists. For example, even if it is agreed that a car is an object, is its
radio also an object or is it only a part of the car? And what is the status of the
radio’s volume control knob or the left half of this knob? Do they each fall
under the concept “object?” Similarly, consider the car’s fender. Is it a part of
an object when attached to the car but an object in its own right when on
the warehouse shelf? And is the dent in the fender itself an object or merely a
feature of one? To raise these questions is enough to see the extent of the
difficulties, and this without pressing for answers to questions about the ob-
jecthood of non-solids (e.g. the gasoline in the tank), or two-dimensional
items (e.g. decal emblems and the car’s shadow), or conglomerations of non-
continuous bits of stuff (e.g. the collection of the car’s tires), or extended con-
tinuous surfaces (e.g. garage walls, the driveway on which the car now rests,
and roads traversed).2
An Answer
In light of the remarks above, it might be thought we lack any plausible ac-
count of what it is for something to be an object. Not so! There is a perfectly
reasonable characterization of objecthood that is as simple and clear as it is
unhelpful. From an ontological point of view, everything that is, is an object.
And as W. V. Quine (1953, 1960) has forcefully argued, all it may mean to
treat something as an object ontologically is to be willing to quantify over it in
discourse. “To be is to be the value of a bound variable” is his motto.
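Quine's motto can be given a rough first-order gloss. The block below is an illustrative regimentation only; the predicate letter F and the example predicate are assumptions introduced here, not notation Quine's text or this essay supplies.

```latex
% A discourse treats Fs as objects when it affirms a sentence that can be
% true only if some F is among the values of its bound variable:
\exists x \, F(x)

% Example: affirming "There are dachshunds" quantifies over dachshunds.
\exists x \, \mathrm{Dachshund}(x)
```

Nothing in the criterion restricts what may serve as a value of x, which is why it is as clear as it is unhelpful for picking out the objects of perception.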
Now there are those who rebel at the idea of granting existence and hence
object status to non-spatial items, assuming that everything that is or exists
must be material and observable. Herein lie the seeds of classic metaphysical
debates over the ontological status of abstract entities, such as numbers and
properties, or mental items, such as dreams or qualia. For our concerns these
controversies can best be ignored. Little will be lost by stipulating that the ob-
jects of perception and conception are all material spatial things.3
Still, this narrowed domain is not what theorists have in mind when they
talk of the visual segmentation of the world into objects or attribute to new-
borns an awareness of objects. For the domain of the spatial includes gerry-
mandered parts and sums as well as temporal segments of the material world.
Ontologically speaking, not only may a chair count as an object, but so can all
of its pieces, from the atomic to the large. In addition, the spatially separated
bits of carpet on which a chair stands, a chair plus a dachshund, or the com-
pound of the tip of a dachshund’s nose for two minutes and a chair for a
moment before, may all be treated as objects of quantification. But if any
assemblage of spatial material, or instantaneous temporal slices thereof, can
be understood to be an object, the computational task of vision remains
The Problem
Not surprisingly, count concepts are the ones needed for counting. All counting
presupposes a unit to be counted, and for this it is necessary to divide
reference. Mass terms do not provide units, since they themselves do not in-
dividuate among the parts of space-time they describe. Count terms, though,
may denote “spurious” objects as well as “real” ones. “Left half of a radio vol-
ume control knob,” “fender dent,” and terms denoting gerrymandered spatial
or temporal parts of chairs and dachshunds also divide reference, yet the items
they pick out do not meet intuitive criteria for being a “body.” Nevertheless,
there is no problem in principle counting the spurious as well as the real.6
Quine realizes that his own rough and ready characterization of “body” is
vague and its employment context sensitive. His characterization leaves con-
siderable room for differences in interpretation and application, and provides
no theoretical grounds for settling many of the problem cases earlier canvassed.
But for his own purposes, Quine sees little reason to formulate a very precise
definition. It is enough for him that cars, chairs, and dogs are representative
examples of our untutored, everyday notion of an object. They serve as un-
contested touchstones for what in the end is Quine’s challenge to the very idea
of there being any such thing as a referentially fixed, determinate ontology.
ambiguous and problematic. First, the term “body” has both mass and count
uses; the latter individuates, the former does not. It does not divide the world
into entities. Second, words like “car,” “chair,” and “dachshund” apply to bod-
ies as much as the word “body” itself does. Moreover, we may know or know
how to use these words without either having learned the word “body” or
having any other term available meant to acknowledge that a given item falls
into a category whose membership includes all and only bodies per se.
These considerations, in turn, raise questions about the role any explicit
representation of something’s being a “body” might play in object percep-
tion. After all, the count concept “body” is just one way to label or describe re-
gions of space-time, and the need for and specific function of it remains
unclear. Seldom, for example, does the task at hand require determining if or
how many bodies per se are in the offing. More usually the task at hand is to
perceive the kinds and properties of the bodies present. We need to know if
what is in front of us is edible, sit-able, lift-able, weight bearing, alive, prey,
predator, car, chair, or dachshund, not if it is a body per se.
Perhaps if the notion of a “body” actually plays a significant role in object
perception, it is because such a concept is implicitly, rather than explicitly, in-
volved in determining the kinds and properties of things in our environment.
Perceiving cars, predators, edibles, and sit-ables must somehow require or
reflect an appreciation of them as bodies. But how is this claim to be understood
empirically? What does it mean for the visual system to implicitly take
something as a body?
Various of the problems explored above repeat themselves. Cars, predators,
some edibles and some sit-ables are instances of bodies. It follows, then, that
in perceiving them as such the visual system “marks out the boundaries” of
whatever spatial regions are so described. This is all pretty tautologous. It says
little more than that the processes of the visual system enable perceivers to
discriminate those regions of the material world that contain cars, predators,
edibles, and sit-ables from those that do not. What does not seem to follow is
that in order for the visual system to make these discriminations it must first
determine, represent, or otherwise render the information that there is an in-
stance of the (count) property “body” present. And surely it does not follow
that responding differentially to cars, predators, edibles and sit-ables entails
an appreciation of the fact that the regions so discerned share membership in
the class of bounded items that exhibit continuity of displacement, continu-
ity of visual distortion, and continuity of discoloration.
To some it may seem that I have changed, misunderstood, or avoided the issue
of object perception as they conceive it. For them, the task of object perception
concerns the visual system’s encoding space-time regions (STRs) as
bodies and creating/assigning various descriptions or descriptors to them.
Now granted that the level of analysis may be different, I think the issues
raised above more or less carry over to this task specification as well. To ap-
preciate this, make the following terminological replacements:
“encodes STR x as a body” for “perceives that the STR x is a body”
“assigns to STR x the description # or a # descriptor” for “perceives that the
STR x is or is a #”.
Nothing here assumes that descriptions or descriptors are previously es-
tablished categories or that they are restricted to basic level shape categories
or that segmentation cues only operate for familiar shapes. Nor does it pre-
clude that the encodings and assignings are the work of relatively autonomous
perceptual mechanisms. Also note that “functional” property descriptors such
as weight-bearing, sit-able, and (in)edible and various “non-functional” prop-
erty descriptors such as size, shape, texture, color, and composition are ap-
plicable to both bodies and non-bodies (e.g. shadows, fender dents, and
driveways). And, as mentioned earlier, whatever analysis of the layout the vi-
sual system makes available does enable us to describe verbally and respond
appropriately to “non” and “spurious” objects along with the “real.”
That the visual system provides or affords information that guides the way
we navigate, act, and react to the environment goes without saying. The is-
sue, rather, is to understand better the sense in which the visual system must
encode regions as bodies in order to do so. If in the end all the claim amounts
to is that descriptors and descriptions are applied to regions of material space,
there is little to debate. There does, of course, remain much to find out and de-
bate about how the visual system actually accomplishes these tasks.
Developmental Considerations
sit-able, lift-able, and alive may not amount to much either. So it would not
be surprising if infants start out lumping all these different types of things
into one big diffuse category, that of “bodies” (Shipley & Shepperson 1990).
But again, there are problems understanding the exact content of such a
claim. For example, it is generally agreed that quite early on in life infants can
separate figure from ground and can distinguish spatially continuous bits of
matter from disconnected pieces of stuff. They also respond differently to
portions of their environment that move together and those that do not. And
their appreciation of the layout, such as it is, can guide their activities. Does
this mean, though, that they have and make use of a label (concept or repre-
sentation) that serves to connote or denote all the things that are bodies or
have the property “body?” If not, does it at least mean that their visual system
makes implicit use of such a representation in the course of providing the in-
fant information about the environment? Again, and for reasons similar to
those canvassed in the previous section, I am not convinced a positive answer
to either of these questions is logically or empirically required.
Now some theorists seem content to let the evidence speak for itself. They
are willing simply to call instances of figure/ground discrimination, Gestalt
grouping phenomena, or perceptual tracking activities, whether by infants or
adults, instances of or proof of object perception. And I have no qualms with
this practice, as long as the nature and extent of the claims are kept in mind.
For other theorists, perceptual discrimination, grouping, and tracking are
not taken to be sufficient for the attribution of object perception. The infant or
adult must in some sense be cognizant of or represent the space-time regions
isolated, grouped, or tracked as bodies. For them finding a satisfactory un-
derstanding or characterization of this richer demand remains an issue.
Bodies have both spatial and temporal dimensions. Cars, dachshunds, edibles,
and sit-ables not only occupy areas of space, they also have settled pasts and
futures rife with threats and promises. Discriminating among regions of space
that are so described, however, neither requires nor presupposes having
knowledge of such life histories and prospects. It is one thing to be able to per-
ceive correctly a wide variety of cars, dachshunds, edibles, and sit-ables under
ideal and less than ideal conditions. It is another to have perceptual con-
stancy, to appreciate the sameness of particular shapes, sizes, textures, and
colors when viewed under variable lighting and from different angles and
distances. It is another to be able to perceive these shapes, sizes, textures, and
colors when parts of the regions are occluded from view. And it is another still
to have a firm grip or conviction about how things will be and look at much
later dates.
Cars, dachshunds, edibles, sit-ables, indeed material substances in general,
change in shape, size, color, coherence, and consistency as they age and in-
teract with the world. Some of the changes can be reliably predicted, many
cannot, and the best scientifically sanctioned predictions will not always
turn out as expected. Given the vagaries of life histories, we are thus much
more likely to be accurate about how a currently observable temporal slice of
our environment might look under certain different viewing conditions than
about how future temporal slices will appear. Nonetheless, perhaps the most
basic, general, and reliable prediction we make about our environment is that
things do not change or go out of existence without cause. In addition, we as-
sume that neither mere spatial displacement nor our observing and failing to
observe the world are causes of physical change or annihilation.
Appreciation of persistence over time, independent of displacement and
observation, is at times referred to as “object permanence.” The term can be
somewhat misleading in that regions to which mass terms apply (e.g. red- or
water-containing places), “non” objects (e.g. shadows, fender dents, or driveways),
and “spurious” objects likewise do not change or go out of existence
without cause. And they, too, are presumed by us to carry on their lives inde-
pendent of our observation. So there is nothing special about the domain
of bodies or “real” objects on this score. Undeniably, temporal slices of real,
non, and spurious objects do go out of existence when their time is up and
they are no longer observed, but this is by definition, not by cause. Be that as
it may, an appreciation of such persistence over time is what many theorists
mean by perceiving or conceiving of the world as composed of objects.
Piaget argued that an infant’s conception of reality is much different from our
own. The newborn does not distinguish the world of experience from experi-
ence of the world. Conception of a world with enduring material objects
existing independently of oneself comes later and requires construction.
In addition, Piaget claimed that an infant’s concept of reality is initially
constructed in terms of his or her own actions and the immediate environ-
mental effects or reactions they precipitate.
To support these contentions Piaget devised a variety of ingenious experi-
ments intended to show that infants’ responses to spatial and temporal trans-
formations are not at all like those of older children and adults. Initially, Piaget
contends, babies do not expect objects to persist over time and place. Hence,
they do not, as we do, search persistently for hidden objects, nor do they have
the same expectations about what happens when things move behind and
emerge from occlusions. For newborns, out of sight is not only out of mind,
it is out of existence. Or put more accurately, newborns do not have a sub-
stantial conception of existence in and for which these distinctions make
good sense. Eventually infants do begin to search intentionally for missing
objects, but the searches are guided more by past patterns of interaction than
by the available evidence. Infants expect to find an object at the place they
found it before, rather than where they have just observed it being placed.
Piaget’s pioneering work and theories set the stage for much contemporary
discussion of the development of object perception and conception. A spate
of recent experiments claim to demonstrate that Piaget may have under-
estimated young babies’ prowess. Infants, it is maintained, do seem surprised
when things hidden behind a screen are not there when the obstructing
screen is removed. They seem to share with us some biases about the paths
moving things will take, and they have some expectations about the full con-
tours of simple shapes whose parts are occluded from view. In addition, their
searches are not guided solely by past success but may take into account new
conflicting evidence.
Now I have no desire to criticize these experiments, although issues of de-
sign and data have been raised (Haith and Benson 1998). My concern is how
best to understand their implications and import. Earlier I raised questions
about the proper interpretation and role a notion of a body per se might play
in conceptual activities or in the encoding activities of the visual system.
These questions and qualms, of course, do not preclude our having expecta-
tions. And I am willing to accept that the recent experimental evidence sug-
gests infants may have richer sets of expectations at an earlier age than many,
including Piaget, may have thought. Less clear is what these findings say
about the perception and conception of objects.
In The Construction of Reality in the Child, for example, Piaget allows that infants
may have crude expectations of constancy, occlusion completion, and per-
sistence that they use to accommodate their experiences. They may briefly
search for the hidden, be surprised when something disappears without
cause, and have wired-in visual pursuit schema for tracking movement. What
Piaget denies is that these expectations and perceptual strategies extend
much beyond the here and now. According to Piaget, infants do not have cog-
nitively useful representations of the structures and patterns of events in the
environment that enable them to place items in our ordinary “scheme of
things”—a stable world with its own independent past and future. But Piaget
argues, an appreciation of permanence and persistence restricted largely to
the here and now is not sufficient for the attribution of the object concept.
A more enduring spatial/temporal framework is required (see Sugarman
1987). Is Piaget correct, though, about the real nature of objecthood, and are
his more demanding criteria for attributing object perception and concep-
tion warranted?
Knowing how things can or will behave in and out of our presence makes
up a large part of what we each know about the world. Some of this knowl-
edge may be genetically inherited, some is readily attainable and common-
place (e.g. dropped objects tend to fall), most comes only with a good deal of
experience or learning, and much remains exclusively within the purview of
scientists or other experts (e.g. an accurate theoretical conception of space
and time). Moreover, there is no plausible bound on what there is to know
(what correct expectations we may have) about the possible or actual behav-
ior of the animate or inanimate world and the events that can or will take
place. It goes without saying that an infant’s conception and understanding
of the world is different from and impoverished compared to our own.
When, though, in the course of this development do infants first appreciate
a world of objects? At what age or stage does a child first perceive or conceive
of things as bodies? The analysis found in these pages suggests that this
question may not be clear enough to answer or answer univocally. For neither
tion may not be clear enough to answer or answer univocally. For neither
everyday practice nor current psychological theory seems in a position to
sanction a single privileged way to understand the claim. Furthermore, I am
not sure what is at stake in settling on one. Is there, for example, a substantive
difference between the claim that at a particular age infants do not perceive
and conceive of objects and the alternative claim that infants do have an ap-
preciation of objecthood, only their expectations and biases about the course
of events are quite different from our own? But surely if infants’ expectations
and biases (or lack thereof) are radically different from our own, they cannot
be said to have our concept of an object. But what specifically is “our” concept
of an object, and what role does it play in perception and conception?
Some of the controversy over object perception and conception is, I think,
the result of conflating issues of constancy, permanence, and expectations
with claims of identity. To determine that various space-time regions are
or are segments of the “constantly” same/identical car, dachshund, edible, or
sit-able is distinct from being able to appreciate the constancy of their sizes,
shapes, textures, and even material compositions. Nor does it amount to hav-
ing expectations about how such regions will look from other vantage points
or when occlusions are removed. Judgments of identity require a determina-
tion of where a particular car, dachshund, edible or sit-able starts up and
where it ends off. Identity involves linking segments, not merely describing
them. It is a judgment that space-time regions here and there, before and
now, go together in ways appropriate to sustain a claim that it is the very same
car, dachshund, edible, or sit-able with which we are dealing.
Identity judgments of this sort often do assume sameness of body or bodily
stuff in that in most contexts spatial-temporal regions are usually not said to
constitute segments of the very same car, dachshund, edible, or sit-able
unless their material makeup traces a more or less continuous path. But
obviously the reverse does not hold. A set of space-time regions may continue as
the same body or bodily stuff but lose its kind-identity. The same body is no
longer identifiable as a car when compressed into a lump of metal at the junk
yard. And even if this lump is then reconstituted as a car, the resulting vehicle
is unlikely to be considered the identical car as its pre-crushed embodiment.
Similarly, being shown the pre- and post-crushed cars, but unaware of their
history of transition, one may readily declare each of these space-time regions to
be a body, i.e. segregated, bounded matter, perhaps of a particular size, shape,
texture and composition. Yet one may have no idea that these different look-
ing manifestations are actually segments of a single continuous lump of metal.
And confuted expectations of persistence of size, shape, texture and color
may be a main reason for the mistake.
Thus judgments of identity run deep. Appreciation of change, whether
expected or unexpected, entails neither a claim of identity nor one of
non-identity. A cake cut into slices can for some purposes be considered the same
cake, although the transformation into segments may not have been observed
and the resulting spatial array unanticipated. In contrast, an identically look-
ing intact substitute confection is not the same cake, although there may be
no visually apparent differences to be discerned. Surprise at finding many
pieces of cake when a screen is lifted is compatible with judging the now
non-contiguous pieces to constitute a stage in the life of the one cake hidden.
The space-time regions before and after hiding count as segments of the same
cake, relative to one way the term “cake” may be wielded to divide reference.
The situation is similar with the concept “body.” Surprise at finding a dis-
tribution of matter not of the shape, color, or cohesion expected is compat-
ible with a judgment of the identity of the constituting materials. Likewise,
failure to notice any difference in bodily appearance between space-time re-
gions is compatible with a denial that the regions so observed are parts of one
and the same body. Body-identity is to be understood in terms of an evalua-
tion of identity over space and time with respect to some particular individu-
ating notion of a “body.”
Identity, then, is a more abstract notion than phenomenal or physical in-
distinguishability. And for the most part, we get along on vague if plausible
intuitions of sameness or difference of identity adjusted to context, salience,
and need. If pressed for something firmer or fixed, we usually soon find out
we have great difficulty coming up with criteria of identity in anything that
approaches necessary and sufficient conditions. For example, is the car at
hand, the same old car totally refurbished, or is what exists a new car, given
that all the original parts have been changed? And would or should it matter
if the (re)building took place in a day, not over a decade? Alternatively, might
it even make sense to think that the pre- and post-crushed cars previously
alluded to are really temporal slices of one “transformed” car, since all the
materials are the same? Centuries of philosophical puzzles about personal
identity, the identity of a ship completely rebuilt one plank at a time, the
metamorphosis of butterflies, and a mind-boggling array of cases of object
fission or fusion serve as further warning of the problems to be faced.
Another source of confusion in discussions of object perception and con-
ception is the failure to keep in mind a distinction between two different
kinds of identity judgments. The claim that a space/time region a and an-
other space/time region b belong to one and the same car, dachshund, edible,
sit-able, or body per se differs from the claim that a = b. The former says that
a and b are parts of the same whole, relative to some way of individuating
which wholes are to count. The latter says that the space/time region picked
out by a and that picked out by b is the very same one. Thus, consider the
much cited identity: the Evening Star = the Morning Star = Venus. This iden-
tity is not to be understood as a claim that certain evening spatial/temporal
slices of the heavens and certain morning spatial/temporal slices are parts of
the planet Venus. Instead, the claim is that the entity picked out by each of
the three expressions is the exact same totality. Our use of “star” and “planet”
to individuate and divide reference may play a role in fixing the reference of
these labels, but the identity itself is not relative to either concept. Numerical
identity is not identity with respect to an individuating category. In general,
x = y if, and only if, the objects referred to by the names, variables, or other
singular terms are identical.
Neither part/whole nor numerical identity, however, simply inheres in
Nature and the course of events. Quine, indeed, questions wherein the
empirical content of identity claims is to be found other than in our use of general
terms to divide reference and singular terms to name. For Quine, reification
or commitment to a world of objects amounts to no more. It also demands no
less, since what makes an entity the entity it is, is its identity. The linguistic fo-
cus of Quine’s account of ontology and ontological commitment lies in this
understanding of identity and reification.
Quine’s more radical and controversial ideas lie elsewhere. They have to do
with his views about language and about how language hooks up with the
world. Quine maintains that there are incompatible ways to assign meanings
and denotations to the terms of our language and no fact of the matter as to
which among a set of observationally adequate assignments is correct. There-
fore, ontology and attributions of ontology are themselves parochial, relative
to the scheme of translation adopted (Quine 1960, 1969). Now this is no place
to explicate, let alone defend, Quine’s theses of indeterminacy of translation,
the inscrutability of reference, and the implications they both have for his
doctrine of ontological relativity. Suffice it to say Quine’s ontological notion
of perceiving and conceiving of objects is more abstract and linguistically fo-
cused than that of most psychologists, including Piaget.
Conclusion
My goal in this paper has not been polemical. I have attempted to sort out a
number of theoretical issues central to discussions of the perception and con-
of an “object” makes its first appearance, founders on the fact that there is no
unique object concept sanctioned either by ordinary use or present scien-
tific theory.
3. Differential responses and manifestations of expectations met or frus-
trated are important tools for studying perception and conception. Nothing
said in this paper is meant to decry or challenge their usefulness. But they can
only take us so far. When it comes to richer, more abstract notions of “object,”
“identity,” and “reification,” whether those of Piaget, Quine, or those cham-
pioned by other theorists, they may not be able to take us far enough.
Acknowledgments
I wish to thank Sidney Morgenbesser, David Rosenthal, and the editors of this
volume for comments and helpful criticism.
Notes
2. Related problems are involved in attempts to specify formally the notion “object
part.” It is not possible in this paper to discuss explicitly the complications this issue
raises. For a non-technical account of the idea of “object part” in theories of vision, see
Hoffman (1998).
4. In fact, in one sense of the word “see,” at any given moment we can only see a tem-
poral slice of an object.
5. In some of the literature the supposed real objects are said to be “units” or “things”
or “wholes.” Whatever the difference in terminology, the problems to be considered
remain much the same.
6. The notions “object files” and “object file counters” have gained some prominence
in recent work in perception and cognition (see, for example, Scholl and Leslie 1999).
Space limitations prevent my giving this work the specific, in-depth treatment it
deserves. It is enough to note that this approach does not abnegate the need for schemes
to divide reference, rather it is to be understood as a proposal about what the scheme
and units may be in some cases. There is a vast and growing body of research purport-
ing to show that very young infants can count. Elsewhere, I have expressed reser-
vations about the claim that these studies demonstrate that the concept of number
is innate (Schwartz 1995). Accumulating evidence also seems to indicate that much of
the experimental data on infant “number” behavior may be explained in terms of in-
fants having an appreciation of amounts (e.g. area or volume) rather than an apprecia-
tion of cardinality (Mix et al., 2002). This is significant for our concerns in that such
judgments of sameness and difference of amounts may presuppose only a rudimentary
mastery of mass terms or concepts rather than a need for count categories to divide
reference (Schwartz 1999). In any case, it should be clear that full-fledged counting,
whether counting cars, chairs, and dachshunds, or simply bodies (i.e. objects), does
require count labels or concepts to provide units.
References
Mix, K., J. Huttenlocher, and S. Levine (2002). Quantitative Development in Infancy and
Early Childhood. Oxford: Oxford University Press.
Piaget, J. (1954). The Construction of Reality in the Child. New York: Ballantine Books.
Quine, W. V. (1953). From a Logical Point of View. Cambridge: Harvard University Press.
———. (1969). Ontological Relativity and Other Essays. New York: Columbia
University Press.
Scholl, B. and A. M. Leslie (1999). “Explaining the infant’s object concept: Beyond the
perception/cognition dichotomy.” In E. Lepore and Z. Pylyshyn (editors), What is Cog-
nitive Science? Oxford: Blackwell.
Schwartz, R. (1999). “Counts, amounts and quantities,” paper presented at the Society for
Research in Child Development, Albuquerque, New Mexico.
This essay was published with a preface and with commentaries by Alan
Gilchrist, Paul Whittle, and Richard Brown. They and I were members of a
project on perception organized and underwritten by the Center for Interdis-
ciplinary Research (ZiF) at the University of Bielefeld. The preface describes
the origins of the work and provides context for its particular focus and line
of argument. The underlying issues and debates come up over and over again,
in other articles and commentaries in the volume from which chapter 13
is taken. (See especially R. Mausfeld, “The Dual Coding of Colour” and re-
sponses in R. Mausfeld and D. Heyer (eds.) (2003) Colour Perception: Connect-
ing the Mind to the World. Oxford: Oxford University, 381–486.)
13 Avoiding Errors About Error
Preface
This study began in collaboration with Alan Gilchrist. Alan was working on a
book on lightness perception. He was developing a new model, one based, in
no small part, on a notion of “error.” Alan’s project, however, met resistance
from various visual scientists in the ZiF group. A major reason was their un-
willingness to countenance Alan’s appeal to error. Indeed, many maintained
there could be no such thing as error, at least not when it came to perceiving
color. On the face of it, this criticism was puzzling. No one doubted, for ex-
ample, that on occasion we mistakenly put on socks that do not match. More-
over, often those who recoiled at the notion of error were content to talk
about vision being “veridical.” In an effort to clarify issues, Alan and I decided
to write a joint paper on error. We would spell out a sound psychophysical
concept of error and untangle assorted confusions plaguing the group’s dis-
cussions. Our collaboration began with my proposing alternative ways to
specify a precise notion of error and Alan challenging the suitability of my
formulations. In the end, none of the options I offered met with Alan’s ap-
proval, and our joint enterprise was abandoned. I, then, pursued the topic
on my own.
My aim was neither to put forth nor defend any particular account of er-
ror. Instead, I wished to delineate the space of options available and char-
acterize, in a very general way, the advantages and difficulties facing each.
I came to believe, in fact, that there was room in the study of both achro-
matic and chromatic color for alternative accounts of error, each perhaps
useful in different contexts and for different tasks. I became convinced, how-
ever, that my proposed rapprochement was being thwarted by unexpressed
Introduction
That we make errors in perception seems all too obvious. Less obvious is that
we are often mistaken about the nature of perceptual error. A major reason
for this latter confusion is failure to pay proper attention to the fact that er-
ror is a relative matter—relative to an understanding or specification of
what it is to get things right. Independent of a standard of correctness,
claims of error are otiose. This chapter focuses on accounts of error in the per-
ception of achromatic colors, that is, the perception of white, black, and the
grays. These “colors” are said to lack hue; they constitute what is known as
the “gray-scale.”1
As investigation will show, the idea of perceptual error is often understood
in different and conflicting ways, and there is no reason to assume that one
account is privileged. Moreover, there is reason to treat various purported
cases of error not as error, but as discordances among competing ways of or-
ganizing and ordering our world. Until near the end of this chapter, such
qualms will be kept in abeyance. If along the way use of the term “error” jars
intuitions, consider it a technical term of service in psychophysics. This may
not be far from the position it is best to adopt, in any case.
Terminology
Not all light striking the surface of an object is reflected. Black surfaces reflect
very little, white surfaces almost all, and gray ones, varying amounts in be-
tween. The ratio of reflected light to the incident light is called “reflectance.”
“Lightness” and “lightness perception” are the terms used to talk about the
experiential correlates of surface reflectance, our experience of the gray scale.
Lightness constancy is the ability to perceive a surface as having the same lightness
when viewed under different conditions. (For technical details, see Wyszecki
and Stiles 1982, and the glossary of Gilchrist 1994.)
Anyone perusing an introductory psychology text will probably run into a
demonstration of a popular illusion in achromatic color perception. This
“simultaneous contrast illusion,” as it is called, is easy to duplicate on one’s
own. Take two small squares of paper of the exact same shade of gray, place
one on a black background and the other on a white background. Under these
conditions the squares do not look alike. The square on the black background
appears lighter than the one on the white surround. Thus our perception of
lightness is said to be in error. Lightness constancy fails. Two objects of physi-
cally identical material do not look the same; they do not match perceptually.
Matching tasks are the preferred method for studying errors in lightness con-
stancy. A standard paradigm is to have a subject select or adjust the reflectance
of a surface viewed in good light so as to match a given target surface. The tar-
get may be viewed in shadow, against a special background, or under some
other condition of experimental interest. The subject’s matching judgments
are then compared with the physical reflectance properties of the surfaces
(for details and variations on the paradigm, see Wyszecki and Stiles 1982).
To simplify discussion of the logic of these studies and the ideas of ‘error’
employed, it will be helpful to introduce some notational abbreviations:
(6) Cix = Ciy: if, and only if, the subject judges them to be the same or to
match perceptually.2
Reflectance errors
Errors of look
everyday conversation it is assumed that things do not look the way they re-
ally are when the lighting is very dim.
Although backed in this way by intuitions, the idea that achromatic colors
sometimes appear right, and at other times wrong, needs careful explication.
As with all notions of error, to make sense of L-error we must specify an ap-
propriate standard of correctness. With respect to what is an appearance to be
judged incorrect? How are we to understand the claim that something does or
does not appear with its appropriate lightness? What is it for an object to look
to have its true value, to be perceived as it should be? Until these questions are
answered, common intuitions about errors of look lack firm foundations.
One obvious way to settle such matters is to specify that the correct or
“right” look for a surface is the way it appears when viewed under some ideal
condition, CI. There is L-error, then, whenever a target surface looks different
from how it does in this special set-up. Cix looks right, if x = y and Cix = CIy.
Alternatively, Cix is an L-error, if x = y and Cix ≠ CIy.
This account of L-error can be used to support those intuitions and distinc-
tions not handled within the conceptual confines of R-error. Suppose, for
example, the assumed ideal viewing condition is the one specified in the
Munsell book, that is, CM = CI. Perception of a surface under this condition
defines its correct look.7 Previously, when x ≠ y and Cix = CMy, there was no
established basis for assigning blame for the R-error. Now, relative to the
choice of CM as standard, there is a justification for pinning the mistake on
one appearance rather than the other. Cix is an L-error.
Choosing a standard also gives purchase on cases where neither of the
samples is under the ideal condition. If x = y, Cix ≠ Cjy, and neither Ci nor Cj
is CI, it still is possible to pin the error on one of the perceptions. L-error lies
with the appearance that fails to match the perception of its target reflectance
under CI. If both Cix and Cjy fail to match the perception of the given re-
flectance value under CI, then there is an L-error in both, and the R-error is
due to each.
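The bookkeeping just described can be made concrete with a small sketch. It is offered only as an illustration under stated assumptions: appearances are modeled as numbers on a lightness scale, the subject's matching judgment is modeled by a fixed tolerance, and the names IDEAL, matches, toy_look, is_l_error, and blame, together with the numerical shifts, are hypothetical devices rather than anything drawn from the experimental literature.

```python
IDEAL = "CI"  # label for the stipulated ideal viewing condition


def matches(look_a: float, look_b: float, tolerance: float = 0.02) -> bool:
    """Stand-in for the subject's judgment that two appearances match (our '=')."""
    return abs(look_a - look_b) <= tolerance


def toy_look(condition: str, reflectance: float) -> float:
    """Hypothetical perceiver: each condition shifts the look by a fixed amount."""
    shift = {"CI": 0.0, "dim room": 0.01, "shadow": -0.15, "bright light": 0.10}
    return reflectance + shift[condition]


def is_l_error(look, condition: str, reflectance: float) -> bool:
    """An appearance is an L-error if it fails to match the look of the
    same reflectance under the ideal condition CI."""
    return not matches(look(condition, reflectance), look(IDEAL, reflectance))


def blame(look, cond_i: str, cond_j: str, reflectance: float) -> list:
    """For two targets of equal reflectance viewed under cond_i and cond_j,
    list the conditions whose appearance counts as an L-error."""
    return [c for c in (cond_i, cond_j) if is_l_error(look, c, reflectance)]
```

On this toy model, calling blame(toy_look, "shadow", "dim room", 0.5) pins the mismatch on the shadowed appearance alone, while blame(toy_look, "shadow", "bright light", 0.5) returns both conditions, the case in which the R-error is due to each.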
Intuitions about the true look of a particular target reflectance are given
similar treatment. A surface in shadow does not appear as it should, since its
appearance does not match the way it looks under CI. A target in very bright
light appears lighter than it really is, since it appears lighter than it would
in CI. Or what amounts to the same thing, it matches a target of higher re-
flectance viewed under CI. The accidental success in reflectance judgments
Some complications
very idea a target has a singular, true look. An interesting, little explored,
approach to these kinds of puzzles is to distinguish perceptual matching
(our =) from perceptual identity. Matching is non-transitive, while identity is
transitive. For CIx to be phenomenally identical with CIy, it is not enough that
they match each other. They must each match everything the other does
(Goodman 1951; Clark 1993).
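The contrast between matching and identity can be sketched in a few lines. This is a minimal illustration, assuming that looks can be represented as numbers and that two looks match when they differ by no more than one unit; the functions matches and identical, and the sample values, are hypothetical and are not taken from Goodman or Clark.

```python
def matches(a: int, b: int, jnd: int = 1) -> bool:
    """Two looks match when they differ by no more than one just-noticeable step."""
    return abs(a - b) <= jnd


def identical(a: int, b: int, samples: list, jnd: int = 1) -> bool:
    """Looks are identical only if they match each other and each matches
    exactly those samples the other matches."""
    return matches(a, b, jnd) and all(
        matches(a, s, jnd) == matches(b, s, jnd) for s in samples
    )


samples = [10, 11, 12]

# Matching is non-transitive: 10 matches 11, and 11 matches 12, yet 10 does not match 12.
chain_breaks = matches(10, 11) and matches(11, 12) and not matches(10, 12)

# 10 and 11 match, but only 11 matches 12, so they are not identical.
ten_is_eleven = identical(10, 11, samples)
```

Here chain_breaks is true and ten_is_eleven is false: the two looks match, yet a third sample tells them apart, which is what the stricter notion of identity is meant to capture.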
Adopting this analysis of look identity has some nice advantages. It enables
construction of an ordering of perceptual lightness based solely on judg-
ments of matching. Subjects are not required to provide explicit ordering
judgments. It would, however, complicate analysis of L-error to trace out the
implications of employing this account of appearance identity, and I will not
pursue the issue here (see Schwartz 1996). More pressing problems lie ahead.
Solipsism
Suppose x ≠ y, the difference is quite small, and S judges CIx = CIy. Once more
there is R-error with no L-error. Altering the definitions of “look identity” and
“correct look,” though, does not seem the only or easiest way to avert this
anomaly. Weakening the demand giving rise to R-error would seem a simpler
solution. Stipulate that R-error occurs only when the reflectance difference
exceeds a specified range. If the difference between the targets is less than
the threshold, there is no R-error, and hence no need to appeal to L-error to
explain the mistake.
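The weakened demand can be stated as a simple rule. The following is a minimal sketch under stated assumptions: reflectances are treated as plain numbers, the function name r_error and the threshold values are hypothetical, and the clause counting identical targets judged to differ as an R-error is an assumption added for completeness rather than something this passage states.

```python
def r_error(x, y, judged_match, threshold):
    """Return True if a matching judgment counts as an R-error.

    x, y         : reflectances of the two targets (hypothetical numeric scale)
    judged_match : True if the subject judged the targets to match
    threshold    : stipulated allowable reflectance difference
    """
    diff = abs(x - y)
    if diff > threshold:
        # Targets differ by more than the allowed range: a match judgment is an R-error.
        return judged_match
    if diff == 0:
        # Assumption: identical targets judged to differ also count as an R-error.
        return not judged_match
    # Sub-threshold difference: no R-error whichever way the subject judges.
    return False

# x and y differ slightly and S judges them to match.
print(r_error(0.40, 0.41, judged_match=True, threshold=0.0))   # True: strict standard
print(r_error(0.40, 0.41, judged_match=True, threshold=0.02))  # False: within the range
```

With the threshold set to zero the rule reduces to the strict standard, so the stipulation changes only how small differences are scored.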
This sort of response can only be taken so far. The problem is the current
definition of L-error is “solipsistic.” The notion of correct look is specified
solely with respect to judgments of how things look to an individual subject
under CI. And this individualistic conception of looking right leads to trouble.
For suppose two surfaces differ enough in reflectance so that under ideal con-
ditions they are easily discriminated by the average perceiver. If S cannot tell
such targets apart under CI, it seems clear S makes an error. But an error of
what kind? There is no problem attributing R-error; S fails to discriminate be-
tween reflectance differences beyond the allowable range. S’s R-error, never-
theless, cannot be attributed to L-error, since it occurs under CI.
Were the deficiencies with S’s judgments confined to small threshold-type
cases, the failure of L-error to underpin R-error might not be very bother-
some. Unfortunately, the issue runs deeper. For all intents and purposes,
S could be “lightness blind.” Under ideal conditions, S might perceive most
Abandoning looks?
Does the case of lightness deficiency mean that the notion of a correct look
should be abandoned, and with it the idea of L-type error? Right off, that
would seem an overly hasty conclusion. If claims are limited to normal
perceivers, it might still be possible to say something useful about errors of
appearance. The definition of L-error need not be changed; only its application
is restricted to persons with non-defective vision. The correct look of a surface
for a normal subject, S, is the look it gives S under CI. Correct-look and L-error
remain individualistic notions, that is, specified relative to a given perceiver.
Again, though, there is no need to assume it is possible to determine whether
the subjective experiences of different people are subjectively identical. The
restriction to normal perceivers merely serves to avoid the difficulties posed
by the lightness deficient. It is not meant to resolve, or to depend on, resolu-
tion of inverted spectrum type quandaries.
The initial limitation to the normal sighted does not preclude attributing
some errors of appearance to those with defective vision. Many of the judg-
ments of a lightness deficient perceiver, S, will be R-errors with respect to the
standards set for normal persons. S has matching perceptions where normal
perceivers experience the targets as non-matching. In these cases, it may seem
reasonable to make the minimal claim that S’s appearances cannot both be right.
Then again, it is not clear what is gained by extending the notion of L-error
to the lightness deficient. It is, after all, the pattern of R-errors that is relied on
to determine if a subject’s lightness perception is defective. And since the no-
tion of R-error is thoroughly general, it can be used to explore S’s achromatic
color constancy for any pair of reflectances, under any set of conditions. It
might seem possible, then, to say most everything worth saying of the defi-
cient perceiver’s visual competence without appeal to the more troublesome
idea of an L-error.
Such considerations, in fact, raise questions about the importance of having
a notion like L-error on hand. For what was just said about the lightness
deficient holds largely for normal perceivers. Before S can be certified to be a
normal perceiver, S’s R-errors must be examined. But once we have mapped
out S’s successes and failures in matching reflectances, is there really a need
for the concept of L-error in the study of perceptual constancy?
The prospect of not having to deal with L-error and the question, “How do
things look to subjects?,” will strike many as a welcome relief. By so doing,
psychophysics is nicely externalized, if not behavioralized. On one side, there
is lightness difference defined solely in terms of physical reflectance. On the
other side, there are people’s overt judgments of matching. Nowhere does
concern about the qualitative aspects of subjective experience obtrude.
The problem with abandoning “looks talk,” however, is that along with
gains in simplicity and methodological purity there are seeming losses. Re-
call the felt need to say something richer about S’s perceptual experience in
order to pin down the source of R-errors, or to determine whether the target
appears as it really is, or to indicate when S’s matching judgments were right
by accident. Setting standards for both CI and normal vision appeared to pro-
vide the wherewithal to account for many of these aspects of achromatic
colour constancy.
Nevertheless, talk of how things look to individual perceivers seems to in-
troduce an additional subjective element into the study of lightness. And the
Reliable methods
Ideal conditions
Until now the assumption that the Munsell condition, CM, may be an ideal
condition for perception has gone unexamined. Justification for this claim
needs further examination, for the notion of an ideal viewing condition is not
all that clear. The simplest explication might seem to be in terms of reliable
methods and R-error. A condition is ideal if it is optimally reliable for lightness
discrimination. There is no other condition under which normal perceivers
make fewer R-errors. So understood, optimal reliability depends on the cho-
sen allowable threshold for R-error. For example, two different conditions
may both satisfy the criterion when the range for error is x ± n, but when the
range is narrowed to x, only one may meet the specification. To deal with this
possibility, it might be preferable to define “optimal” in terms of yielding the
fewest R-errors within the narrowest appropriate reflectance range.
The situation, though, could be more complicated. One condition may
lead to fewer errors when x ± n is the allowed range, while resulting in more
error when the range is narrowed to x. At the same time, the error rate for both
methods could be considerably higher than it is with the wider range x ± n.
There are trade-offs between error reduction and precision. Thus there may
not be a unique characterization of optimality, and there may be more than a
single condition meeting any optimality standard adopted (see Helson 1943).
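The dependence of optimality on the allowed range can be seen in a small worked example. The sketch below uses invented data and hypothetical names (deviations, count_r_errors, best_conditions). It assumes each trial can be summarized by the difference between the reflectance judged to match and the target's actual reflectance, and that an R-error is scored whenever that difference falls outside the allowed range.

```python
# Hypothetical data: for each trial, the difference between the reflectance of the
# chip judged to match and the reflectance of the target. All values are invented.
deviations = {
    "condition_A": [0.01, 0.01, 0.01, 0.01, 0.01, 0.01],
    "condition_B": [0.0, 0.0, 0.0, 0.01, 0.05, 0.05],
}

def count_r_errors(trial_deviations, n):
    """Count trials whose deviation falls outside the allowed range of +/- n."""
    return sum(1 for d in trial_deviations if abs(d) > n)

def best_conditions(data, n):
    """Return every condition with the fewest R-errors at tolerance n, plus all counts."""
    counts = {name: count_r_errors(ds, n) for name, ds in data.items()}
    fewest = min(counts.values())
    return sorted(name for name, c in counts.items() if c == fewest), counts

# Wide range: condition_A makes no R-errors, condition_B makes two.
print(best_conditions(deviations, n=0.02))

# Narrow range: both conditions fare worse, and the ranking reverses.
print(best_conditions(deviations, n=0.0))
```

On these made-up figures, which condition counts as optimal depends entirely on the tolerance chosen. The function returns a list because more than one condition may tie for fewest errors.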
Justification of a particular viewing condition as ideal depends, therefore,
both on the criterion of optimality selected and on empirical findings about
how well the condition fares in competition with other viewing conditions.
And no condition may be unique in meeting these demands. Leaving final
resolution of these matters aside, is it reasonable to assume the Munsell con-
dition will qualify? One problem with this assumption is that lightness dis-
crimination is thought to be somewhat better when the illumination is
higher than it is under the Munsell condition. And this possible flaw with CM
raises an interesting question about the policy of identifying ideal conditions
with those optimal for lightness discrimination.
Discrimination might turn out to be best when the level of illumination is
well beyond that ordinarily encountered in daylight or in typical artificial
light. Or R-error could be least when targets are viewed in some specially pre-
pared non-white light or against a specially prepared background. Were this
the case, the optimal and hence ideal condition would be a condition seldom,
if ever, found in everyday perceptual tasks.
Standards
Is R-error error?11
tion of the world, or that the physicist’s analysis of achromatic color is onto-
logically privileged. The notion or notions of achromatic color needed for
physics may differ from those that best serve the needs of psychophysics or
optometry. These, in turn, may be different from those most suited to meet
the requirements of a carpet manufacturer, a lighting expert, or a museum re-
storer. Such concepts will flourish or fade on the basis of the work they do in
the areas they were designed to serve. The most the physicist, engineer, or de-
sign specialist can do is develop useful ways for categorizing the varied phe-
nomena of achromatic color that prove to be of intellectual or practical
interest. What else could or should be expected?
Claims that only one account of achromatic color can capture its essential
nature and specify what black, white, and gray really are, hinge largely on
preferred philosophical doctrines of essences and reality, rather than on sub-
stantive empirical considerations concerning perception. However, these
doctrines have no priority or pride of place in telling us what Is or is not Real.
Nor do they provide a higher or superior vantage point to rule on the number
or adequacy of alternative conceptions of our world. Indeed, if such philo-
sophical theories occupy any place, it will only be that of another kind of en-
quiry, epistemological or metaphysical, with its own constraints, interests,
and focus (see Schwartz 2000).
Conclusion
Acknowledgments
Notes
1. Although limited to the achromatic case, I believe the analysis has implications for
the study of chromatic colours as well.
2. The symbol = is used throughout not for numerical identity, but for sameness of
stimuli, conditions, or experiences, as understood in studies of lightness perception.
3. In Gilchrist et al. (1999) the notion of error is not general but is relative to Munsell
viewing conditions. I discuss this matter below.
5. The Munsell book, a widely used reference work, provides color samples organized
according to a well-specified system of color ordering. (For a discussion of the Munsell
system and others, see Wyszecki and Stiles 1982.)
6. This claim cannot be taken to mean targets in bright illumination match surfaces
with higher reflectance than themselves. Sometimes they will; sometimes they will
not. An x in bright illumination will match a y of lower reflectance, if y is in even
brighter illumination or if y is displayed against an appreciably darker background.
8. It is, at times, assumed that the chart of Munsell chips serves as a measuring device,
on analogy with the use of the standard meter stick to measure length. Exploring the
pros and cons of this analogy requires more attention than the matter can be given here.
10. As it is, the simple one-dimensional account of gray-scale experience is the re-
sult of a certain amount of abstraction. If gray-scale phenomena are treated more like
other colors, and in matching tests chromatic near-gray surfaces or colored lights
are used, the picture of what is involved in achromatic judgment and error might be
quite different.
11. The positions and arguments merely sketched in this section are developed more
fully in Schwartz (1996).
References
Byrne, A. and D. R. Hilbert (eds.) (1997). “The Philosophy of color.” In Readings on color.
Vol. 1. Cambridge, MA: MIT Press.
Gilchrist, A., C. Kossyfidis, F. Bonato, T. Agostini, J. Cataliotti, X. Li, et al. (1995). A new
theory of lightness perception (unpublished).
———. (1999). “An anchoring theory of lightness perception.” Psychological Review 106,
795–834.
Helson, H. (1943). “Some factors and implications of color constancy.” Journal of the
Optical Society of America 33, 555–567.
Munsell Color Company (1976). Munsell book of color. Baltimore: Munsell Color.
Schwartz, R. (2000). “Starting from scratch: Making worlds.” Erkenntnis 52, 151–159.
Wyszecki, G. and W. S. Stiles (1982). Color science: Concepts and methods, quantitative
data and formulae, 2nd edn. New York: Wiley.
Prescript: 14
In the spirit of pluralism, this essay argues the need for both phenomenalist
and physicalist accounts of color. It also questions the significance of claims
that one version is epistemologically primary, conceptually constitutive, or
ontologically more basic. Limiting the analysis to achromatic color, here as in
chapter 13, has advantages. It avoids complexities of the optics, physiology,
and psychology of chromatic color phenomena. A disadvantage is that in
avoiding these complexities, it can make problems concerning color seem
more tractable than they actually are. Similarly, dividing theories of (achro-
matic) color into two broad classes, phenomenalist and physicalist, allows for
simplification in presentation and argument, but it, too, can distort. Reliance
on this dichotomy is not meant to suggest that there is a sharp, well under-
stood line of demarcation separating these rough and ready umbrella cate-
gories. Nor is it meant to suggest that one is needed.
It is surprising to hear claims about the physical nature of color, as if there is
a single concept of real color studied in the natural sciences. The assumption
of a unique core conception of phenomenal color is more dubious. Color talk
serves different purposes in physics, chemistry, biology, and engineering. It
speaks to still other concerns in studies of art, color blindness, interior deco-
ration, the manufacture of paint, and psycho-physical color orderings. The
idea that all these uses can be reduced to or shown to supervene on one privi-
leged conception of color is more wishful thinking than justified supposition.
Alternative conceptions of color are legitimate, and objective theoretical and
empirical practices have grown up around their employment.
14 Pluralist Perspectives on Perceptual Error*
appear to be the same color. To the dismay of the experimenter she answers
“Yes and no. The two chips look the same, so, yes, they have the same color
appearance, but taking into account the differences in backgrounds, they
must be coated with paints of different reflectance.” Although Gwen’s seem-
ingly contradictory yes and no reply is readily understood, her answer is not
quite what the psychophysicist is looking for. The problem is Gwen’s percep-
tual experience is assumed to be in error, yet her perceptual judgments each
in their own way seem correct.
In order to force the issue the experimenter rephrases the instructions.
Gwen is asked if the chips perceptually match and is told to respond simply
yes or no. She says “Yes.” Now the psychophysicist feels better placed to accuse
Gwen of making a perceptual error. Gwen said the chips match, but they are
each covered with paints of non-identical reflectance. Notified of her error,
however, Gwen expresses surprise. “Sure, the paints have different reflectance,
I said that before. All I have claimed is that under the conditions of presenta-
tion (a) and (b) have the same appearance. So where is my error, where have I
gone wrong?”
At this stage, it is hard to tell who is more frustrated, subject or experi-
menter. In any case, the test is run one last time. Gwen is instructed to tell if
the chips present the same real color. To the psychophysicist’s chagrin Gwen
replies, “Well, yes and no. They really do appear the same, so they have the
same color appearance. Yet they must be covered with paints of different re-
flectance, so their physical colors are not really identical.”
It should be obvious the dialogue between the cagey subject and the caged
experimenter is going nowhere. As long as Gwen does not claim the chips
have the same reflectance or something similar, she has said nothing false
about the physical layout. She would, of course, have made a mistake if on the
basis of the matching appearances she claimed the chips are covered with
paint of identical reflectance. But likewise, if on the basis of her belief about
this difference in paint pigment, Gwen predicted (a) and (b) will look differ-
ent under the experimental setup, she would also have been mistaken. This
time her error would be with respect to appearance, not reflectance. What’s
more, errors of either sort can have disastrous consequences. The painting con-
tractor, who seeing that (a) and (b) match in appearance, uses them inter-
changeably, may lose his job. The camouflage novice, who knows the paints
are different, but fails to appreciate that they match in appearance under var-
ious conditions, may lose his life.
the other domain. But again, these interdomain errors can go in either direc-
tion and can be equally costly. Recall the cases of the painting contractor and
the camouflage novice.
This is not to deny there are important distinctions between phenomenal
versions and physical versions.4 Nor is it to claim that both sorts of schemes
are equally useful in every area. The differences, though, are largely pragmatic.
The firm conviction of many psychophysicists that any lack of accord be-
tween phenomenal and physical judgments means perception is faulty de-
pends, I think, on a conviction that the physical version, the version in terms
of reflectance, is fundamental. Thus the function of vision must be to deter-
mine reflectance, since it is this physical property, not any phenomenal coun-
terpart, that specifies the way the world really is.
Elaboration and defence of a claim for privileging physics is highly prob-
lematic. There is a vast, non-conclusive, literature on reduction, theoretical
identity, and supervenience attempting to elucidate a thesis of ontological
priority. Other attempts have sought to establish the superiority of physical-
ist accounts on more epistemological grounds, with little success or even
consensus on approach. I am doubtful these ontological or epistemological
rankings can come to much when not drawn along pragmatic lines. But it is
not necessary to defend this assumption here. Privileging physics is compat-
ible with recognizing the value and need of other schemes of organization.
The issue is doubly irrelevant to psychophysics. The phenomenal ordering
and organization of the gray-scale provides the very rationale for its percep-
tual study. The physical property of reflectance would be of no concern to
psychology were it not for the way our perceptual system responds to it. If
psychophysics is to be an interesting domain of inquiry, psychological phe-
nomena and their accompanying judgments of appearance must be given
their due.
Further impetus for privileging properties like reflectance is the result of
some confusions concerning the subjective/objective distinction. Science does
strive to be objective, and so seeks to distance itself from biases and influences
that can intrude upon the quest for knowledge. Claims of post-modernists
aside, science is more than making up stories that are subjectively persuasive.
Theories must face the evidence and account for it in ways that meet stan-
dards of consistency, relevance, explanatory cohesion, simplicity, etc. And
even this is not enough, if a competing theory does the job better. Such
methodological scruples, however, do not preclude studying the structure of
ready-made joints. These fears, though, are unnecessary. The Realist’s claim
that there can be a version that describes the world as it really is, independent
of the way it is conceived by any version must be dismissed. It lacks coherent
content or ends up postulating a Kantian realm of things-in-themselves hav-
ing no role to play.
That theories cannot be tested against an unconceptualized world, against “what’s
there,” does not mean our constructions are unconstrained, that all accounts
are equally good, or that predictions and proposals cannot be evaluated for
truth or correctness. The categories used to order the world must do work
to earn their keep. Versions that do not organize the environment in ways
that serve intellectual and practical needs, as well as meet relevant norms of
inquiry, have no lasting claim on our understanding or imagination. Moreover,
the thesis that versions are tested against other versions is clearly at odds
with the idea that theories are unchallengeable constructions of the mind.
Nor does the pluralism of alternative schemes of categorization and the
competing versions they are used to express preclude setting rigorous standards
and norms.
Notions of error are OK in their place. We do make mistakes within phenom-
enal and physical versions, and discordances between versions are real and
can be misleading. Sympathy for not treating them all as error goes only so far.
It does not extend to denying that versions can be inconsistent, can mislead,
can conflict with better versions, or may not pan out in a host of other ways.
Talk of multiple adequate versions, along with the denial of there being a
version that gets at Reality unfiltered by any human contribution, can be lib-
erating. Unfortunately, the liberty is often misinterpreted. Many, we have just
noted, incorrectly assume pluralism entails there is no way to get things
wrong, that all versions are thus immune to objective criticism. A small, albeit
growing, number of psychophysicists take the opposite extreme, and along
with it reject the measuring device model. They accept the idea that we only
come to terms with the world via our versions of it, but then assume that we
can never really be in touch with Reality. Since all we know are our models or
(re)presentations, we are perforce always trafficking in illusions.
Labelling all our versions, both physical and phenomenal, as illusions may
be a nice trope, yet it does not have much literal punch. We can and do make
distinctions among versions. There is a difference between seeing a chair that
is actually there to be sat on, and hallucinating a pink elephant that is not
there to be fed. If I assert there is a chair straight ahead, I have said something
true that will serve well to guide cognition and behavior. If I claim there is a
pink elephant a few paces away, I have uttered a false sentence, and I am de-
luded. If I continue to see pink elephants I run the risk of being hospitalized.
The cause of my hospitalization is an illusion, the hospital is not.
The obviousness of these last remarks makes the thesis of pervasive illusion
itself seem like an illusion. Why is it, then, that vision theorists succumb
to it? I think the answer is that even proponents of this radical illusionist-
Idealist model harbor unrelenting Realist convictions. They correctly under-
stand we have no access to a world as it is, stripped or independent of the
perceptions and conceptions employed to order and organize it. Neverthe-
less, they cannot give up the idea that there is such a world. But then episte-
mological crisis is inevitable. We have no way of making contact with this
realm of things-in-themselves; all we have to go on are our (re)presentations.
Given that we only perceive such (re)presentational surrogates, we cannot
truly be said to see the Real world. All experiences of the environment are thus
illusions. In turn, for all we know or can ever know our theories may be
wholly at odds with the-way-the-world-is.
The solution to this skeptical dilemma is to let go of the Realist intuitions
causing the trouble. There is no escaping our perceptions and conceptions so
as to confront the ready-made world head on, as it really is. For there is no
clear sense what this could be. Any attempt to articulate the nature of such a
confrontation, to fill in the details, will of necessity result in just another ver-
sion, perhaps one from a purportedly more lofty metaphysical perspective,
but a version nonetheless. This inability to step outside ourselves does not
mean our versions are empirically untestable or myths. There are important
distinctions between versions that are illusions and those that are not,
between versions that are fact and those that by intent or inadvertence are fiction,
between versions that are correct and those that are in error, and between
versions that work and those that stand in the way of advancing understanding.
The account of psychophysics being recommended has strong affinities to
pluralist, Irrealist ideas Nelson Goodman has long defended. The position
has much in common, too, with classical Pragmatism. And adopting it does
require sacrificing cherished doctrines. It entails forgoing a quest for cer-
tainty, freeing up views about truth, and tolerating a pluralism of versions.
Still the losses are tolerable, and I, at least, do not see a better option. In psy-
chophysics the main alternatives seem to be either to adopt a Realist measur-
ing device metaphor or an Idealist world as illusion metaphor. I believe
Notes
* This paper was written while a Fellow at the Zentrum für Interdisziplinäre Forschung
at the University of Bielefeld. I wish to thank the Center for its support and the mem-
bers of the research group for their input. Several members should recognize sketches
of their own position being examined.
1. The issues to be considered are closely related to current heated debates in the philo-
sophical literature over the nature and perception of chromatic color. (Hardin 1993,
Hilbert 1987, Thompson 1995.) Space limitations preclude my spelling out these
affinities.
4. Talk here and before of a difference between the phenomenal and the physical is not
meant to suggest an ontological or metaphysical divide. Phenomenal versions and
physical versions offer alternative frameworks for description and prediction. (a) and
(b) may be phenomenally the same and physically different, and such cross categoriza-
tions are all that concerns me.
5. I have developed these arguments further in Schwartz 1986 and have explored some
of the ramifications for a theory of spatial perceptions in Schwartz 1994.
References
Hilbert, D. 1987. Color and Color Perception. Stanford: Stanford University Press.
Schwartz, R. 1986. “I’m Going to Make You a Star.” Midwest Studies in Philosophy 11.
———. 1994. Vision: Variations on Some Berkeleian Themes. Oxford: Blackwell Publishers.
———. 2004. “Avoiding Errors About Error.” In Colour Perception: From Light to Object,
R. Mausfeld and D. Heyer (eds). Oxford: Oxford University Press.
Parts of this essay started life as comments on Michael Thau’s “What is Dis-
junctivism?” at the 35th Oberlin Colloquium in Philosophy. Both his paper,
only a small part of which was presented, and a version of my comments were
published in Philosophical Studies (120, 1–3, 2004, pp. 193–253, 255–263).
Thau’s paper has two primary aims: (i) a critique of Austin’s attack on Ayer
in Sense and Sensibilia and (ii) a rejection of McDowellian disjunctivism in
favor of Thau’s own solution to the “objects of perception” problem. In chap-
ter 15, I largely leave aside Thau’s paper and focus instead on the framework
of the disjunctivism issue itself. Although whole paragraphs are lifted from
my published paper, this new essay explores issues not touched on and de-
velops lines of thought only indicated.
Disjunctive perplexities about the objects of perception stand in nice
contrast to the issue discussed in chapter 12 on the perception of objects. The latter
continues to provoke discovery of interesting empirical phenomena even
when theoretical claims do not accord well with the notion of an “object”
employed. Qualms with the “objects of perception” debate are different. The
positions defended are constrained minimally, if at all, by studies of vision.
They are instead responsive to the epistemic, linguistic, and metaphysical in-
tuitions of each participant. Everyone gets to champion his or her favored
solution without being much bound by common sense beliefs, empirical
evidence, or substantive theoretical demands. Austin, I think, has it right in
Sense and Sensibilia. Scrap the philosophical staging that gives rise to the issue.
For specific needs and local purposes “object of perception” talk can be clear
and useful, but nothing especially significant follows from these practices.
The main goal of chapter 15 is to support Austin’s effort to deconstruct the
problematic. Themes and arguments encountered earlier in this volume re-
verberate throughout the essay.
15 An Austinian Look at the “Objects of Perception”*
Those who . . . revolt against a dichotomy to which they have been addicted, com-
monly go over to maintain that only one of the alleged pair of opposites really exists at
all. . . . [and then preach] with the fervour of a proselyte a doctrine of “one world.” Yet
what has ever been gained by this favourite philosophical pastime of counting worlds?
And why does the answer always turn out to be one or two, or some similar small, well-
rounded number? Why, if there are nineteen of any thing, is it not philosophical?1
J. L. Austin
I first read Austin’s Sense and Sensibilia at a time when it was pretty much a
required text for anyone wishing to be philosophically informed.2 Like other
readers, I felt that various of Austin’s verbal barbs were not only a bit
condescending but also seemed to miss the mark of their intended target.
I was frustrated, too, by Austin’s brief, end of the book treatment of Berkeley,
as told to him by Warnock. I thought that in focusing on epistemological
issues, Austin, like other critics, failed to appreciate the significant
contribution Berkeley’s ideas made to the scientific study of vision. Still, I found the
book an exhilarating read.
What I liked most about Sense and Sensibilia is that it provided a rationale
for ignoring certain philosophical problems then in vogue while maintaining
a reasonably clear conscience. Austin showed, to my satisfaction at least, why
these metaphysical quandaries were not issues one needed to address or take
a stand on. As I read and continue to read Austin, he is not so much trying
to refute the Argument from Illusion and its kin as, to put the matter in
modern terms, trying to deconstruct the whole problematic. Reminiscent
of James and Dewey before him, Austin thinks that the epistemological
and ontological assumptions that breathe life into these problems of percep-
tion rest on untenable dualisms. He says at the start that “It is essential here,
language. Austin is well aware that sound scientific discourse frequently moves
beyond and may justifiably contravene everyday talk. That the physicist’s use
of the term “mass” is not that of the masses is no cause for concern. Austin
does tend to put stock in the pronouncements of the O.E.D., but he thinks
there are reasons to do so. Austin believes ordinary language evolves to meet
actual needs, and the subtle distinctions found in the entries of the O.E.D.
can reflect the culture’s efforts to cope with these demands. For instance, the
different dictionary entries for “unintentional,” “accidental,” and “inadver-
tent” are significant, because they capture distinctions that are important in
a number of social and legal contexts. Austin is convinced that all too often
philosophical jargon, unlike scientific, legal, and serious everyday talk, is not
substantively constrained by real needs. It earns its keep taking in the wash of
other equally dubious philosophical vocabulary. The notion of an object of
perception is an illustrative example.
Perplexities over objects of perception have been said to start early with
Plato’s claim in the Theaetetus (160, b) that “whenever I come to be perceiving,
I necessarily come to be perceiving something; because it’s impossible to
come to be perceiving, but not perceive anything.” Once this principle is
adopted, however, questions about the status of misperceptions immediately
arise. In particular, what is it that is seen when a person hallucinates? One
response to the question is to deny its presumption. Hallucinations are not
instances of “real seeing.”5 This move has some support from intuition and
ordinary language. Unfortunately, intuition and ordinary language also en-
dorse conflicting stances. For many, the idea that hallucinations are instances
of seeing (or that seeing is constitutive of the concept of visual hallucination)
is so compelling that abandoning Plato’s principle is hardly worth consider-
ing. After all, hallucinations can be phenomenally indistinguishable from
illusions and veridical visual experiences.
I, like Austin, am not totally clear what the problem of the objects of per-
ception comes to and much less clear what the constraints are for resolving it.
I can imagine pressure or support for a particular answer flowing from work
in visual theory. For example, the claim that perception is a two-step process
in which experienced sensations trigger perceptions has been taken by many
to postulate something akin to uninterpreted objects of perception. J. J. Gib-
son, to name one prominent twentieth-century vision theorist, so under-
stood the model, and his theory of direct perception is meant to challenge it.
(See chapters 1 and 8.) According to Gibson, perception is non-inferential; it
does not depend on interpreting prior sensations. Thus Gibson claims that
his theory of perception supports perceptual Realism. No veil of sensation
stands between the world and perception of the world.6 David Marr’s notion
of a “primal sketch” and his levels of representation model have been thought
to raise comparable issues within computational theories of perception. Con-
cerns such as Gibson’s and Marr’s about the workings of the visual system,
however, seldom play a significant role in the philosophical objects of per-
ception controversies.
I also understand that problems in semantic theory may provide con-
straints on an answer to certain questions about the objects of perception. A
main goal of semantic theory is to assign logical forms to discourse so as to
capture accepted patterns of inference. For this purpose analyzing “see” or
“perceive” as two-place predicates may be best. Semantic questions of logical
form, though, do not seem to be at the core of objects of perception debates,
and it is good that they are not. The issue of logical form, in and of itself, is sev-
eral steps removed from substantive conclusions about the workings of the
world. That “height,” for instance, is treated as a two-place relation between
a person and a number does not provoke metaphysical worries about the
interaction of physical objects with abstract ones. Similarly, early discussions
of the logical form of statements of propositional attitude (such as those of
W. V. Quine and I. Scheffler) make it clear that treating attitudes as two-place
relations between subjects and sentences does not entail anything about a
subject’s possession or use of language-like entities. The same holds for sen-
tences about seeing. That it is logically perspicuous to analyze “Samantha is
aware of/has a thus and so experience” as a relational statement does not en-
tail there is some “thus and so” item that Samantha has on hand to inspect,
experience, or employ in visual processing.
Finally, the objects of perception problem cannot merely be to show that all
visual phenomena may be lumped into a single category rather than a dis-
junction of categories. The aim must be to show what can be better accom-
plished dividing them one way rather than another. For this task, ordinary
language and intuitions of principles do not seem to provide a firm guide. And
even if they did, why should these considerations have much binding force?7
A brief look at the standard tripartite division of visual phenomena into hal-
lucinations, illusions, and veridical perceptions may help indicate why.
Hallucination, it is often said, is distinguished from ordinary mispercep-
tions in that there is no physical object that is being seen. But is this so? In
discussing delusions, Austin mentions that there are two accounts of mirages.
One holds they are influenced by atmospheric refraction (perhaps due to the
presence of mist); the other maintains that this is not a factor.8 Are mirages,
then, hallucinations on the latter account and not hallucinations if refrac-
tion enters into the story? Might the mist itself be the object of perception in
spite of our being totally unaware of its presence? In any case, in hallucina-
tions there may very well be something physical that is seen even in cases
where atmospheric conditions do not intrude, namely, the desert environ-
ment that sets the backdrop for the imagined oasis. So are there two ontolog-
ically distinct objects in such hallucinatory experience, the immaterial oasis
and the material desert landscape?
Puzzles arise as well with accounts of perceptual “filling-in.”9 Apparent mo-
tion phenomena are typically classified as illusions. If in a dark room a square
figure and a circular figure are shown one after the other in time, subjects see
an object move across the spatial gap between them, transforming in shape
along the way. Of course, these apparent motion experiences have external
causes. Less clear is what, if anything, is being misperceived. Is it the square,
the circle, both, or the unoccupied dark space lying between them? If the
last, would that make apparent motion a hallucination? Alternatively, might
it be held that nothing, in fact, is being misperceived?10 (See chapter 7 on
visual supplementation.)
Filling in across the blind spot raises related questions. Light striking the
retina at the blind spot has no appreciable perceptual effect. The filled-in ex-
perience is the same independent of the source of the light that strikes this
part of the retina. The light could be coming from an object corresponding to
the phenomenal supplementation, or from a non-corresponding form, or
from a blank surface. Indeed, the experience will be the same if no light hap-
pens to strike the retina at the blind spot. So are filling-in experiences veridi-
cal in some cases, illusions in others, and hallucinatory in others? Is there a
need to postulate immaterial objects to explain the phenomena? And do any
of these considerations tell for or against Plato’s principle?
The notion of veridical perception is equally fuzzy. Most everyone agrees
that there is an important distinction between seeing things correctly and
seeing them incorrectly. Also most everyone, including Austin, would grant
that we make rough and ready distinctions between getting things right and
getting them wrong. Austin, however, doubts there is a determinate full-
bodied notion of veridical perception underlying these judgments, and I
think there is good reason for his skepticism. It is no easy task to specify how
and to what extent ordinary perception truly grasps the facts or corresponds
to them in content.
In discussing veridicality, we usually have in mind feats of recognition or
categorization. Is the item in front of us a tomato, that over there a twig, not
a snake, and the stick in water straight, not bent? Such tasks, though, consti-
tute only a small part of perceptual activity. Suppose, instead, attention turns
to more metric spatial properties of the layout. People are not all that good at
judging size, shape, and distance in an absolute sense. When the comparison
items are spatially much apart, relative assessments, too, tend to be inaccu-
rate. Does this mean everyday visual experience is rife with misperception
and illusion? Claims of veridicality depend as well on how correctness is mea-
sured. My cognitive estimate of a given distance may be faulty, although I can
throw a ball right to the spot. And even when spatial judgments are on target,
how much is due to perception being veridical and how much to mental cor-
rection? If asked, I will judge that the stick in water is straight. Similarly, if
asked, I will tell you that the person walking away from me remains the same
size (approximately six feet tall) although his appearance grows smaller and
smaller. Yet were the person approaching, not retreating, I am likely to refrain
from making any size judgment until he comes quite close.
Color perception is another area where the issue of veridicality is not free of
difficulty. As discussed in chapters 12 and 13, there are problems in the rela-
tively simple case of achromatic colors (the grays from black to white). The
idea that an experience of a given shade of gray paint presents the gray as it
physically is or as it should be seen is of questionable sense. You always need
a background and there are no neutral backgrounds. Standard lighting con-
ditions, or those used to calibrate the Munsell color charts, are not the best or
ideal ones for discrimination. Also comparative judgments made in certain
setups said to engender illusory color experience can actually aid, not hinder,
discrimination. Yes, in particular contexts, for specific purposes, a rough and
ready labeling of perceptual experiences into veridical, illusory, and halluci-
natory may be of service. It is quite another story to assume that such dis-
course demonstrates that a unique, theoretically useful division of visual
states into veridical perceptions, illusions, and hallucinations is needed or
can be justified in terms of the processes, mechanisms, or functions of vision.
If neither empirical and conceptual considerations of vision theory nor
those of semantic theory substantially constrain solutions to the objects of
perception puzzle, what can? An obvious answer is that constraints can flow
from the demands of epistemology.11 Here again, Austin is skeptical. He believes
it is largely the adoption of habitual, albeit ill-advised, dualisms that
keeps the issue afloat. The analysis of the notion of “perceptual inference” offered
in section 2 of this volume and in VVBT leads me to side with Austin.
Philosophical solutions to the objects of perception puzzle all too often as-
sume something along the lines of a hard and fast given/taken dichotomy.
Most visual experience occurs with some stimulus to the system. I have ar-
gued, in the works cited above, however, that there is no single state or event
in the causal chain that can be deemed the fixed dividing line between input
and output, premise and conclusion, or vision and cognition. (See also chap-
ters 11 and 12.) Trivially, no input on its own is wholly responsible for the
character of visual experience. Visual experience results from contributions
of both the environment and the perceiver, and these contributions are inex-
tricably joined. What the environment gives can have no effect on percep-
tion, unless it is selected and taken by the visual system. This holds no matter
how far out into the environment or how far upstream past the retina one
searches in the causal chain. If inputs cannot be accommodated and put to
work, there is nothing useful on offer. The given of necessity is response de-
pendent; it is determined in the taking.12
There are, no doubt, differences worth noting in the degree to which the
properties of an input constrain the specifics of an output. For instance, in
cases like the mirage oasis the environment minimally shapes the qualities of
the visual experience. Were there an actual physical oasis in full view, the in-
put would have a much greater say in the properties of the output. Neverthe-
less, no place along the causal chain is inherently the point of origin of
perception, and no single output is in principle its final stage. There are can-
didates in-between and beyond, and with further elaboration and changes
in the story the intuitions and categorizations will shift. Of course, where
and when there is a particular theoretical need for a distinction, science
undoubtedly will find or stipulate one.
Does this mean that anything or any stage in the causal chain may be said
to be an object of perception? Not without stretching the bounds of everyday
intuitions and ordinary language practices. But what if we aim higher or dig
deeper in the hope of uncovering what the object of perception really is?
Austin, I believe, would suggest that it is better to abandon the concept object
of perception than to search for an answer. Starting down that line only leads
Notes
* I wish to thank the members of the UWM philosophy faculty workshop for com-
ments and spirited resistance.
1. “Intelligent Behavior: A Critical Review of The Concept of Mind” in Ryle, O. Wood and
G. Pitcher (eds.), New York: Anchor Books, 1970.
4. Although once the issue is probed much below the surface, problems do arise spelling
out the sense and implications of this everyday discourse. (See chapter 13.)
6. For references and earlier discussion of this issue see N. R. Hanson’s chapter “Observation”
in Patterns of Discovery, Cambridge: Cambridge University Press, 1964.
7. I am not denying that linguistic practices and conceptual intuitions can be brought
to bear. I am questioning the significance and force of their verdicts in this case.
8. I deviate somewhat from Austin’s actual mirage discussion. He does not discuss mist
as a factor.
9. I leave aside disputes over the best way to characterize the notion “filling-in.”
10. Note, apparent motion type processes underlie experiences of movement in films,
but in most contexts it is not common to talk of these experiences as misperceptions.
12. The issue raised is analogous to those long-discussed in visual theory concerning
the proper understanding of the notion of “stimulus.”
14. Also see J. Koenderink, “Multiple Visual Worlds,” Perception 30, 2001, 1–7.
15. I do not deny that there are significant problems concerning consciousness that
can be and need to be addressed, along with “what it’s like” worries that need to be
defused.
16. James and Dewey do offer an alternative perspective—a pluralism of useful ver-
sions, none privileged and none representing Reality ready-made. More recently,
N. Goodman advocates such a position in Ways of Worldmaking, Indianapolis: Hackett
Publishing, 1978. I have developed ideas along this line in “I’m Going to Make You a
Star,” Midwest Studies in Philosophy 11, 1986, 427–39 and “Starting from Scratch: Mak-
ing Worlds,” Erkenntnis 52, 2000, 151–159.
Index
Alberti’s Window, 160, 178–179, 183
Armstrong, D. M., 18–19
Art
  Alberti’s Window and, 160, 178–179, 183
  caricatures, 161, 163, 167, 169, 177–178
  Cubists, 151, 161, 163, 167, 169, 177, 179, 182
  distortion and, 162
  occlusion and, 109–110
  painting, 151, 160–162, 169–170, 178–179, 183
  photography, 162, 177
  picture perception and, 3–4 (see also Picture perception)
  projectivists and, 159–170
  realism and, 150–154
  resemblance and, 148–151
  station point and, 160–161, 181
  symbolic paradigm and, 164–168, 173–185
Atomic places, 40
Auditory stimuli
  heterogeneity and, 56–59
  man born blind (MBB) test and, 82
  simultaneous sounds and, 57
Austin, J. L., 5
  Gibson and, 246–247
  hallucination and, 247–248
  object perception and, 243–253
  real notion and, 245
  Sense and Sensibilia and, 241, 243–245
  veridical vision and, 248–249
Ayer, 241

Benson, J., 200
Berkeley, Bishop, 1, 11
  color and, 15–16
  convergence and, 24
  critics on, 13–14
  dimensionality and, 18–19
  distance evaluation and, 14–15
  An Essay Towards a New Theory of Vision, 2, 13–16, 19, 49, 67, 71–87
  heterogeneity and, 55–67
  immediacy and, 14–17
  inference and, 2–3, 103
  inseparability thesis and, 65–66
  intuition and, 18
  inverted image and, 18
  Kantian approach and, 22–23
  Kaufman model and, 24–25
  man born blind (MBB) test and, 71–87
  minima visibilia and, 40–49
  minimum sensibile and, 35, 37–50
  misunderstanding of, 13
  Molyneux problem and, 55, 62, 69, 71
  one-point argument and, 18–19
  psychic approach and, 15–16
  size perception and, 29–33
  smell and, 17–19
Man born blind (MBB) test (cont.)
  Mach and, 72
  Mill and, 72–73
  necessary connections and, 72–73
  olfactory stimuli and, 74
  perspective and, 76
  phenomenal ordering and, 74–75
  Schwartz and, 84–85
  shape and, 83–87
  size and, 83
  spatial perception and, 75–78
  tactile stimuli and, 74–75
  thought experiments and, 71–72
Marr, David, 126, 130, 247
Massaro, D. W., 130–131
Mathematics, 41–42, 55, 60. See also Geometry
  directed perception and, 124
  heterogeneity and, 58–59
  object perception and, 203
Mausfeld, R., 209
Metaphysics, 132–133, 227
Mill, J. S., 72–73
Minima tangibilia, 60
Minima visibilia
  characterization of, 40
  color and, 44–45
  dimensionality and, 48–49
  experience and, 44–47
  geometry and, 41–42
  heterogeneity and, 60–62
  intersubjective comparisons and, 47
  phenomenal location and, 42–43
  shape and, 43–44
  size perception and, 45–47
Minimum sensibile, 35
  audition and, 38
  characterization of, 40
  color and, 38–39
  conceptualizing of, 37–38
  distance and, 40
  experience threshold and, 41
  field magnitude and, 39–40
  geometric points and, 41–42
  heterogeneity and, 55–57, 60–61, 65–66
  judgment and, 38–39
  metrics for, 39–40
  minima visibilia and, 40–49
  orientation and, 41
  role of, 40
  sensory qualities and, 38
  smell and, 38
  spatial perception and, 41
  tangibilia and, 38
  taste and, 38
Mirages, 247–248
Moked, G., 37
Molyneux problem, 55, 62, 69
  Evans and, 80–84
  man born blind (MBB) test and, 71–87
Mona Lisa, 181
Moon illusion, 27, 76–77
Movement, 93, 181–183
Munsell condition, 216, 218, 223–224
Music, 166, 174
  heterogeneity and, 58–59
  resemblance and, 145–147
  symbol systems and, 145–147, 166, 174
Mystery of the Moon Illusion, The: Exploring Size Perception (Ross & Plug), 27

Nativism, 96
Necker cube, 93
Number, 58–59

Oberlin Colloquium in Philosophy, 241
Object perception, 5
  animals and, 193–194
  Austin and, 243–253
  body concept and, 193–200
  causality and, 250–251
  color and, 194
  computational tasks and, 192–193
  constancy for, 198–199
  debate over, 241
Van Gogh, V., 181
Vedeler, D., 130
Vishton, P., 109–110
Visibilia
  heterogeneity and, 56–57, 60–62
  minima, 44–47
  phenomenal location and, 56
  tangibilia and, 56–57
Vision, 1
  Berkeley and, 11–26 (see also Berkeley, Bishop)
  blind spots and, 248
  directed perception and, 123–135
  error and, 211–228
  filling-in and, 248
  hallucination and, 247–248
  inference and, 95–105
  inverted images, 18, 75–76, 79–80
  man born blind (MBB) test and, 71–87
  minima visibilia and, 40–49
  minimum sensibile and, 38
  object perception and, 191–207, 243–253 (see also Object perception)

“Way the World Is, The” (Goodman), 164
“What is Disjunctivism?” (Thau), 241
Wheatstone’s stereoscope, 14, 19–25
Whittle, Paul, 209
Wollheim, R., 176
Wyszecki, G., 213

ZiF. See Center for Interdisciplinary Research (ZiF)