
DEEP BACKGROUND

Massimo Piattelli-Palmarini
University of Arizona
Spring 2002

Introduction:

Goodman's chapter is called "The new riddle of induction". What was the
old riddle? It was Hume's problem of justifying induction. Justifying induction,
that is, on deductive grounds. But that proved to be impossible. Any induction
can only be "justified" on the basis of another induction (that there is uniformity
in Nature, that like effects are derived from like causes, that one phenomenon is
sufficiently similar to another phenomenon etc.). Hume concluded that the only
justification is the formation of habits of the mind that lead to useful predictions
more often than not.
Why deal with Goodman's Paradox (GP) in the study of linguistics?
Because it concerns induction and therefore learning. As Fodor has rightly
stressed ever since 1974 (in Psychological Explanation, and again in The Language
of Thought, and in the debate with Piaget, and in Concepts), the only theory of
learning that we remotely understand is a theory of how hypotheses are
confirmed or refuted by experience. This applies, with bells on, to the learning of
concepts, and/as the learning of lexical meanings. Everyone (I really mean
everyone, in any field) who tries to explain a cognitive competence as the
product of "learning", has to solve GP for that competence. Notably, he/she has
to offer explanations for (at least) the following:

(1) a: Where do the hypotheses that the learner puts to test (submits to
the "tribunal of experience", in Quines words) come from?
b: Which ones are tested first, and why?
c: Which ones are never even tried out, and why?
d: What counts as "confirmation" of a tested hypothesis, and why?
e: How many confirming data are needed, for the hypothesis to be
finally adopted?

Chomsky, in the debate with Piaget, counters the suggestion that
grammars can be, in any meaningful sense, "learned", by asking how GP can be
solved for grammars. No answer was given then, and none has been given ever
since. A more recent variant of GP in the domain of Artificial Intelligence is
called "The Robots Dilemma". It has proved to be equally insoluble.

GP in essence:

Every finite set of experimental data (every finite amount of
experience) is compatible with an infinity of different, mutually incompatible,
hypotheses. No learning is possible (in a finite time) without a severe a priori
selection of the hypotheses that are worth being tested. And without a low
threshold for the amount of data that is arbitrarily taken to be sufficient to
confirm or disconfirm a given hypothesis.1 In fact, that's called "epistemic
boundedness". In Fodor's rendition: "No doubt the spider thinks that the web
exhausts all options". But let's be reminded of his less jocular remarks on
"triggers" and parameter fixation: if the fixation of parameters is practically
instantaneous, and only needs one relevant instance, then, in principle, any
"trigger" could do the job. We still have to account for what makes an instance
relevant to the job of fixing that parameter.
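
To see the force of this in miniature, here is an illustrative sketch (my construction, not from any of the authors cited; all names and numbers are invented): for any finite record of observations we can mechanically generate an unbounded family of grue-style hypotheses, each agreeing with every observation made so far, and each disagreeing with the others about the future.

    # Illustrative sketch: a finite data set underdetermines the hypothesis.
    # Each observation is a (time, color) pair; everything seen so far is green.
    observations = [(1, "green"), (2, "green"), (3, "green")]

    def make_hypothesis(t_switch):
        # Predicts "green" before t_switch and "blue" from t_switch on:
        # a grue-style hypothesis parameterized by a future instant.
        return lambda t: "green" if t < t_switch else "blue"

    # Arbitrarily many such hypotheses (one per future instant) fit the data:
    hypotheses = [make_hypothesis(t) for t in range(10, 15)]  # ...and so on
    for h in hypotheses:
        assert all(h(t) == color for (t, color) in observations)

    # Yet they are mutually incompatible about the future:
    print([h(12) for h in hypotheses])  # ['blue', 'blue', 'blue', 'green', 'green']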

Closer to us:

The empiricist's theory of the learning of the lexicon by the child invokes a
"pairing" between a sound and an external stimulus (the pairing of a word to a
referent).2 A series of episodes of conspicuous ostensions accompanied by the
simultaneous utterance of a word is alleged to be the core of the process. But
(see Bloom, among other authors) this is a very special case, at best. Because (just
to name a few counters):

(2) a: The most common situation is one in which the new word is
embedded in a sentence;
b: The referents of many words have no sensory counterpart (uncle,
yesterday, remembering etc.);
c: The attention of the speaker and of the child must be jointly
focused onto something in an obvious way (a "theory of mind" is
crucially presupposed);
d: The presence of the referent and the utterance of the word are
usually not simultaneous (Where is my compass? Not here, not
here. Ah, here it is!). The mother is not making "an ongoing
commentary" (Gleitman's term) on what is happening while it's
happening. Most of the time referents are evoked and not
"presented".

1 In short, we have to make radical assumptions that rule out most of the possible answers to a
given problem before deciding which of several closely related possible answers is correct -- we
already 'know' where to look for the answer. In fact, we already know nearly everything. Where
does that knowledge come from? If it's innate, with respect to some kinds of problems people
may forever be like the drunk looking for his keys under the lamppost.
2 Hume's formation of habits of mind.

But, let's take the (admittedly rare) case of a canonical
"presentation-cum-utterance": This is a boot, uttered while showing an exemplar
of the thing. GP applies with full force. The child might well understand the
meaning of boot as (just to name a few):

(3) a: The left boot only (if what has been shown is a left boot);
b: A brown boot only (if a brown exemplar has been shown);
c: That object plus about 1 inch of air surrounding it;
d: That object plus about 2 inches of air surrounding it (and so on ad
infinitum);
e: The undetached sole of the boot;3
f: The undetached tip of the boot.

And so on, and so on. Why does no child ever entertain those fancy
hypotheses, and why does every child "get" the meaning right from the very
start? Our suggestion:
There are powerful innate pre-delimitations on the possible meanings of words,
and on the ways words map onto things and events in the world, and on what
Mother "intends to convey".

Goodman's own solution:

It's best explained by means of an example that I offered to him (as a
reductio ad absurdum), and that he liked very much (as a real possibility).
Imagine a tribe (the deciduans) who live in the middle of a vast forest exclusively
formed by deciduous trees. They have never gone out of that forest. To them, in
their language, there is only one word that describes the color of foliage and of
the leaves: decidual. It's a primitive concept to them. In our language, it
translates as "Green, if, and only if, observed in the Spring or Summer, or
yellow-to-brown, if, and only if, observed in the Fall". Symmetrically, in their language,
our word "green" translates as "decidual, if, and only if, observed in the Spring or
Summer". Our word "brown" translates as "decidual , if, and only if, observed in
the Fall", and so on. Notice ( la Goodman) that in our language their term
decidual is complicated, disjunctive and referring to seasons. But in their
language decidual is simple and primitive, and refers to no season at all. In their
language, on the contrary, our words "green" and "brown" are complicated,
disjunctive, and referring to seasons. The symmetry is perfect. It's language-
relative. Well, of course, to me this is absurd: from everything we know, no such
concept can be the meaning of a single word in any human first language (though
it could in the specialized lexicon of mathematics, or carpentry or sailing). But, to
Goodman, this was a wonderful example of a real possibility (he urged me to

3 cf. Quine's Gavagai problem.



publish this example). He, in fact, was the staunchest cultural relativist, and
nominalist-pragmatist that there ever was. In his opinion, a human group was
free to invent any language it pleased, and see it adopted by everyone, through a
tacit convention.

Far background of GP:

The "beautiful dream" of the neo-positivists was to make science rigorous


by grounding it exclusively onto replicable and well-measured sense data,
described by a precise logical language, and draw inferences only on the basis of
deductive logic.

The first glitch, historically:

This is known as Nicod's Paradox.4 Nicod considered the "universal law":

(4) All ravens are black

But we may consider any universal scientific law of the form:

(5) All Xs are Ys.

Let R be the property "is a raven". Let B be the property "is black". "All
ravens are black" has the logical form of the logical conditional (ever since
Bradley):

(6) ∀x [ R(x) → B(x) ]


Every object in the universe is such that, if it is a raven, then it's black.

I go around and make many observations to check whether this law is
confirmed or refuted.
If the neo-positivists are right, then such observations should confirm or
refute together all logically equivalent statements of this law.

The pieces of Nicod's puzzle:

In virtue of logic, (7), repeated from (6), is equivalent (strictly, perfectly
equivalent) to the statement in (8).

(7) ∀x [ R(x) → B(x) ]


Every object in the universe is such that, if it is a raven, then it's black.

4 Jean Nicod was a great French logician and mathematician, who invented this paradox around
1930.

(8) ∀x [ ¬B(x) → ¬R(x) ]


Every object in the universe is such that, if it is not black, then it's not a raven.

I go around and observe white sheets of paper, red trucks and my
grandmother. If the neo-positivists are right, then such observations should
confirm the hypothesis that all ravens are black.
These two conditionals (these two laws) are logically equivalent. So we
have a big puzzle: a paradox.
The observation that any non-black object is not a raven confirms the
hypothesis that all ravens are black.
Nelson Goodman called this "indoor ornithology". A mockery of good
science.
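
A toy illustration of the puzzle (my construction, not Nicod's or Goodman's): over any small universe of objects we can verify mechanically that (7) and (8) always agree in truth value, and that under the naive criterion "a positive instance confirms a law", red trucks and grandmothers come out as confirmers of "All ravens are black".

    # Each object is (name, is_raven, is_black).
    universe = [
        ("raven-1",     True,  True),
        ("white paper", False, False),
        ("red truck",   False, False),
        ("grandmother", False, False),
    ]

    law_7 = all(black for (_, raven, black) in universe if raven)          # R(x) -> B(x)
    law_8 = all(not raven for (_, raven, black) in universe if not black)  # not-B(x) -> not-R(x)
    assert law_7 == law_8  # logically equivalent in any universe whatsoever

    # Naive positive-instance criterion: any non-black non-raven instantiates (8),
    # and hence "confirms" the equivalent law (7) -- indoor ornithology.
    confirmers = [name for (name, raven, black) in universe if not black and not raven]
    print(confirmers)  # ['white paper', 'red truck', 'grandmother']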

What is the solution?

Intuitively (and correctly) we judge that the relevant observations have to
be made on birds (possibly birds that are very similar to ravens, and ravens
themselves).
We need (at a minimum) criteria of:

(9) a: Relevance
b: Category
c: Similarity

But these, alas, are not logical concepts. We can say a lot about these
notions, but not by means of logic alone.
Goodman stresses the difference between "this piece of copper conducts
electricity" (naturally leading to expect conductivity in other pieces of copper)
and "this man is a third-born" (leading to no expectation at all in other men in the
room). His account of the different "projectibility" of these two predicates relies
on a property of evidential attributes that he calls "entrenchment". It's entirely
pragmatic, context-relative and collective. It is rooted in a history of past success
in the projection of predicates (for instance those that are based on "micro-
structure", versus those that are purely relational). We will not go into this, for
reasons of time, of relevance to this class, and because, in my opinion, this
account of "entrenchment" was never developed in a satisfactory way by
Goodman (or by anyone else).

The original, unadulterated Goodman's Paradox:

Consider the universal hypothesis

(10) All emeralds are green.



I observe many gems, similar to emeralds, and more and more emeralds.
All emeralds I have observed so far are green. Is this a confirmation of my
hypothesis? Sure! But, is it a confirmation of my hypothesis only? NO! Goodman
points out that these observations also, and by the same token, confirm an
infinity of other hypotheses, all different from, and incompatible with, mine.

Goodmans"other" hypotheses

(11) All emeralds are grue.


Grue means: Observed before midnight tonight and ascertained to be green, or
otherwise observed and found to be blue.

All emeralds I have observed so far are grue.

So this hypothesis is also confirmed by the same data, as much as the
other hypothesis. Obviously there is an infinity of such hypotheses: just replace
"midnight tonight" with any other instant in the future.

Back to the deciduans:

Notice that the two hypotheses:

(12) a: All emeralds are green.
b: All emeralds are grue.

are two different and incompatible hypotheses as of now. They do not
"become" different and incompatible at midnight tonight. After midnight tonight,
I can test (check out) which one is correct. But they are different hypotheses "in
eternity".

First reaction:

The first reaction tends to be that the predicate "grue" is unnatural,
bizarre, messy, implausible, etc., while the predicate "green" is simple, natural, etc.
Of course, Goodman agrees. The point is that none of these considerations
belong to logic. We cannot make a logical distinction between the predicate
"green" and the predicate "grue". We "feel" that there is one, but this "sensation"
cannot be supported by logic alone.

Another try:

The property of being green is "formally" simple. The property of being
grue is "formally" more complicated: its definition contains a disjunction ("or
otherwise observed") and makes reference to a point in time in the future. Is this
not an "objective" (logical) distinction?
Goodman's answer is: NO! It all depends on your theoretical vocabulary. It
depends on what your scientific language considers to be primitive:

(13) Two basic languages:


a: Human
Green and blue are simple and primitive.
Grue and bleen are disjunctive and contingent.

b: Goodmanian
Grue and bleen are primitive.
Then green and blue have to be defined in terms of grue and bleen.
In this language, green and blue are disjunctive and contingent. I
have to introduce disjunctions and an instant of time in the future
to define them.

The perfect symmetry:

(14) a: Human
Grue = observed before instant t and ascertained to be green,
or otherwise observed and ascertained to be blue.
Bleen = observed before instant t and ascertained to be blue,
or otherwise observed and ascertained to be green.

b: Goodmanian
Green = observed before instant t and ascertained to be grue,
or otherwise observed and ascertained to be bleen.
Blue = observed before instant t and ascertained to be bleen,
or otherwise observed and ascertained to be grue.
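
The symmetry in (14) can be checked mechanically. In the sketch below (my construction; T is an arbitrary stand-in for the instant t), grue and bleen are taken as primitive, green and blue are defined from them exactly as in (14b), and the round-trip reproduces the original predicates in every case.

    T = 1000  # the reference instant t (arbitrary)

    # (14a) Human language: green/blue primitive, grue/bleen defined.
    def grue(t, color):  return color == "green" if t < T else color == "blue"
    def bleen(t, color): return color == "blue"  if t < T else color == "green"

    # (14b) Goodmanian language: green/blue reconstructed from grue/bleen.
    def green2(t, color): return grue(t, color) if t < T else bleen(t, color)
    def blue2(t, color):  return bleen(t, color) if t < T else grue(t, color)

    # The round-trip is exact: the two vocabularies describe the same facts.
    for t in (0, 500, 999, 1000, 5000):
        for color in ("green", "blue"):
            assert green2(t, color) == (color == "green")
            assert blue2(t, color) == (color == "blue")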

Logic cannot decide. It all depends on the theoretical language that you
decide is basic. Davidson refined this paradox.

(15) Emerire = Observed before instant t and ascertained to be an emerald,
or otherwise observed and ascertained to be a sapphire.

(16) All emerires are grue.

is a perfectly simple universal "law" in this language. It is TRUE in our
world that all emerires are grue.
Try to define "emerald" and "sapphire" in this language (you will need
"sapphrald" too):

(17) Sapphrald = Observed before instant t and ascertained to be a sapphire,
or otherwise observed and ascertained to be an emerald.

In this basic language sapphire and emerald are disjunctive and
contingent. More precisely: the property of being a sapphire, and the property of
being an emerald, have a disjunctive and time-contingent definition in this
language.

A frequent misunderstanding:

The hypothesis "All emeralds are grue" does not say that these stones will
undergo a "change in color". The hypothesis does not contain an event that is an
event of color-changing. That is something extra. Not part of the hypothesis. That
would add a physiological hypothesis (the color suddenly looks different to a
human eye), and/or a micro-structural hypothesis (the molecular conformation
changes suddenly). But the original hypothesis does not say anything like that.

Goodman's solution of his paradox:

Green and blue are predicates that in our world are more "projectible"
than grue or bleen. There is nothing in the property of projectibility that is
"absolute" or "logical" (nor anything, in his opinion, that remotely connects to
innateness, or "human nature"). Projectibility is relative to our concerns, interests,
background knowledge, and to a history of "past success". It is an entirely
pragmatic notion. Certain properties (predicates) are more "entrenched" than
others. They have a longer history of collective successes in being projected onto
new cases.

Goodman's lesson (and Hempel's lesson):

Any finite set of data is compatible with an infinity of different hypotheses
(or curves). They are all compatible with the data, but mutually incompatible.
At 1 a.m. tonight, the green and the grue hypotheses become visibly
distinguishable. But they are incompatible even now, though I cannot (yet) decide
which one is correct.
Logic alone cannot decide. We introduce plausible considerations of:

(18) a: Simplicity
b: Elegance
c: Similarity with other phenomena
d: Other laws we believe to be true

Fine and dandy, BUT these are not logical criteria.5 Yes, but what is simple
and what is not, and what counts as "beyond what is necessary", are, for
Goodman, entirely context-relative.

Richard Jeffrey's statistical example of GP:6

(19) a: As you know, about 49% of recorded human births have been girls.
What's your judgmental probability that the first child born in the
21st century will be a girl?

b: A goy is defined as a girl born before the beginning of the 21st
century or a boy born thereafter. As you know, about 49% of
recorded human births have been goys. What is your judgmental
probability that the first child born in the 21st century will be a
goy?

This question is meant to undermine the impression that judgmental
probabilities can be based on frequencies in a way that doesn't already involve
judgmental probabilities (exactly in the spirit of Goodman's remarks). Since all
girls born so far have been goys, the current statistics for girls apply to goys as
well: these days, about 49% of human births are goys. Then if you read
probabilities off statistics in a straightforward way your probability will be 49%
for each hypothesis: (i) the first child born in the 21st century will be a girl; and
(ii) the first child born in the 21st century will be a goy. Thus P(i)+P(ii)=98%. But
it's clear that those probabilities should sum to 100%, since (ii) is logically
equivalent to (iii) the first child born in the 21st century will be a boy, and
P(i)+P(iii) = 100%. Contradiction.
What you must do is decide which statistics are relevant: the 49% of girls
or the 51% of boys. That's not a matter of statistics but of judgment -- no less so
because we'd all make the same judgment, P(iii) = 51%.
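
The arithmetic of Jeffrey's example, spelled out in a short sketch (the frequencies come from the text; the variable names are mine):

    p_girl = 0.49  # frequency of girls in recorded births
    p_boy  = 0.51

    # Naive frequency-reading: goys have the same past frequency as girls,
    # since every girl born so far has been a goy.
    p_goy_naive = 0.49

    # But for a birth in the 21st century, "goy" just means "boy":
    p_goy_judged = p_boy

    print(p_girl + p_goy_naive)   # 0.98 -- impossible: (i) and (ii) exhaust the cases
    print(p_girl + p_goy_judged)  # 1.00 -- after the judgment call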

5 cf. modern movie variants of Descartes's Demon (as in the movies "The Truman Show" or "The
Matrix"). There is in principle no way to distinguish between the hypothesis that things are as
they appear to be and the hypothesis that things appear as they do only at the behest of a
capricious external power -- although it might affect what kinds of events you imagine might
occur in the future. The intuition that accepting things at face value is 'simpler' than positing a
capricious external power, or that Ockham's Razor has any content at all, is the very heart of
Goodman's conundrum.
6 Richard Jeffrey is one of the greatest philosophers of probability, professor at Princeton.

Let's bring GP home:

I bet that the first reaction of some of you will be: (1) This is damn
important in many ways, to many problems (in fact to all problems of learning);
(2) How come I was never told about this paradox? The explanation is simple:
This is considered "stuff for philosophers" (and it is!), but psychologists, linguists
and other "social scientists" simply ignore this paradox. It figures prominently in
the work of Chomsky and Fodor, but one would be hard put to find any explicit
mention of it anywhere else. Bloom & other acquisition people are paying
attention to it with their hypotheses about the built-in assumptions that babies --
very young babies -- make about the world.

The mystery of acquisition again:

Super-simple case: learning a new word to describe a concrete object in
the immediate environment (dog).

Granted: shared attention, intended referent, segmentation of the speech
string into words, and identification of the word with the intended referent.

STILL: many potential ambiguities.

(20) a: Is the word the name of an individual?
b: The name of a breed?
c: The basic kind?
d: The superordinate kind?
e: The shape?
f: The color?
g: The size?
h: A part of the dog?
i: To dogs until 2000 and then to pigs?
j: To dogs and pencils? (silly but real possibilities à la Goodman)
k: To dogs and Richard Nixon?
l: To dogs-or-coyotes-or-wolves? This is the worst case (in the
abstract) because refutations are hard to come by and the (abstract)
number of possible disjuncts is horrendous (see the sketch after this list).
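
Just to quantify the worst case in (20-l) (a back-of-the-envelope sketch; the inventory sizes are invented): with n basic kinds available, the number of possible non-empty disjunctive meanings is 2^n - 1, so even a modest inventory of kinds yields an astronomical hypothesis space.

    # Number of non-empty disjunctions over n basic kinds: 2**n - 1.
    for n in (3, 10, 50):
        print(n, 2**n - 1)
    # 3  -> 7
    # 10 -> 1023
    # 50 -> 1125899906842623  (over 10**15 candidate disjunctive meanings)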

Some proposals for assumptions the child brings to the word-learning process:

(21) a: Whole Object constraint (Bias against h-k)
b: Taxonomic constraint (Bias towards same kind)
c: Solid Object vs. Mushy Substance constraint (Bias against h)
d: Shape constraint (Bias towards c, e)
e: The no synonyms constraint (Ellen Markman), see below

All of these seem like plausible candidates for mental constraints on
language learning. According to Bloom, none of them are actually true. In adult
language, there are plenty of words that violate all these constraints:

(22) a: Whole Object: happy, hit, under, water
Even restricted to count nouns: forest, bikini, flock, finger, foot,
handle, surface, coating, nap, idea, dream
b: Shape: army, family, animal, weapon, brother, friend
c: Taxonomic: Fred, Canada
d: Solid whole vs. mushy part: wood, metal, pile, puddle.

Biggest problem for all of the above: they only work for nouns! But kids learn
more than nouns, and do so very early: more, bye, hit, want, up, no.
A survey of 20-month-olds' vocabulary revealed that only half of their nouns
referred to basic kinds of solid objects. Other types of words included locations
(beach), temporal entities (day) and events (party). Words like friend and uncle
also appear early, as do pronouns (yes, but they are special; see Laura Petitto's
study on normal and congenitally deaf children: personal pronouns appear all of
a sudden, with a transient brief period of referential inversion) and proper names.
Secondary issue: are these constraints supposed to be innate? Really seems
like they must be. Then, are they specific to the language-learning problem? Or
can they be shown to fall out from some other recognized system?
Bloom's proposal is essentially Gleitman's syntactic bootstrapping
hypothesis, applied to the domain of nominals.

(23) i: NPs refer to individuals
ii: Count Ns refer to kinds of individuals
(they occur with "a" and with the plural marker).
iii: Mass Ns refer to kinds of portions (they occur with "some", without
plural). In Italian an interesting case is "spazzatura" (garbage), a
feminine mass noun, and "rifiuti" (trash or garbage again), a real
masculine plural with an (occasionally and rarely used) singular ("un
rifiuto").

Also "mobilio" (furniture, mass noun) and "mobili" (pieces of


furniture), a real plural (uno, due, tre mobili). The "ostensions" may
well be the same in both cases.

(23-i) takes care of the problem of proper names and pronouns; because
they're NPs, they must refer to individuals.7
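
A rough-and-ready sketch of the cue-reading in (23) (a toy of my own, not Bloom's or Gleitman's actual proposal): classify a novel nominal from the determiner frames it has been heard in. It is deliberately the kind of fallible heuristic these notes describe.

    def classify_nominal(frames):
        # Toy cue-reader for a novel nominal, from determiner frames.
        # Fallible by design: see the problems listed below.
        if "a __" in frames or "__-s" in frames:
            return "count N: a kind of individuals"    # (23-ii)
        if "some __" in frames:
            return "mass N: a kind of portions"        # (23-iii)
        if "bare __" in frames:
            return "NP (name/pronoun): an individual"  # (23-i)
        return "undecided"

    print(classify_nominal(["a __", "__-s"]))  # count: "a sib", "sibs"
    print(classify_nominal(["some __"]))       # mass:  "some water"
    print(classify_nominal(["bare __"]))       # NP:    "Fido"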

Some problems:

a. Non-referential NPs: 'it is raining', 'there's trouble brewing'. Even idioms:
'John kicked the bucket.' There is a claim that kids can't learn these until they
already know the non-expletive or non-idiomatic meaning of the pronoun or NP.
(Kids can't learn kit and caboodle, then?) It rains, but it does not matter (a non-
referential it plus a referential it for events): quite puzzling. Italian and
Spanish children have to deal with this with SILENT pronouns.

b. Italian: what about languages where it's not clear syntactically what's a proper
name and what's not? See above. Quite puzzling. Or perhaps there are other clues.
Maybe, but it's far from obvious. Gender is irrelevant (see above). Clitics
apply to both "Ne ho bevuta tanta" (acqua, mass) and "Ne ho bevuti tanti"
(bicchieri, count). Raffi has interesting evidence that the "reflexive" "si" may have
a measuring-out role with some verbs. But the determiner+proper name thing
would be worth looking into. Not in any obvious way. Seems to be like English
(un po', di più, più apply to mass, determinative articles to count).

c. Mass nouns in English: often look like NPs, not just Ns. In order to distinguish
between an N and an NP reading of water in a string like I like water (vs. I like
Fido), the child would also have had to have heard and remembered Give me
some water or similar (*Give me some Fido). That was Quine's point: the
difference between water and "mama" (his examples) can ONLY come
after the child has mastered the quantifiers. But the experiments by Sue Carey
and Nancy Soja show that 2½-year-olds immediately understand whether
what you show is an object (blinket) versus a stuff (dax), and they do not yet
master quantifiers. I pointed this out to Quine and, disappointingly, he told me
he could accommodate these data in his theory. He did not tell me how.

7 The special pronominal forms in the genitive in English constitute evidence for this analysis (as
do many other phenomena, but the genitive is positive evidence), because the 's affix attaches to
NPs, not Ns, and hence there are no irregular N genitives, unlike, e.g., plurals. However there are
"irregular" pronominal genitives (his, mine) as well as regulars (hers, theirs, its, yours).

d. 'Individuals' can't just be whole concrete objects: otherwise you'd be falling
into the same trap as above with respect to nap, joke, forest??, army??, day, dream,
conference. Dealing with children, I think "birthday" is a better example.

e. Lack of mass/count distinction in Chinese or other languages?

Chierchia has put forth the theory that mass is the default in Chinese, while
count is the default in English. I never understood how that may be, unless they
have different genes presiding over what is the default. Maybe some other cues
in the language suggest what is the "local" default. UNCLEAR TO ME.

9. a. But, claim some researchers, kids DON'T associate 'individuals' with count
nouns and 'portions' with mass nouns. That is, lots of experiments demonstrate
that the linguistic cue (i.e. 'N preceded by a' or 'bare N') overrides the semantic
cue (i.e. blob of amorphous stuff or well-defined solid object). Carey and Soja
observed that the determiner (a, the, some, more, another) is irrelevant to the 18-
month-old, becomes relevant between 3 and 5 years of age, then becomes
irrelevant again (for other reasons) above that age. Surely "cake" (count) and
"dessert" (mass) may have the same ostensive referent. See also the Italian
examples above.

They conclude that sorting into mass vs. count categories is a purely syntactic
procedure, without any semantic repercussions or implications: the association
between 'individuals' vs 'portions' and 'count' vs. 'mass' happens later. Well,
something like that. But some basic "match" between what is presented and the
syntactic form must be active in acquisition: a rough-and-ready, fallible heuristic
that does good duty 85% of the time.

10. Evidence that kids are doing some semantic categorization, differentiating
between (23-ii) and (23-iii):

a. They make mistakes more often with non-canonical mass nouns like furniture
and money, and less often with canonical ones like juice and milk. Well, the latter
are "STUFF", while the former are more abstract. Not surprising.

b. They preferentially choose a referent for an unfamiliar word (substance vs.
object) based on the syntax they hear (sib vs. a sib).

c. They assume that a novel count noun in the presence of an unfamiliar amorphous
substance refers not to the substance but to the bounded pile of it. (Doesn't work
in reverse: they really want words referring to discrete bounded objects to
refer to the object, not the stuff it's made of.) Yep, that's Carey and Soja. But a
splintered fragment of teak(wood) would suggest that we are referring to that
stuff, not to that "thing".

d. The mass/count distinction was relevant in a study testing kids' assignment of
reference to ambiguous substances/sounds (fep vs. a fep). (Nice study because it
included non-material substances.) Bloom and Sandeep Prasada used "ding-
dings" (count) versus "some dinging" (mass) in the presence of the sound of a bell
being repeatedly hit with a stick.
