
REVISED VERSION MARCH 2011

Distributional biases in language families*

Balthasar Bickel
University of Leipzig

Stability or instability […] is a matter of competing forces.
(Nichols 2003:283)
1 Introduction
In her programmatic paper on diversity and stability in language (2003), Johanna Nichols
sketches a theory of diachronic stability. One of the key insights of this theory is that degrees
of stability are not self-contained indices of language change but the result of competing forces,
such as diachronic replication, borrowing, substratal effects, and universals. In this chapter I
develop and discuss methods for estimating the role of these forces on the basis of statistical
analyses of synchronic typological datasets.
At first sight, the stability of a typological variable seems established when, synchronically,
all or nearly all members of a family have the same value on that variable (e.g. all have
postpositions), and this is so for many families. But such a bias can have very different sources.
The bias can be caused by genealogical stability: daughter languages have or tend to have the
same values because they inherited these from the proto-language. Alternatively, the exact
opposite of this is also possible, and the bias can be caused by genealogical instability:
daughter languages have the same values because they all changed in the same direction. Such a
change could be caused for example by repeated diffusion effects (keeping quirks like relative
pronouns at bay worldwide) or by developing cognitively favored structures (e.g. by favoring
agent-before-patient constituent orders). If a bias can result from both stability and instability,
the question arises how one can tell these possibilities apart.

* Parts of this paper were presented at the 43rd Annual Meeting of the Societas Linguistica
Europaea, September 3, 2010 in Vilnius, and I also discussed some of the methodological issues
presented here in my course on quantitative methods in typology at the DGfS-CNRS Summer School on
Linguistic Typology, Leipzig, August […], 2010. I am grateful to both audiences for questions and
comments. Special thanks go to Taras Zakharko for many very useful comments, including the
suggestion to use the Laplace estimator in Section 5. Some of the ideas go back to Bickel
(2008a), where I used the term 'skewing' instead of 'biases'. The term 'skewing' is unsuitable
because of possible ambiguities with skewing in the sense of lopsided distributions. A previous
version of the current paper was circulated in January 2010. Apart from matters of exposition and
exemplification, the most important change is that families with few members are now treated like
isolates and not like large families (in response to a question raised by Alena
Witzlack-Makarevich). All computations presented were carried out in R (R Development Core Team
2011). The research presented here was supported by Grant Nr. II/83 393 from the Volkswagen
Foundation.

DRAFT March 11, 2011
In the following, I propose an answer to this question in terms of what I call the Family
Bias Theory. I first introduce, illustrate and motivate the basic ideas of the theory based on
univariate distributions (Sections 2-4). In Section 5 I explore possible extrapolations to small
families and isolates, and in Section 6 I discuss extensions to multivariate distributions.
Section 7 compares the Family Bias Theory with classical approaches of genealogically balanced
sampling and Section 8 concludes the chapter by discussing implications for future research.
2 The Family Bias Theory: the basic ideas
If one surveys language families (in the sense of a genealogical unit established by the
Comparative Method), one quickly notices that a family can be uniform or near-uniform (e.g. all or
most members of the family have a dual), or it can be diverse (some members have, others don't
have a dual, as in Indo-European). In other words, each family may or may not show a bias
towards a given feature. The extent and significance of this bias can be established by standard
statistical techniques, such as χ²-tests.
Biases of this kind may show up at any taxonomic level and with any time depth: for
example, whereas there is a bias towards verb-final clause structures at the stock level in many
old stocks of New Guinea, the same bias is found only at shallower levels in branches of
Indo-European, where the stock as a whole is much more diverse. In the following, I use the term
'family' as a generic term for any taxonomic level, and reserve 'stock' for the highest proven
taxon (following Nichols 1992, 199[…]a).
Families can be biased with regard to any kind of typological variable: they can be biased by
favoring the presence or the absence of a feature or a feature set, a certain value or interval on
a continuous variable, or some complex constellation of such characteristics. In the following, I
use the symbol F to subsume all these possibilities.
The basic proposal of the Family Bias Theory is that the synchronic distribution of a
typological variable across families reflects distinct historical scenarios. Specifically, the
nature of biases across families reflects two different scenarios:
A. Directional Family Bias: If there are significantly more families that are biased towards
F than towards non-F, this reflects universal pressure in the sense that the development
and maintenance of F is universally preferred over the development and maintenance of
non-F. The total proportion of families with a bias (in either direction) as against families
that are diverse indicates the lower bounds of how strong the universal pressure is.
B. Non-Directional Family Bias: If there are significantly more families that are biased
rather than diverse, but the bias is undirected, i.e. an equal proportion of families are
biased towards F or non-F, the variable tends to be genealogically stable in the sense
that F tends to be unconditionally and faithfully replicated and that changes from F to
non-F or from non-F to F tend to be disfavored.
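
As a rough illustration of how Scenarios A and B might be told apart from family counts, the following Python sketch classifies a distribution from three tallies: families biased towards F, families biased towards non-F, and diverse families. The function name `classify_family_bias`, the use of one-tailed exact binomial tests against a 50:50 null, and the .05 threshold are all assumptions made for this sketch; the text itself does not prescribe a specific decision procedure.

```python
from math import comb


def binom_tail_ge(k, n, p=0.5):
    """One-tailed exact binomial probability P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def classify_family_bias(n_towards_f, n_towards_non_f, n_diverse, alpha=0.05):
    """Return 'A', 'B' or 'none' (hypothetical helper; alpha is an assumed threshold)."""
    n_biased = n_towards_f + n_towards_non_f
    # Scenario A: among the biased families, one direction clearly predominates.
    if n_biased > 0:
        majority = max(n_towards_f, n_towards_non_f)
        if binom_tail_ge(majority, n_biased) < alpha:
            return "A"
    # Scenario B: no predominant direction, but biased families outnumber diverse ones.
    n_total = n_biased + n_diverse
    if n_total > 0 and binom_tail_ge(n_biased, n_total) < alpha:
        return "B"
    return "none"
```

The direction is tested first because, on the account given above, a significant direction counts as evidence for universal pressure regardless of how many families are diverse.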
In this, 'universal pressure' refers to any principle suspected to shape the structure of
languages: preferred structures tend to universally develop more easily than dispreferred
structures, and they tend to be universally maintained more persistently than dispreferred
structures.¹ There are many ways in which universal pressure can work: for example, by favoring
the most frequent patterns in discourse or those that are easiest to process in comprehension, or
it can be based on more abstract principles like iconicity or paradigm symmetry. Also, universals
may operate as selectors of variants in language change or as pathways of change themselves. In
the following I gloss over all these differences and do not discuss the working mechanisms and
ultimate causes of universal pressure. My interest is only in determining the kinds of effects
on the synchronic distribution of family biases that one can expect if some kind of universal
pressure is at work.
The presence of a significant direction in family biases (Scenario A) is independent of the
relative proportion of diverse vs. biased families. Diverse families do not provide evidence for
or against a universal. On the one hand, a family may be diverse because the proto-language
complied with the universal (i.e. had the preferred pattern), and some daughter languages moved
away from that pattern. In this case, the family would be counter-evidence against a universal.
On the other hand, a family may be diverse because of developments in exact opposition to this
and in line with the universal: the proto-language may not have complied with the universal,
and some daughter languages have changed towards the preferred pattern. This would favor
the hypothesized universal. Unless we know the relevant patterns in the proto-language for
sure (which we usually don't), both possibilities are equally likely.
What the proportion of diverse families does indicate, however, is the strength of
universals. If despite a significant direction of the bias (many more biases towards F than
towards non-F), there is a relatively large proportion of diverse families, this suggests that F
tends to change relatively quickly, and that the universal pressure consists in a relatively low
probability of change towards F. This probability must still be higher than the probability of
change in the opposite direction (for else there would not be evidence for a universal direction
in biases across many families). For example, there could be weak effects in language processing
that favor certain patterns, but the probability that the effects leave a trace in language change
is relatively small, and so it takes many generations in many families for the effects to become
visible in extant distributions.
If the proportion of diverse families is small and there is a significant direction in the bias,
this means that universal pressure is very strong: if a language deviates from the preferred
pattern, there is strong pressure to correct this, and this quickly leads to uniform or
near-uniform daughter languages, all with the preferred pattern. Conversely, once the preferred
pattern is established, there is strong universal pressure not to lose the pattern.
Because, as noted, diversity in families can arise both with and without universal pressure,
the proportion of diverse families only approximates the lower bounds of the strength of the
universal. The real strength is underestimated to the extent that diverse families in fact contain
incipient effects of a universal. However, as far as I can see, this extent cannot be estimated on
the basis of synchronic surveys.

¹ This is in fact nothing but a restatement of the by-now classic view of universals as
diachronic laws of type preference, cf., among others, Bybee (1988), Hall (1988), Nichols (1992,
2003), Greenberg (199[…]), Haspelmath (1999), Maslova (2000a), Blevins (2004), Bickel (200[…]),
Maslova & Nikitina (200[…]).
If the proportion of diverse families is small but there is no significant direction, this is
Scenario B: a trend towards high copy fidelity from generation to generation. Finally, a
typological variable can of course also show no significant trend in family biases, i.e. neither
a preference for families to be biased vs. to be diverse (Scenario B), nor a preference towards F
vs. non-F within biased families (Scenario A). Such a situation does not suggest any particular
pattern, and the current distribution is mostly the result of chance events in language change.
In the following, I first illustrate the two scenarios in Section 3 and then provide evidence
and argumentation for the theory in Section 4. For illustration, I rely on the genealogical
taxonomy of Nichols & Bickel (2009), and I only consider families with several representatives
in typological databases. Extrapolations to isolates or under-sampled stocks are discussed in
Section 5.
3 Illustrations of the theory
In the illustrations I concentrate on families with at least 5 members, and as a criterion for
what I count as a significant bias in a family, I choose a rejection level of 10% in a χ²
permutation test. These thresholds yield p-values that match the intuition that complete
consistency in a family (i.e. 5 out of 5 members have F) represents a linguistically interesting
bias (with p = .06). However, not much depends on these parameters, and the results are similar if
one chooses a lower rejection level or limits surveys to families with more members (or both).
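
To make the threshold concrete, here is a minimal Python sketch of how such a per-family bias test could be computed for a binary variable. It assumes a null hypothesis of a 50:50 split between F and non-F and replaces the random permutations with a full enumeration of all possible outcomes, which yields the exact counterpart of a χ² permutation test; the function names are hypothetical and the author's actual R procedure may differ in detail.

```python
from math import comb


def chi2_uniform(counts):
    """Chi-squared statistic of observed counts against equal expected counts."""
    n = sum(counts)
    expected = n / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def exact_binary_bias_p(n_with_f, n_members):
    """Probability, under a 50:50 null, of a split at least as lopsided as observed."""
    observed = chi2_uniform([n_with_f, n_members - n_with_f])
    p_value = 0.0
    for k in range(n_members + 1):
        # Sum the probabilities of all outcomes whose statistic is >= the observed one.
        if chi2_uniform([k, n_members - k]) >= observed - 1e-12:
            p_value += comb(n_members, k) * 0.5**n_members
    return p_value


p_uniform_family = exact_binary_bias_p(5, 5)  # complete consistency in a 5-member family
```

Under these assumptions a family with 5 out of 5 members showing F receives p = 2/32 = .0625, which falls below the 10% rejection level, whereas a 4:1 split does not.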
The examples in the following are only meant to show how the two scenarios play out in
terms of data distributions. Specifically, if a dataset supports Scenario A, this suggests the
existence of universal pressure and invites further research. But it cannot and does not
demonstrate or prove such pressure. This can only be done by reversing the procedure, i.e. by
first developing a well-motivated and fully-fledged causal model of how a suspected universal
could have a systematic impact on language change, and then testing the resulting hypothesis
against large datasets (in fact, larger than what I have access to here for illustrative
purposes). In other words, the illustrations can at best be indicative of statistical trends, and
like all statistical trends, they may or may not reflect real causal chains.
3.1 A Scenario A example: A-before-P order
A good example of a directional family bias (Scenario A) is Greenberg's Universal 1 (Greenberg
1963), the worldwide trend towards placing agents before patients in simple clauses. To assess
the distribution of this property I defined a binary variable capturing whether a language has a
rigid or at least dominant A-before-P order in contrast to a language that allows various orders
or even favors P-before-A orders. I then applied this to a dataset merging Dryer's (2005b) WALS
data and data from the AUTOTYP database, covering 1,[…] languages in total. Each stock was
tested for whether it is biased towards an A-before-P order or towards the opposite (i.e.
P-before-A or flexible order). A stock counts as biased if there is a significant preference for
one of the two options.

For the AUTOTYP data, see http://www.uni-leipzig.de/~autotyp. Merging the data is justified by
the fact that there are only […] mismatches in the […] languages for which there is information
in both databases. When there were mismatches, I chose the AUTOTYP coding.
Determining such biases in stocks suggests that of the 59 large stocks in the dataset, 39 (66%)
are biased towards an A-before-P order and only 2 (3%) are biased towards the opposite. The
two opposite biases (Algic and Iroquoian) are both towards flexible orders, not towards a rigid
P-before-A order. The remaining stocks (18, corresponding to 31%) are diverse and show no
significant bias in any direction (e.g. Cariban has a mix of languages with flexible, A-before-P
and P-before-A orders).
These frequencies suggest that there is a significant and large preference for families to be
biased towards an A-before-P order (exact binomial test, one-tailed p < .001, π = .95). Under
the assumption of the Family Bias Theory, this would suggest that there is universal pressure
for families to keep A-before-P orders if the proto-language already had this, or to develop such
orders if the proto-language did not have such an order. Since we found the biases at the level
of stocks, this means that universal pressure must have been strong enough to affect language
change within the time-depth of stocks (by keeping languages from changing away from
A-before-P orders and by favoring changes towards A-before-P orders). The minimum strength
s of the effect can be estimated from the proportion of biased among all families, which is at
least .69.
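
The arithmetic behind these figures can be retraced with a few lines of Python. The sketch below takes the stock counts from the text (39 biased towards A-before-P, 2 towards the opposite, 59 stocks overall) and assumes that the directional test is a one-tailed exact binomial test against a 50:50 null among biased stocks; the variable names are illustrative only.

```python
from math import comb


def binom_tail_ge(k, n, p=0.5):
    """One-tailed exact binomial probability P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_stocks = 59        # large stocks in the dataset
n_a_before_p = 39    # stocks biased towards A-before-P
n_opposite = 2       # stocks biased towards the opposite

n_biased = n_a_before_p + n_opposite
p_direction = binom_tail_ge(n_a_before_p, n_biased)  # directional test among biased stocks
share_a_before_p = n_a_before_p / n_biased            # estimated probability of the bias
min_strength = n_biased / n_stocks                    # biased stocks among all stocks
```

Here `share_a_before_p` comes out at about .95 and `min_strength` at about .69, matching the values reported above, while `p_direction` lies far below .001.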
3.2 A Scenario B example: coding of property concepts in predicate position
Non-directional family bias (Scenario B) can be illustrated by the distribution of how languages
code predicative adjectives. Stassen (2005) defines three types: verbal (i.e. verb-like),
nonverbal and mixed coding of property concepts in predicative function. 'Mixed' means that
languages use both strategies, either differentiated by function or by lexical classes (e.g.
verbal coding is used to predicate temporary properties and nonverbal coding is used to predicate
permanent or intrinsic properties).
Stassen's (2005) database contains 18 sufficiently large families (i.e. families with at least
5 members each). Of these, 13 (72%) are biased in some direction, and this proportion exceeds
what one would expect by chance alone (exact binomial test, one-tailed p = .048). Within biased
families, 6 families prefer nonverbal, 4 verbal and 3 mixed coding. These proportions are
statistically fairly close to a uniform (i.e. 1/3 each) distribution, and so there is no evidence
for any one type being universally preferred (χ² = 1.08, p = .69). Under the assumption of the
Family Bias Theory such a finding suggests that the way predicative property concepts are coded
is genealogically stable at the time depth of the assumed genealogy. Almost three quarters of the
stocks in the database tend to have a consistent type throughout the family, suggesting a strong
bias towards diachronic inertia: if the proto-language had a specific type, this tends to survive
all splits and branchings. Only about one quarter of the stocks in the database are diverse so
that within them, some branches must have lost or innovated a type.

I use binomial tests here although Poisson (log-linear) modeling might eventually be more
appropriate since it is plausible to think of family biases as Poisson processes; see Cysouw
(2010b) for some arguments for Poisson processes as underlying typological distributions. None of
the results reported here depends on the decision, as p-values are in the same ballpark anyway.
Note that I use the symbol π for the probability of an event, in order to avoid confusion with
p-values (the probability of a test statistic under the null hypothesis).
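
The two tests in this example can be retraced as follows. The sketch uses the counts given in the text (13 biased out of 18 families; 6 nonverbal, 4 verbal, 3 mixed) and assumes a 50:50 null for biased vs. diverse families and an exact enumeration over all three-way splits in place of random permutations; all function names are hypothetical.

```python
from itertools import product
from math import comb, factorial


def binom_tail_ge(k, n, p=0.5):
    """One-tailed exact binomial probability P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def chi2_uniform(counts):
    """Chi-squared statistic of observed counts against equal expected counts."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def exact_uniform_p(counts):
    """Exact p-value of the chi-squared statistic under a uniform multinomial null."""
    n, k = sum(counts), len(counts)
    observed = chi2_uniform(counts)
    p_value = 0.0
    for head in product(range(n + 1), repeat=k - 1):
        last = n - sum(head)
        if last < 0:
            continue  # not a valid split of n families over k types
        outcome = list(head) + [last]
        if chi2_uniform(outcome) >= observed - 1e-9:
            ways = factorial(n)
            for c in outcome:
                ways //= factorial(c)  # multinomial coefficient, built up stepwise
            p_value += ways / k**n
    return p_value


p_biased = binom_tail_ge(13, 18)        # biased vs. diverse families
chi2_types = chi2_uniform([6, 4, 3])    # nonverbal, verbal, mixed
p_types = exact_uniform_p([6, 4, 3])
```

With these inputs the first test gives p = .048 and the second gives χ² = 1.08 with p = .69, in line with the values reported in the text.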
Of course, there might be additional, e.g. areal or structural factors that favor one or the
other types. From Stassen's (2005) map it looks like South-East Asia for example is an area with
a very strong preference for the verbal coding type, but it is not clear where the boundaries of
this area would be in this case: in one sense, it extends all over the Pacific Rim (Nichols 1992).
But this is contradicted by many languages with nonverbal types in South America, Australia
and the Papuan region. Anyway, as far as I can see, there is so far no statistical signal in any
clear direction.
An additional and in fact more severe problem is that the total number of families that have
enough members for estimating biases is relatively low: there are only 18 stocks with at least
5 members each, but critical statistical signals could come from smaller families and isolates. I
will return to this problem and suggest a solution in Section 5. Before this, however, I wish to
further discuss and substantiate the core claims of the Family Bias Theory.
4 Evidence for the theory
The central claim of the Family Bias Theory is that a directional bias across families reflects
some driving factor (universal pressure, as per Scenario A). The alternative to this view would
be to hypothesize that a directional bias reflects not some driving factor but instead faithful
inheritance, i.e. extremely stable distributions (as per Scenario B). As a result, not only
non-directional but also directional family biases would ultimately be caused by diachronic
inertia, a general reluctance to change over time. In the example of the A-before-P order, this
would mean that most families consistently have A-before-P orders not because this order is
universally privileged but because speakers faithfully copy this order from their parental
languages and most parental languages just happened to have had A-before-P order.
Technically, the difference between these two hypotheses boils down to differences in
probabilities of change, as spelled out in (1), where the succession symbol ≻ represents
diachronic change and F again represents some typological characteristic:
(1) Two possible hypotheses explaining directional family bias:
    a. π(non-F ≻ F) > π(F ≻ non-F)
    b. π(non-F ≻ F) ≈ π(F ≻ non-F) ≈ 0
One of the points of Maslova (2000a) is that it is nearly impossible to decide between these two
hypotheses. By contrast, the key claim of the Family Bias Theory is that (1b) needs to be rejected
as unrealistic. A similar point was made by Johanna Nichols in her 2002 plenary address to the
Linguistic Society of America (Nichols 2002), and in the following I substantiate the arguments
and evidence for this.
Let us assume, for the sake of the argument, that hypothesis (1b) is correct, and that
accordingly, a directional family bias in a distribution D results by and large from faithful
replication within each family. If this is so, then the distribution in the current generation
D(G_0) must resemble the distribution in the previous generation, D(G_{k+1}). Unless there was
some driving factor before G_{k+1}, all D(G_k) must reflect D(G_{k+1}) until k spans the entire
history of the human language faculty. Then, D can be said to be super-stable over very deep time,
and from this, we can predict that changes in D are all the more unlikely within short time
intervals. Now, all reconstructible time intervals are relatively short: up to about 6-8,000
years, the age of demonstrable families. Therefore, if a variable is super-stable, we expect to be
able to observe almost no changes in the known history of D(G_0). As a result, most observable
families should be uniform since each case of a non-uniform family necessarily represents at least
one case of change. Given this, the empirical question is whether one can observe more cases of
change in D(G_0) than what would be expected if D(G_0) is the sole result of faithful replication,
defined as a small probability of change π in (1). The more we observe cases of change beyond what
small values of π allow, the less plausible such values become, and this would disfavor
Hypothesis (1b).
To find out, I first computed the minimum number of changes attested in each known family
for a large set of variables. This corresponds to the number of unique values (types, levels) in
each family, minus 1: if a stock has two different values in one variable, there must have been
at least one case of change (regardless of how the tokens are actually distributed, e.g. as 1:9 or
a […] ratio), e.g. from A to B or vice-versa. If a family has three different values, there must
have been at least two cases of change, e.g. from A to B and to C, and so on. There could
of course always have been more cases of change (in parallel or in sequence), but the logical
minimum of observable changes equals the number of unique values minus one:
(2) min(C_F) = k_F − 1,
where C_F represents changes in variable F and k_F the number of levels (types) of F.
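
The counting rule in (2) is simple enough to sketch directly. In the Python fragment below, the family names and codings are invented purely for illustration, and `min_changes` is a hypothetical helper name; only the rule itself (number of distinct attested values minus one) is taken from the text.

```python
def min_changes(values_in_family):
    """Logical minimum of changes in one family: distinct attested values minus one."""
    return max(len(set(values_in_family)) - 1, 0)


# Hypothetical codings of one variable for the members of three families.
codings = {
    "Family 1": ["A", "A", "A", "B"],  # two values: at least one change
    "Family 2": ["A", "B", "C", "C"],  # three values: at least two changes
    "Family 3": ["B", "B", "B"],       # uniform: no change needs to be assumed
}

per_family = {name: min_changes(values) for name, values in codings.items()}
total_min_changes = sum(per_family.values())
```

Note that the token distribution inside a family plays no role here: a 1:9 split and an even split both count as one change.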
I computed this minimum for a total of 386 variables taken from the AUTOTYP and WALS
databases, requiring that the variable is coded for at least 10 families that each are represented
by at least two members (since isolates or families represented by a single member do not
allow counting cases of change). The variables are of various kinds covering almost all parts of
grammar and phonology, and they include many alternative ways of coding, e.g. both a binary
and a 6-way breakdown of basic word order, and various other versions of this. For current
purposes, I treat scalar variables (e.g. on the size of the vowel inventory) in the same way as
categorical ones. This is justified by the fact that from the point of view of diachrony, a change
from one point on a scale to another, e.g. from 5 to 6 vowels, is as discrete a change as, say, the
development of a tone opposition from laryngeal setting contrasts.
I then tested for each variable whether the observed minimum number of changes, i.e.
min(C_F), exceeds what can be expected under the assumption of a given probability of change
π, letting π represent various assumptions ranging from 0 to 1 (at increments of .01),
and assuming, for the sake of the argument, that the current distributions are the sole result
of faithful replication (as per Hypothesis 1b). As a criterion for what qualifies as an unexpected
excess of min(C_F) under a given value of π, I use a .05 rejection level of the null hypothesis
that the observed proportion of min(C_F) does not exceed π in a one-sided binomial test. If
the observed proportion significantly exceeds what is expected under a given value of π, this
means that the actual probability of change must be higher than π.

And note that counting the minimum in this way favors the hypothesis in (1b) because it
systematically underestimates the probability of change π.
With binary variables, the observed proportion of min(C_F) can be directly computed by
dividing min(C_F) by the number of families in the database, since each family corresponds to
at least one opportunity for the variable to change, e.g. from type A to type B or vice-versa
(always limiting our attention, as before, to the minimum number of changes that is logically
possible). For example, with π = .1[…], it is unexpected (under a binomial test) to find a minimum
of 20 cases of change in […] families if the variable is binary. But if the variable defines three
instead of two values, each family allows for at least two possible changes: from A to B or
to C, and then it is no longer unexpected to find at least 20 cases of change. In general, for a
variable that defines k types, the (minimum) number of opportunities for change is
(3) min(O_F) = (k_F − 1) × N(families)
Therefore, I tested whether the proportion min(C_F) / min(O_F) is expected under a given
probability π.
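
The following Python sketch shows how the opportunity count in (3) feeds into the one-sided binomial test. The input values (15 minimum changes, 40 families, π = .2) are invented for illustration and are not the ones used in the text; the function names are hypothetical, and the test is implemented here as the exact upper tail of a binomial distribution.

```python
from math import comb


def binom_tail_ge(k, n, p):
    """One-sided exact binomial probability P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_opportunities(n_levels, n_families):
    """Minimum number of opportunities for change: (k_F - 1) * N(families)."""
    return (n_levels - 1) * n_families


def excess_p_value(n_min_changes, n_levels, n_families, pi):
    """p-value for observing at least n_min_changes if changes occur with probability pi."""
    return binom_tail_ge(n_min_changes, min_opportunities(n_levels, n_families), pi)


# Same number of changes and families, but two vs. three levels (illustrative values).
p_binary = excess_p_value(15, n_levels=2, n_families=40, pi=0.2)
p_ternary = excess_p_value(15, n_levels=3, n_families=40, pi=0.2)
```

With these illustrative inputs, 15 changes are an unexpected excess for the binary variable (40 opportunities) but not for the three-level variable (80 opportunities), which is the contrast the paragraph above describes.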
The result of these tests for the 386 variables is summarized in Figure 1. The assumed values
of π only reach a more substantial match with observed numbers of changes if π > .10, starting
with a proportion of […], and they only reach full coverage if π […]. This is far above what
Hypothesis (1b) allows for.
Figure 1: Proportion of variables for which the observed minimum numbers of change is
statistically expected under assumed probabilities of change π. (x-axis: assumed probability of
change, 0.0 to 1.0; y-axis: proportion of variables with expected min(C_F)/min(O_F), 0.0 to 1.0.)
However, (1b) is difficult to maintain even for those variables with low numbers of observed
changes and that are therefore compatible with the assumption of low values of π. This becomes
evident in Figure 2, which plots the mean entropies of those variables for which the observed
numbers of changes is statistically expected under a given value of π. Entropies (designated
H) are a standard estimate of the extent to which the distribution of values in a variable is
biased (low entropy) rather than uniform (high entropy). As indicated by the grey dotted line,
the mean entropies reach their overall characteristic values (with mean 1.13) only with
π > .10, i.e. only with variables for which the observed number of changes is expected under
assumed probabilities higher than .10. With π ≤ .10, variables tend to have considerably lower
entropies, reflecting strong biases towards one value. In other words, in this range of π there
are, on average, only very few variables with more balanced distributions (such as the coding
of property concepts in predicative function, as reviewed in Section 3).

Figure 2: Mean entropies of those variables for which the observed minimum numbers of
change is statistically expected under a given probability of change π. (x-axis: assumed
probability of change, 0.0 to 1.0; y-axis: mean entropy of variables with expected numbers of
changes, 0.0 to 1.4.)
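
For readers who want to reproduce the entropy measure H, a minimal Python sketch follows. It computes the Shannon entropy (in bits) of a list of coded values, estimating the probabilities from the empirical frequencies; the function name and the toy inputs are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def shannon_entropy(values):
    """Shannon entropy (base 2) of the empirical distribution of the given values."""
    counts = Counter(values)
    n = sum(counts.values())
    # Each attested level contributes -p * log2(p); unattested levels contribute nothing.
    return sum(-(c / n) * log2(c / n) for c in counts.values())


h_single_value = shannon_entropy(["verbal"] * 10)                    # maximal bias
h_uniform = shannon_entropy(["verbal", "nonverbal", "mixed"])        # three-way uniform
h_rare_value = shannon_entropy(["present"] * 99 + ["absent"])        # rara-like skew
```

A variable with a single attested value has entropy zero, a uniform three-way split reaches the maximum of log2(3), and a 99:1 split stays close to zero, which is the kind of low value discussed above.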
Strong biases (low entropies) are characteristic of rara vs. universalia oppositions. Table 1
illustrates this for those variables for which the expected number of changes is covered at
π = .01, i.e. at a value of π that would be closest to what is hypothesized in (1b). The presence
of have-perfects and of tonal case are well-known areal rara in Europe and Africa, respectively,
and both have relatively shallow histories, i.e. the exact opposite of what (1b) would predict.
All other variables reflect very strong universal pressure in favor of some feature (independent
subject pronouns, interrogative/declarative distinctions) or against some feature (stem flexivity
conditioned by negation markers, or various co-exponence types of such markers). This is fully
in line with the hypothesis in (1a): for example, it seems much more common to develop and
maintain an interrogative/declarative distinction than to lose it. But it is difficult to explain
if the development in either direction has a very low probability (as 1b would predict). For other
variables in the range of π from .01 to .10, the picture is similar to what is illustrated by
Table 1: they tend to be heavily biased and reflect a rara vs. universalia distribution. Such
biases are likely to result from strong areal diffusion or universal pressure: so strong in fact
that the relevant choice is likely to establish itself very quickly, and that once the choice is
made, languages refrain from undoing it and families look almost completely homogeneous. This
reflects the scenario hypothesized in (1a) and is not consistent with (1b). Thus, rather than
suggesting faithful replication, extremely low numbers of known changes seem to point to very
strong effects of some driving factor (pace Parkvall 2008, Wichmann & Holman 2009, or Bakker et
al. 2009).

Formally, the (Shannon) entropy H of a variable V with levels v_i ∈ v_{1...k} and associated
probabilities π_{v_i} is H(V) = −Σ_{i=1}^{k} π_{v_i} log_2(π_{v_i}). H(V) is zero if there is a
maximum bias towards a single level, e.g. with π_{v_1} = 1, π_{v_2} = 0, and π_{v_3} = 0; H(V)
reaches its maximum in uniform distributions, e.g. with π_{v_1} = 1/3, π_{v_2} = 1/3, and
π_{v_3} = 1/3.

I estimate π using the Maximum Likelihood method, i.e. from the empirical frequencies.

This has been noted by Bickel & Nichols (200[…]) for the co-exponence of other categories as well.
Variable                                     Changes    Opportunities   Entropy   Ratio
(and data source)                            min(C_F)   min(O_F)                  of values
Interrog./decl. distinction (Dryer 2005c)    1          sv              c.c1      s111
Indep. subject pronouns (Daniel 2005)        c          I1              c.c       zsz
Tonal case (AUTOTYP and Dryer 2005d)         I          v1              c.c       evse
Stem flexivity condit. by NEG (AUTOTYP)      c          1c              c.1z      11111
Have-perfect (Dahl & Velupillai 2005)        1          1               c.I       1c1
Co-exponent type of NEG (AUTOTYP)            1          zI1             c.ec      1sI1111111
Table 1: Variables for which the minimum number of changes does not exceed what is
expected under π = .01 (in increasing order of entropy)
There is one further piece of evidence against Hypothesis (1b): already with π = .05 and
certainly with π = .10, it is virtually impossible for typological distributions to persist over
deep time in such a way that what one observes now is similar to what was there many generations
ago. This can be shown by computer simulations. I set up datasets with 1,300 fake languages
(approximating the size of the largest available real databases) with fake codings for a binary
typological variable. The codings represent distributions ranging from 1:99 to 20:80 to
40:60. Each such distribution was then sent through a number of generations. In each
generation there was a certain probability threshold π (ranging from .01 to .10, at increments
of .01) below which a random subset of languages would change from one state to the other,
with no preferred direction of change. Choosing random subsets below rather than at π is
motivated by the assumption that language change is constrained by maximum probabilities
but does not operate at a constant rate; in addition, the method favors Hypothesis (1b) since
change does not always operate 'at full speed', as it were. After the distributions went through
all generations, I tested whether the initial distribution was still detectable using a two-sided
binomial test. This procedure was repeated 1,000 times, which makes it possible to compute the
proportion of simulations in which the original distribution was still detectable, and from this
an estimate of the overall probability of successful detection.
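
The simulation design described above can be approximated in a few lines of Python. This is a sketch under stated assumptions rather than the original R code: in each generation a rate is drawn uniformly between 0 and the maximum π and exactly that share of languages flips state; an initial distribution counts as "still detectable" if a two-sided exact binomial test of the final counts against the initial proportion is not significant at .05. Both the detection criterion and all function names are assumptions introduced here.

```python
import math
import random


def log_binom_pmf(k, n, p):
    """Log of the Binomial(n, p) probability of exactly k successes (0 < p < 1)."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))


def two_sided_binom_p(k, n, p):
    """Exact two-sided p-value: total probability of outcomes no likelier than k."""
    observed = log_binom_pmf(k, n, p)
    total = sum(math.exp(lp)
                for lp in (log_binom_pmf(j, n, p) for j in range(n + 1))
                if lp <= observed + 1e-9)
    return min(1.0, total)


def final_count(n_languages, n_with_f, pi_max, generations, rng):
    """Send one fake dataset through all generations; return how many languages have F."""
    states = [True] * n_with_f + [False] * (n_languages - n_with_f)
    for _ in range(generations):
        rate = rng.uniform(0.0, pi_max)  # change operates below the maximum, not at it
        for idx in rng.sample(range(n_languages), int(rate * n_languages)):
            states[idx] = not states[idx]  # no preferred direction of change
    return sum(states)


def detection_rate(initial_share, pi_max, generations,
                   runs=100, n_languages=1300, alpha=0.05, seed=1):
    """Share of runs in which the initial distribution is still detectable (assumed test)."""
    rng = random.Random(seed)
    n_with_f = round(initial_share * n_languages)
    p_initial = n_with_f / n_languages
    detected = 0
    for _ in range(runs):
        k = final_count(n_languages, n_with_f, pi_max, generations, rng)
        if two_sided_binom_p(k, n_languages, p_initial) >= alpha:
            detected += 1
    return detected / runs
```

A call such as `detection_rate(0.01, 0.10, 100, runs=1000)` corresponds to one point on the solid line of the 100-generation panel under these assumptions; smaller `runs` values are enough to see the qualitative pattern.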
Figure 3a reports the results for 100 generations, and Figure 3b for 50 generations. If we
assume an average lifespan of languages of about 1,000 years, 100 generations reflect a low
estimate of the age of human language, i.e. a time when major innovations that are likely to
depend on language use, such as ornamentation, pigment processing, and long-distance trading,
become well attested in the archeological record (McBrearty & Brooks 2000). 50 generations
reflect an unrealistically low estimate, viz. a time when modern symbolic behavior has spread
even well outside Africa.
[Figure 3, two panels: (a) after 100 generations of change; (b) after 50 generations of change.
x-axis: maximum probability of random change (0.01 to 0.10); y-axis: proportion of distributions
(0.0 to 1.0).]

Figure 3: Proportion of 1,000 simulated distributions that are still detected after 100 (a) or 50
(b) generations of random change at given maximum probabilities (solid line: initial
distribution 1:99; dashed line: 20:80; dotted line: 40:60; grey line: .05 probability
threshold of detecting an initial distribution)
The findings suggest that already at a maximum probability of change of .05, the probability of
detecting even the most heavily biased initial distribution (1:99, plotted as a solid line) no
longer exceeds .05 after 100 generations (as indicated by the grey horizontal line in the
figure). Even under the shorter scenario of 50 generations, the most heavily biased distribution
reaches the .05 probability threshold just before a maximum probability of .10, the level that
was noted above as the minimum at which an appreciable proportion of variables (about z) begins
to show numbers of change that are statistically expected. Thus, even under an unrealistically
short life span of human language, the minimum probabilities of change that begin to be
consistent with the number of known changes are far too high to allow the long-time persistence
of typological distributions required by (1b). In other words, realistic probabilities of change
are so high that no synchronic distribution can be accounted for by faithful replication over
deep time.
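A back-of-the-envelope calculation, offered as an illustration under simplifying assumptions and not as part of the simulations, shows why persistence fails. Assume that in generation $t$ every language switches state independently with probability $q_t$, with no preferred direction, and write $p_t$ for the expected share of one of the two states. The symbols $q_t$, $p_t$, $T$ and $m$ are introduced here for the sketch only.

```latex
\begin{align*}
p_{t+1} &= p_t\,(1 - q_t) + (1 - p_t)\,q_t
  \quad\Longrightarrow\quad
  p_{t+1} - \tfrac{1}{2} = (1 - 2q_t)\,\bigl(p_t - \tfrac{1}{2}\bigr), \\
p_T - \tfrac{1}{2} &= \bigl(p_0 - \tfrac{1}{2}\bigr)\prod_{t=1}^{T}(1 - 2q_t).
\end{align*}
% If each q_t is drawn independently and uniformly between 0 and a maximum m,
% then E[1 - 2 q_t] = 1 - m, so the expected shrinkage factor is (1 - m)^T:
\begin{align*}
m = .05,\; T = 100 &: \quad (1 - .05)^{100} \approx .006, \\
m = .10,\; T = 100 &: \quad (1 - .10)^{100} \approx .00003.
\end{align*}
```

Under these assumptions, even an initial 1:99 skew (a deviation of .49 from an even split) is expected to shrink to a deviation of roughly .003 after 100 generations at a maximum of .05, which a dataset of the size discussed here can hardly distinguish from an even split.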
To summarize, there are three pieces of evidence against the hypothesis of extremely low
probabilities of change in typological variables (i.e. Hypothesis 1b). First, we tend to find
many more cases of change than what extremely low probabilities of change would lead us to
expect. Second, those few variables for which extremely low probabilities of change would in
principle match the observed number of changes tend to display an extreme rara vs. universalia
distribution, and this fits better with very unequal probabilities of change (1a) than with equal
probabilities (1b). Third, even if probabilities of change were as low as .10, they would still
be too high for typological distributions to remain stable over the entire lifespan of the human
language faculty.
Taken together, these three pieces of evidence make the hypothesis in (1b) an unlikely
explanation of directional family biases. This supports the alternative in (1a) and thereby the
core claim of the Family Bias Theory that directional family biases reflect the pressure of some
driving factor and that high degrees of genealogical stability can explain only non-directional
family bias, not directional family bias.
5 The problem of small families
The Family Bias Theory provides a systematic diachronic interpretation of synchronic typological
distributions. However, like all diachronic interpretations it has a natural limit when
confronted with small families: it is difficult to estimate a bias or reconstruct forms if one
knows only, say, two or three members. It becomes almost impossible if one knows only a single
member. This problem is substantial because even for large databases, such as the genealogy
databases in AUTOTYP (N = 2,680, Nichols & Bickel 2009) or in the World Atlas of Language
Structures (N z,11, Haspelmath et al. 2005), about half of the stocks are represented by only one
member (and this is so regardless of which of the two taxonomies is applied and even when one
excludes, as I do here, creoles and sign languages, which could all be analyzed as single-member
families in their own right since they represent the birth of new families).
The problem has both epistemological and statistical consequences. Epistemologically, the
problem has the effect that many variables of typological interest cannot really be investigated
when these variables happen to be best represented in less well documented or isolated families.
Statistically, the problem is one of power in detecting signals: any limitation to large families
severely reduces the size of datasets, and this has the effect that statistical tests no longer
have enough power to detect signals.
In some sense, one could say that these are just the real limits on what one can possibly know
about diachrony and the historical forces shaping typological distributions. However, to the
extent that one can trust the results from examining biases in large families, it is possible to
extrapolate from large to small families, i.e., for concreteness' sake, from families with at
least 5 members (cf. Section 3) to families with fewer than 5 members. This requires two
assumptions:
(4) a. Normal Diachrony Assumption:
       The members of small families are the sole survivors of larger families.
    b. Uniform Development Assumption:
       Unknown families are subject to the same developmental principles as known families.
The rationale behind the Normal Diachrony Assumption is the following: if we don't know what
other languages a language or a small group of languages is related to, i.e. if we are dealing
with an isolate or a small group, this is only an epistemological issue, not an ontological one
(a point also emphasized by Maslova (2000b)). Ontologically, isolates and small groups are still
members of larger families; it's only that we don't know their relatives because these became
extinct, in most cases because their speakers shifted to other, unrelated languages. This
assumption is motivated by the fact that just like any other language, isolates and small families
must come from somewhere, i.e. they are the result of normal diachronic transmission (in the
sense of Thomason & Kaufman 1988). Obviously, the assumption does not hold for most creoles and
sign languages because they did not arise from normal diachronic transmission, and it would
indeed be incorrect to extrapolate insights from known large families to the genesis of creoles
and sign languages. (As a result, creoles and sign languages provide a different window on the
universal pressure shaping language than languages with a normal diachrony behind them, and
research on this requires other methods than what I discuss here.)
The Uniform Development Assumption is again based on the insight that the status of languages as
isolates or members of small groups is an epistemological and not an ontological fact: the extent
to which we know the genealogical relationships of a group has no principled consequences for the
kinds of diachronic developments that the group went through. For example, just because we don't
know any sister languages of Basque does not mean that the kind of diachronic processes that
resulted in modern Basque was radically different from the kind of processes that resulted in the
development of Hindi from Proto-Indo-European: we expect the same kind of complex mix of
spontaneous (random) change, contact effects and universal pressure.
Taken together, the two assumptions in (4) allow us to extrapolate from large to small families.
For this, we first compute the proportion of biased vs. diverse families in a survey of large
families and then use this proportion as an estimate of the extent to which small families are
biased. For example, in the survey of A-before-P orders in Section 3 we found that 69% of large
families are biased in some way and 31% are diverse. Based on (4), we can now make the assumption
that the extent to which families diversify their ordering of A and P arguments and, conversely,
the extent to which families keep whatever order they have, holds not only for large families but
also for small families. In other words, we assume that our extremely reduced knowledge of
Basque's ancestry has no consequences for how Basque developed (viz. by normal diachronic
transmission, as per 4a) or for the extent to which the language was affected by universal
pressure in language change (as per 4b). Therefore, we assume that about 70% of small families
are the sole survivors of large families with a bias and about 30% of small families are the sole
survivors of large families without a bias, i.e. of diverse families.
There is one probabilistic detail that we need to take care of before proceeding further,
however: if we happen to find 100% of large families to be biased (in whatever direction), it
would not be correct to estimate a probability of 1 that small families are biased as well, i.e.
there cannot be absolute certainty that all small families are biased, and it is always possible
that they represent larger diverse families. A well-established way of avoiding this is to
estimate probabilities using Laplace's Rule of Succession: if k out of n large families are
biased, we estimate the probability of small families being biased as $\frac{k+1}{n+2}$. In our
example of A-before-P orders, this would be $\frac{41+1}{59+2} \approx .689$. This is very close
to the estimate based on the raw proportions (.695), but for smaller samples the difference can
be more substantial: if we had observed 10 out of 10 families, the estimate would not be
$P(\text{biased}) = 1$ but $P(\text{biased}) = \frac{11}{12} \approx .92$. The key idea behind
the formula is this: the a priori assumption that families can in principle be either biased or
diverse is equivalent to having observed one biased and one diverse family, and these 'as if'
observations are added to the observed frequencies.
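The rule of succession is simple enough to check by hand, but a short sketch makes the contrast with raw proportions concrete. This is an illustrative Python snippet, not the R function mentioned elsewhere in the paper; the function name is hypothetical, and the counts of 41 biased out of 59 large families are the reading of the A-before-P example assumed here.

```python
from fractions import Fraction


def laplace_estimate(k, n):
    """Rule-of-succession estimate (k + 1) / (n + 2) for k biased out of n families."""
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    return Fraction(k + 1, n + 2)


if __name__ == "__main__":
    # Compare the raw proportion with the Laplace estimate for a few counts.
    for k, n in [(10, 10), (0, 10), (41, 59)]:
        est = laplace_estimate(k, n)
        print(f"k={k}, n={n}: raw={k / n:.3f}, Laplace={float(est):.3f}")
```

The estimate never reaches 0 or 1, which is exactly the property needed when all observed large families happen to be biased.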
Using the estimated probability of being biased, we then randomly declare a corresponding
proportion of small families to be the sole survivors of biased families. In our example of
A-before-P ordering, we would declare a random selection of 69% of small families to be biased.
The remaining small families (31%) are declared to be the sole survivors of diverse larger
families. However, by virtue of being statistical estimates, biases are gradual and allow
deviations: for example, one of the large families in the survey, Austronesian, is significantly
biased towards A-before-P ordering. Despite this bias, 29 out of the 145 (20%) representatives of
the family in the database deviate from this, partly by having VPA (VOS) order (such as
Kiribatese), partly by having variable word order (such as Acehnese). When assuming that a small
family is the sole survivor of a biased family, the question therefore arises to what extent the
small group we know represents the overall bias of the family, or deviates from this trend, just
like the 20% of Austronesian languages that deviate from the overall trend in Austronesian.
In response to this, we first estimate the probability that a small group represents the family
bias from the extent to which the bias is found in large families. As noted, in Austronesian this
extent is .80; in other large families (e.g. Dravidian) it is 1. Large families can of course be
biased in the opposite direction. In our example, we observed this for Algic and Iroquoian, and
here the bias (against A-before-P) is in each case complete (i.e. 1) in the database. Taking all
these bias estimates together suggests that, on average, if a large family is biased on the
argument order variable in whichever direction (A-before-P or the opposite), it is so biased to
95.5%, and there are on average 4.5% deviates inside the family. (Austronesian, with as many as
20% deviates, is therefore quite exceptional.) When estimating the probability of having a bias
vs. being diverse, we corrected these estimates by the Laplace Rule of Succession because a
priori it is always possible for families to be biased to some degree or to be diverse. For
estimating the deviation probability, however, I suggest relying on the bare proportions, i.e. if
all biased families are completely biased, with no deviations, I suggest assuming a general
deviation probability of 0. The reason is as follows. Postulating deviations is the same as
postulating instances of language change (unlike postulating a bias, which may or may not imply
language change, depending on the extent of the bias). Now, from general parsimony constraints on
historical linguistics (Occam's Razor), it follows that one postulates language change only in
the presence of positive evidence. Therefore, a priori, i.e. unless there is any evidence to the
contrary, we assume that an isolated language or small language group represents its ancestors
faithfully, with no change, no deviation.
Given these considerations, we can estimate the probability to which the members of what we
estimate to be a biased small family indeed represent the family bias (here, 95.5%) and the
probability to which members are likely to be deviating exceptions (here, 4.5%). Based on this,
we randomly declare some proportion of the estimated biased families to be observed with
representative members and some proportion to be observed with deviating members. In those small
families where members are estimated to represent their family bias, we declare the family to be
biased towards whatever happens to be its sole type or what appears to be its most likely type
given the general bias estimate and the frequency distribution within the family. For example, a
small family will be declared to be biased towards A-before-P order if all or most of the small
group have A-before-P order; if there is a tie (e.g. two languages with A-before-P and two
languages with other orders), we randomly pick one as representative. In the small families where
members are estimated to represent deviating exceptions, we declare them to be survivors of a
family that had a bias in an alternative direction (randomly chosen but weighted by the
probability of directions given by the general bias estimate and the frequency distribution
within the family). For example, if all or most languages in the small family have A-before-P
order (or if indeed there is only a single language and it happens to have A-before-P order), we
estimate that these languages come from a larger family with the opposite bias (i.e. no
A-before-P order); if there is a tie, we again randomly select one of them.

(Footnote on the 'as if' observations in Laplace's rule:) It is a matter of further research to
establish whether $\frac{1}{2}$ is indeed an appropriate parameter value of the a priori bias
probability here.
(Footnote on randomly declaring small families to be biased:) Technically, this is done via a
randomly generated binomial distribution with the estimated bias probability.
In the overall extrapolation process there are three situations where we make random selections:
first, when declaring a proportion of small families to be the sole survivors of diverse vs.
biased families; second, when breaking ties for determining what kind of bias a small biased
family represents; and third, when assigning an alternative type to those small biased families
that we estimate as representing deviating exceptions within larger families. These random
choices induce statistical error, but because the error is random, it can be assumed to be
normally distributed. Therefore, we can perform the extrapolations with all random selections
many times (say, 2,000 or 10,000 times) and then compute the mean of all extrapolation results.
For example, a single extrapolation might suggest 117 small families with an A-before-P bias, 29
with the opposite bias and 69 to be diverse; the next extrapolation might suggest 116 cases of
A-before-P bias, 34 opposite biases and 65 diverse families, etc. If we take the mean of these
frequencies over 2,000 extrapolations, we arrive at estimated frequencies of 115.07 A-before-P
bias, 33.09 opposite bias and 66.83 diverse. Added to the estimates from large families, this
results in an overall estimate of 154.07 A-before-P biases, 35.09 opposite biases and 84.83
diverse. This confirms the result from Section 3 that there is a significant trend for families
to be biased towards A-before-P order as against P-before-A or free orders (exact binomial test,
p < .001, estimated proportion .82).
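The extrapolation steps described above can be sketched as follows. This is a minimal Python illustration and not the R implementation referred to in the paper: the function names and the toy data are hypothetical, each small family is represented simply as a list of its members' types, and alternative bias directions for deviating families are drawn uniformly (a simplifying assumption; the text weights them by the general bias estimate and the within-family distribution).

```python
import random
from collections import Counter


def majority_type(members, rng):
    """Most frequent type among a small family's members; ties broken at random."""
    counts = Counter(members)
    top = max(counts.values())
    return rng.choice(sorted(t for t, c in counts.items() if c == top))


def extrapolate_once(small_families, types, p_biased, p_deviate, rng):
    """One random extrapolation; returns counts of declared biases and 'diverse'."""
    result = Counter()
    for members in small_families:
        if rng.random() >= p_biased:
            result["diverse"] += 1  # sole survivor of a diverse family
            continue
        observed = majority_type(members, rng)
        if rng.random() < p_deviate:
            # Deviating exception: the family bias was some other type.
            # Assumption: alternatives are chosen uniformly, not weighted.
            others = [t for t in types if t != observed]
            result[rng.choice(others)] += 1
        else:
            result[observed] += 1
    return result


def mean_extrapolation(small_families, types, p_biased, p_deviate, runs, rng):
    """Average the counts over many random extrapolations."""
    total = Counter()
    for _ in range(runs):
        total.update(extrapolate_once(small_families, types, p_biased, p_deviate, rng))
    return {key: total[key] / runs for key in list(types) + ["diverse"]}


if __name__ == "__main__":
    # Hypothetical toy data: four small families coded for argument order.
    toy = [["AP"], ["AP", "AP", "other"], ["other"], ["AP", "other"]]
    means = mean_extrapolation(toy, ("AP", "other"), 42 / 61, 0.045, 2000,
                               random.Random(1))
    print(means)
```

Because every family ends up in exactly one category per run, the mean frequencies always sum to the number of small families, which is a useful check on any implementation.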
In Section 2 I defined the strength of the universal pressure by the proportion of biased as
opposed to diverse families. Since, when extrapolating to small families, we use this proportion
for estimating to what extent small families are the sole survivors of families with a bias, we
can no longer re-compute this proportion from the extrapolation results. In other words, as far
as I can see, estimates of the strength of universals can only be taken from large families. The
estimate of the strength can therefore be defined as the Laplace estimator of biases discussed
above, i.e.

(5) $s = \frac{k+1}{n+2}$,

where k is the number of biased families out of a total of n families. Note that because of this
equality, extrapolations will not be of help when testing for what I called Scenario B
(non-directional family bias) in Section 2: the proportion of biased families is by definition
the same before and after the extrapolation. What can usefully be done, however, is to examine
whether the bias is still undirected after extrapolation to small families.

(Footnote:) A ready-to-use function for computing family biases in this way is available in an R
package written by Taras Zakharko at http://www.uni-leipzig.de/~autotyp/familybias.R.
Applying the extrapolation method to the non-directional family bias of predicative adjective
encoding (cf. Section 3) suggests that this is the case. In Section 3 we found no statistical
preference for any type. After extrapolation, we can estimate 34.63 families to be biased
towards verbal, about 30 towards nonverbal and 25.69 towards mixed encoding. These frequency
estimates are still not significantly different from what one would expect under a uniform (one
third each) distribution ($\chi^2 = 1.34$, p = .51). This suggests that the encoding of property
terms in predicative function is a genealogically stable property, not subject to any known
universal or large-scale areal pressure.
In the two examples reviewed so far, the results of significance tests were not affected by the
number of datapoints before vs. after extrapolation to isolates and small families. However,
this can be quite different because (a) isolates and small families may bring in critical
evidence and (b) a statistical test may be able to detect a signal only if the dataset reaches a
certain minimal size. This can be exemplified by examining the distribution of (some kind of)
gender in independent pronouns, based on Siewierska's (2005) WALS dataset (N = 381, after adding
a few data to which I had easy access in order to increase the number of stocks with more than
five members'). Without extrapolations, the data from large families suggest non-directional
family bias: of 17 large families, 12 have a bias; of these, 4 are biased towards having gender
and 8 against. A proportion of 12 families out of 17 being biased is borderline significant under
a binomial test (p = .07, estimated proportion .71). But the 4:8 ratio in the direction of the
bias does not suggest a significant preference (p = .194, estimated proportion .667), and so the
data seem to suggest Scenario B from Section 2: daughter languages seem to maintain whatever the
proto-language was like. If it had gender, gender is preserved; if it didn't have gender, it
doesn't develop it. However, the absence of a statistical signal could also just reflect the fact
that the total number of large families is very small (N s) and statistical tests don't have
enough power to detect trends. In addition, there could be a possible direction in the bias
specifically among small families and isolates. To find out, we can use the proportion of biases
among large families (12 out of 17) and extrapolate to small families and isolates.
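The two binomial tests on the large families can be checked with a few lines of code. The sketch below rests on assumptions that the text does not state explicitly: that the tests are one-sided exact binomial tests against a 50:50 null, and that the counts are 12 biased out of 17 large families, split 4 towards and 8 against gender. The function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): an exact one-sided binomial test."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Are biased families more frequent than diverse ones? (12 of 17 biased)
    print(f"biased vs. diverse: p = {binomial_upper_tail(12, 17):.3f}")
    # Is there a preferred direction among the biased families? (8 of 12 against)
    print(f"direction of bias:  p = {binomial_upper_tail(8, 12):.3f}")
```

Under these assumptions the first test comes out at about .07 and the second at about .19, i.e. borderline evidence for biases as such but no evidence for a preferred direction with so few families.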
The mean extrapolations suggest that about 42.2 (24%) families are biased towards and about 80.6
(45%) against having gender distinctions in independent pronouns. This difference matches the
4:8 ratio among the large families, but the larger number now allows detecting a statistically
significant signal (p < .001, estimated proportion about .66). The result seems to suggest a
universal bias against gender in pronouns.
However, in this case the extrapolations are based on only 16 families, and the result should
not be taken as establishing a worldwide trend against gender. The only way to put the results
on firmer grounds is to develop databases that collect more datapoints per family and thereby
pursue a data-collection strategy that is the exact opposite of how most typologists have
collected data in the past.
At any rate, as tentative as they are, the results on pronominal gender fit with Nichols's
(1992, 2003) hypothesis that gender in general is disfavored universally. It does not seem to be
particularly prone to inheritance unless there is support from neighboring families that have
gender (like in Europe or Africa). Support from neighboring languages is an issue of areal
conditions determining family biases, which is one of the topics in the following section.

' I added Tobelo, Galela, and Somali as languages with pronominal gender.
6 Extending the Family Bias approach to multivariate distributions
In the preceding I have limited my attention to the distribution of a single variable. But the
distribution of one variable may be conditioned by other variables, for example other structural
variables (such as word order) or areal or social variables (such as Sprachbund membership or
the presence of certain kinship systems).'' I subsume all kinds of conditional effects under the
term 'conditional pressure'.
Just like in univariate designs, conditional pressure can be strong or weak, and the lower
bounds of this strength can be estimated by the proportion of biased vs. diverse families. Unlike
in univariate designs, however, these strengths need not be uniform but can differ across
conditions. For structural factors, pressure strength can be expected to be uniform across
conditions in the case of bi-directional universals. An example is the classical hypothesis that
OV structures favor postpositions and, conversely, that VO structures favor prepositions
(Greenberg 1963, Dryer 1992). In the case of uni-directional universals (e.g. post-nominal
relative clauses under non-verb-final word order conditions, but no trend towards pre-nominal
relative clauses under verb-final conditions), one expects strong pressure in one condition
(here, under the non-verb-final condition), but under the other condition there can be many
diverse families, or families can be biased in random ways. The universal is supported as long as
the trend towards biases is stronger under one than under the other condition.
For areal factors, such differences are in fact expected: when comparing the distribution of
features inside vs. outside an area, one expects strong pressure towards some feature F (less
diversity) inside the area but only weak pressure against F (more diversity) outside the area.
Areal diffusion leads to the widespread adoption of F, resulting in an increased frequency of F.
This is in contrast to the world outside the area, where nothing is suspected to affect the
distribution of F: it can tend to be diverse within families, or families can be biased in random
ways. All that matters for areality is that there is a significantly higher proportion of
families with an F-bias inside than outside the area.
Testing bivariate hypotheses like these is complicated by the fact that the relevant condition
may not hold for entire families: for example, stocks like Indo-European or Sino-Tibetan contain
both VO and OV branches. A solution to this problem comes from the fact that the Family Bias
Theory makes no assumptions about the taxonomic level or time-depth at which biases can be
found; it is not even required that biases are always found on the same taxonomic level across
families (cf. Section 2). This suggests that family biases can be estimated at whatever is the
highest taxonomic level at which subgroups are not split with regard to the relevant condition.
In Indo-European, for example, one can estimate biases within OV and VO branches.
'' or many such variables together and interacting with each other. Here I concentrate on simple
cases. For a discussion of interactions between conditioning variables, see Bickel (2008a) and
Cysouw (2010a).
There is one further complication, though: given the often sketchy knowledge that is available
on subgrouping, it is often impossible to find plausible subgroups, or, even if the taxonomy is
well established, subgroups may be diverse with regard to some condition of interest. In both
these cases, I propose to posit pseudo-groups, based on the difference in the relevant
conditions, e.g. an OV pseudo-group vs. a VO pseudo-group. Importantly, these pseudo-groups are
posited solely for the purposes of testing whether differences in the condition have an effect on
family biases. They are not evidence for real subgroups because changes in typological
properties can be due to factors that are entirely different from the kind of arbitrary and
idiosyncratic innovations that define genealogical trees. However, since some change must have
split the family, it is a legitimate isogloss for testing purposes under the Family Bias Theory;
the question is only whether the isogloss is associated with different responses to such an
extent that the pseudo-groups are now biased in a predictable direction.
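Operationally, positing pseudo-groups amounts to partitioning a family's languages by the value of the conditioning variable. The following Python sketch illustrates only this bookkeeping step; the function name, the language labels and the family label are hypothetical placeholders, not data from any database.

```python
from collections import defaultdict


def pseudo_groups(languages):
    """Split each family into pseudo-groups that are homogeneous for a condition.

    `languages` is an iterable of (name, family, condition) triples; the result
    maps (family, condition) to the list of member languages.
    """
    groups = defaultdict(list)
    for name, family, condition in languages:
        groups[(family, condition)].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical toy data: one family that is split between OV and VO.
    sample = [
        ("lang1", "FamilyX", "OV"),
        ("lang2", "FamilyX", "OV"),
        ("lang3", "FamilyX", "VO"),
        ("lang4", "FamilyY", "VO"),
    ]
    for key, members in pseudo_groups(sample).items():
        print(key, members)
```

Each resulting group can then be treated like a family when estimating biases, with the caveat stated above that such groups are test devices and make no claim about genealogical subgrouping.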
In the following I first exemplify this approach with hypotheses on structural and then on areal
factors.
6.1 Example 1: relative clause position and word order
The first example concerns the hypothesis that the odds for relative clauses to be post-nominal
are higher under non-verb-final than under verb-final conditions (Greenberg 1963, Dryer 1992,
Hawkins 1994, among others). To examine this hypothesis, I combined Dryer's (2005a) WALS dataset
on relative clause position with his dataset on dominant main clause verb positions, excluding
languages with flexible orders (Dryer 2005b), but adding more critical data on Sinitic (Yue
2003). Both small and large families can be homogenous or split on the relevant condition, i.e.
can contain both non-verb-final and verb-final languages. In the dataset (N 1I languages), there
are 22 large stocks (i.e. with at least 5 members) and ve small families. Of the 22 stocks, 14
(or 64%) are homogeneously verb-final or non-verb-final. In 4 (18%) stocks (Indo-European,
Sino-Tibetan, Cushitic, and Austroasiatic), homogenous branches can be found at the major branch
level.
In some cases, determining family biases at lower levels leaves small families or even single
languages stranded as the sole representatives of their branch, e.g. Albanian and Greek in
Indo-European, or Lolo-Burmese (with 3 representatives), two Naga languages and a few others in
Sino-Tibetan. In some cases, an entire branch ends up with single-member groups: the Western
Oceanic group of Austronesian, for example, is represented in the database by Tolai and Tawala.
Since the two languages differ in basic word order, they are assumed here to represent their own
single-member groups.
In 3 of the 22 stocks (13%), homogenous groups can be found only by positing pseudo-subgroups.
One example is Uto-Aztecan. While the Northern branch is homogeneously verb-final, and the
Aztecan group of the Southern branch is homogeneously non-verb-final, the Sonoran group of the
Southern branch is mixed. There are two non-verb-final and five verb-final languages, and the
distinction does not match any subgrouping represented in the AUTOTYP taxonomy assumed here
(although it might of course fit other possible subgrouping hypotheses). In this case, I posit
two pseudo-subgroups for computing family biases, a non-verb-final one and a verb-final one.
Another example is found in the Bantu branch of Benue-Congo: all but one Bantu language in the
database are non-verb-final. Here I posit a large non-verb-final pseudo-group and a small
verb-final pseudo-group which is represented only by a single language in the database (viz.
Tunen; Mous 1997). The third case where pseudo-groups are necessary is Arawakan. This stock is
represented in the database with only single representatives from each branch, with five
non-verb-final and one verb-final branch. Here I assume a non-verb-final pseudo-group with five
members and a small single-member group.'

(a) large families only
            final    non-final    Sum
diverse       2          0          2
Rel-N         6          1          7
N-Rel         1         17         18
Sum           9         18         27

(b) with extrapolation to small families
            final    non-final    Sum
diverse     22.79       5.80      28.59
Rel-N       29.73       1.95      31.68
N-Rel       32.48     127.25     159.73
Sum         85.00     135.00     220.00

Table 2: Family biases in relative clause position dependent on main clause verb position
With this, we arrive at a total of 27 large families, including 3 pseudo-groups. Of the 27
families, 14 (52%) are at the stock level, 9 (33%) at the highest (major) branch level, and 4
(15%) at lower levels. Table 2a cross-tabulates family biases against main clause verb order.
The hypothesis is that biases towards N-Rel (Noun-Relative Clause) sequences are much more likely
in non-verb-final than in verb-final families. This can be tested with a Fisher Exact Test
comparing the odds for families with N-Rel biases against families with the opposite bias under
verb-final vs. non-verb-final conditions. The result suggests a significant and strong effect
(one-sided p < .001, estimated odds ratio' e1.e1). This is obviously caused by the fact that only
one non-verb-final family in the database (viz. Sinitic) is biased towards pre-nominal relative
clauses. The strength of the universal can be estimated by the Laplace estimator of the
probability of biases (cf. 5), suggesting a pressure of .95 under the critical condition of
non-verb-final order, i.e. a fairly strong universal.
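The one-sided Fisher Exact Test can be reproduced from the biased families in Table 2a. The sketch below is an illustration, not the author's code: the function name is hypothetical, and it assumes the cell counts 17 and 1 (N-Rel and Rel-N biases under non-verb-final order) and 1 and 6 (under verb-final order) as read from the table. It also prints the simple cross-product odds ratio, which is generally not the same number as a conditional maximum-likelihood estimate such as the one reported by R's fisher.test.

```python
from math import comb


def fisher_upper_tail(a, b, c, d):
    """One-sided Fisher exact p-value P(X >= a) for the 2x2 table [[a, b], [c, d]].

    With all margins fixed, the top-left cell follows a hypergeometric distribution.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    favourable = sum(comb(row1, x) * comb(total - row1, col1 - x)
                     for x in range(a, upper + 1))
    return favourable / comb(total, col1)


if __name__ == "__main__":
    # Rows: non-verb-final, verb-final; columns: N-Rel bias, Rel-N bias.
    a, b, c, d = 17, 1, 1, 6
    print(f"one-sided p = {fisher_upper_tail(a, b, c, d):.6f}")
    print(f"cross-product odds ratio = {(a * d) / (b * c):.0f}")
```

With these counts only two tables are at least as extreme as the observed one, so the p-value is 127/480700, well below .001, in line with the result reported above.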
Under the other condition, verb-final order, the bias probability is .73. This suggests that
under verb-final conditions, relative clause position is genealogically stable (Scenario B in
Section 2) or, alternatively, that there is a universal trend favoring pre-nominal clauses
(Scenario A). The 6:1 ratio in Table 2a is suggestive of a directional trend, but the counts are
small and exclude data from small families and isolates.
In response to this, I performed extrapolations following the same procedure as described in
Section 5, separately for each condition. Using the bias probability estimates of .95 for
non-final and .73 for final word order, this results in the mean estimates summarized in Table
2b.
' The single verb-final Arawakan language in Dryer's (2005b) database is Tariana, but this
language would in fact seem to be more accurately coded as lacking a dominant order (Aikhenvald
2003). On either analysis, Arawakan requires pseudo-groups until possible subgroupings are
robustly established.
' Although not commonly used in typology, the odds ratio is a standard and useful statistic to
compare proportions across conditions. It is defined as the ratio between the odds, and so an
odds ratio of about ee means that the odds for biases towards post-nominal relative clauses are
ee times higher in non-verb-final than in verb-final families.
Te extiapolations conim the ist nding fiom the laige families theie is a signicant
tiend foi families to be biased towaids post-nominal ielative clauses undei non-veib-nal con-
ditions (lishei lxact test, p .cc1,

.vs). Te mean estimated numbei of non-veib-nal
families with pie-nominal ielative clauses is 1.v. Tis guie iesults fiom the fact apait fiom
Sinitic, in 1sv out of zccc extiapolations (i.e. in v), an additional non-veib-nal language
was estimated to iepiesent a family with an oiiginal bias towaids pie-nominal ielative clauses.
This language is the Sino-Tibetan language Bai (Wiersma 2003), which has SVO main clauses
and pre-nominal relative clauses. The exact position of Bai within Sino-Tibetan is controversial,
and Nichols & Bickel's (2009) taxonomy treats Bai as a stock-level isolate. It is possible
that Bai comes from a branch that originally had post-nominal relative clauses and changed
to pre-nominal order under Sinitic influence, i.e. that Bai comes from a diverse branch. It is
also possible, however, that Bai inherited pre-nominal relative clauses from one of its ancestors
(which might have changed to pre-nominal order earlier, again possibly under Sinitic influence
or even identity with proto-Sinitic). Without further reconstruction and detailed comparative
work, it is impossible to decide between these scenarios. All that we know for certain is that
Bai now has pre-nominal relative clauses and that, worldwide, the position of relative clauses
in non-verb-final languages is relatively stable over time (estimated at .v). This favors a
scenario whereby Bai inherited its ordering principles from its branch ancestor and does not
reflect recent change under Sinitic influence. In the extrapolations, this high bias estimate of
.v results in Bai being taken to reflect a bias towards pre-nominal relative clauses in v of the
2000 extrapolations, pushing the mean number up to 1.v.
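The arithmetic behind such a mean can be sketched as a simple resampling loop. This is not the procedure of Section 5 itself: the sketch only assumes that, in each of the 2000 runs, an isolate such as Bai is counted as reflecting a family bias with a probability equal to the bias estimate. The function name, the seed, and the probability 0.9 are hypothetical placeholders, not the estimate reported above.

```python
import random

def mean_biased_families(p_bias, known_biased=1, n_runs=2000, seed=1):
    """Mean number of biased families across resampling runs.

    known_biased: families already established as biased (here: Sinitic).
    p_bias: assumed probability that the isolate reflects a family bias
            (placeholder value, not taken from the chapter).
    """
    rng = random.Random(seed)
    # One Bernoulli draw per run: is the isolate counted as biased this time?
    hits = sum(1 for _ in range(n_runs) if rng.random() < p_bias)
    return known_biased + hits / n_runs

if __name__ == "__main__":
    print(round(mean_biased_families(0.9), 2))
```

Under this reading, the mean is simply the number of securely biased families plus the share of runs in which the isolate was counted as biased, so a high bias estimate pushes the mean close to the next whole number.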
The second finding from the large families was that there is a possible preference for
pre-nominal relative clauses under verb-final conditions. Table 2a suggests odds of e1 for this. But
this is not confirmed by the extrapolations in Table 2b, where the odds (zv.IIz.z .vz) go in a
different direction but are not significantly different from 1 anyway (p .1s, .e). This makes
it likely that there is no directional bias and that instead the bias strength of .I reflects a
fair degree of genealogical stability (cf. Scenario B in Section 2).
6.2 Example 2: hotbeds of pronominal gender
At the end of Section 5 we observed tentative evidence for universal pressure against pronominal
gender. However, as Nichols (1992, 2003) notes for gender in general, pronominal gender
tends to be better retained in families when they cluster together with similar families in
hotbeds: while the phenomenon does not appear to spread easily, its retention seems to be
favored in specific regions.
In order to test this hypothesis, I classified the data from Siewierska (2005) into three hotbeds
based on Nichols's (1992) suggestions: Africa (including adjacent Semitic languages), (Western)
Europe (up to a line from the Carpathians following the Wisła to the Baltic Sea, cf. Nichols &
Bickel 2009) and the Sahul area (including near islands up to the Wallace line and collapsing the
strata postulated by Nichols 1997b; see Map 1). I then computed family biases within and
outside the hotbeds.
The dataset contains 1 large families (with at least five members), I small families, and
1z isolates. Most families are located completely within or outside hotbeds, but 3 of the 52
(6%) families are split: Indo-European, Uralic, and Austronesian. Within Indo-European,
non-split taxa can be found at the major branch level except for Balto-Slavic, which according to
Nichols & Bickel's (2009) narrow definition of Europe splits into subbranches within (West
Slavic) and subbranches (East Slavic and Baltic) outside the European hotbed. The same is true
of Uralic, where non-split taxa can only be found at the lowest taxonomic levels since even
Finno-Ugric is split by the narrow definition (leaving Hungarian inside the European hotbed
and Finnish outside). The situation is again similar in Austronesian, where the Sahul hotbed
boundary crosscuts the Oceanic and Central Malayo-Polynesian subgroups. As a result,
non-split groups can only be found at relatively shallow taxonomic levels, each with small numbers
of members (below 5). The rest of Malayo-Polynesian (the Western Malayo-Polynesian
non-clade of Nichols & Bickel 2009) is again split, and for lack of established subgrouping it is divided
here for statistical purposes into a large pseudo-group (N s) outside and a smaller pseudo-group
(N 1) inside the Sahul hotbed.
Map 1: The distribution of pronominal gender across hotbeds (Siewierska 2005, with some
additions in South America and Papua New Guinea). Symbol shapes distinguish Africa, (Western)
Europe, Sahul, and the rest of the world; filled symbols denote presence, empty symbols absence of
pronominal gender.
The splits leave a total of 18 large unsplit groups (with more than five members), tabulated
in Table 3a. This is a small number to base statistical estimates on. For the Laplace estimator (cf.
Section 5) this means loss of precision and more random guessing on the extent to which small
families represent biases. To some extent this is compensated by the resampling strategy described in
Section 5, but the results must clearly be taken as preliminary. On the basis of Table 3a the
bias estimator is .70 inside and .58 outside the hotbeds: as expected, it is a bit more
likely for families to be biased (in some direction) inside than outside the hotbeds. The bias
estimators result in a mean extrapolation to small families as given in Table 3b. An analysis
of the extrapolations shows that the odds for biases towards gender are 2.3 times higher
inside than outside the hotbeds, which is significant under a Fisher Exact Test (p .czv). This
supports the hypothesis that gender is better preserved in families when they cluster together
in hotbeds.
                  inside   outside      Sum
diverse                2         4        6
with gender            4         0        4
without gender         2         6        8
Sum                    8        10       18
(a) large families only
                  inside   outside      Sum
diverse            29.03     47.73    76.76
with gender        29.10     16.19    45.29
without gender     39.87     51.08    90.95
Sum                98.00    115.00   213.00
(b) with extrapolation to small families
Table 3: Family biases in pronominal gender inside vs. outside hotbeds
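The bias estimates quoted in Section 6.2 can be retraced from the large-family counts of Table 3a. The sketch below assumes that the Laplace estimator takes the rule-of-succession form (s + 1)/(n + 2), where s is the number of biased families and n the number of large families in an area; this form, and all names in the code, are assumptions for illustration, since the estimator itself is defined in an earlier section.

```python
def laplace_estimate(successes, trials):
    """Rule-of-succession estimate (s + 1) / (n + 2); assumed form of the estimator."""
    return (successes + 1) / (trials + 2)

# Large-family counts as read from Table 3a: (diverse, with gender, without gender).
table_3a = {"inside": (2, 4, 2), "outside": (4, 0, 6)}

def bias_estimate(counts):
    diverse, with_gender, without_gender = counts
    biased = with_gender + without_gender  # biased in either direction
    total = diverse + biased               # all large families in the area
    return laplace_estimate(biased, total)

estimates = {area: bias_estimate(c) for area, c in table_3a.items()}
print({area: round(p, 2) for area, p in estimates.items()})
# -> {'inside': 0.7, 'outside': 0.58}
```

Adding one to the count of biased families and two to the total keeps the estimate away from 0 and 1, which matters when an area contributes only a handful of large families.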
7 Discussion
The family bias estimates reported here are preliminary and clearly need far denser
sampling of families. This is a sampling strategy that is the exact opposite of what has been
recommended in the past, where typologists have emphasized that samples should avoid picking
many representatives from the same families, i.e. that they should be genealogically balanced.
It is instructive to compare the results reported here to the conclusions that one might draw on
the basis of genealogically balanced sampling.
Since Dryer (1989), the standard in the field has been to create samples in which each family
contributes one single datapoint. When families are diverse, e.g. some members have gender
and others don't, they are sometimes counted as contributing several datapoints (and there
are more or less refined methods for dealing with this, cf. Bickel 2008b). The key point of the
method, however, is that families are always treated statistically in the same way as isolates
and that any trends or biases within families are ignored. The result of such a per-family count
(using Bickel's (2008b) algorithm) is given in Table 4. Unlike in Table 3b, the odds ratio of this
table is not near significance under a Fisher Exact test ( 1.1, p .z1), i.e. there is no evidence
for a higher chance of hotbed languages to have pronominal gender.
                  inside   outside      Sum
with gender           39        42       81
without gender        58        91      149
Sum                   97       133      230
Table 4: The distribution of pronominal gender inside vs. outside hotbeds in a
genealogically balanced database
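The test on the per-family counts can be retraced with a few lines of standard-library code. The sketch below computes the sample odds ratio and a two-sided Fisher exact p-value (summing all tables that are no more probable than the observed one) for the counts of Table 4. It is an illustration only: the function name is hypothetical, and the analyses in this chapter were run in R, whose fisher.test reports a conditional maximum-likelihood odds ratio that can differ slightly from the sample odds ratio used here.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables that are at most as probable as the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))

# Table 4: rows = with / without gender, columns = inside / outside hotbeds.
a, b, c, d = 39, 42, 58, 91
odds_ratio = (a * d) / (b * c)  # sample odds ratio, about 1.46
p_value = fisher_exact_two_sided(a, b, c, d)
print(round(odds_ratio, 2), round(p_value, 2))
```

The same function can be applied to any 2x2 table of counts; for the extrapolated values in Table 3b, which are means rather than counts, the cells would first have to be rounded.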
The difference in the results is not one of statistical power since the sample sizes are
comparable. The difference is a matter of methodological principle. Genealogically balanced sampling
makes the implicit assumption that if a feature is shared by the daughter languages of a family,
this can only reflect faithful replication, with no other motivation or cause than sheer inertia:
a feature is preserved just because the parent language had it and speakers are conservative.
Therefore, if one wants to test for factors that might influence the feature, such as location
inside a hotbed or some structural condition (e.g. word order), one should not count all languages
inside the same family as independent datapoints but instead reduce the data to genealogically
independent datapoints, i.e. a genealogically balanced dataset.
However, there is no reason to assume that retention must be free of conditions or causes: in
fact, retention can be, and often is, favored by universal preferences or areal factors. This becomes
particularly clear in hypotheses that are specifically about the conditions under which features
are best retained (most stable), such as Nichols's (2003) hypothesis on gender that is tested
here: what is at stake is whether families tend to retain pronominal gender more often inside
rather than outside hotbeds. Because retention of a feature in a family necessarily leads to a
bias towards that feature in the synchronic daughter languages, this hypothesis directly predicts
that we find more families biased towards gender inside than outside hotbeds. The results in
Table 3b confirm this.
Genealogically balanced sampling, by contrast, removes all data we have on inheritance
patterns within families and therefore makes it impossible to test the hypothesis. As a result,
the data in Table 4 display some aspects of the synchronic distribution of gender, but they do
not allow any inference on the processes that lead to this distribution. But this is in conflict
with the very nature of typological hypotheses: synchronic distributions (except perhaps for
those of creole and sign languages) must come from somewhere, and the only way a typological
factor can play a role in these distributions is by affecting the way languages change over time.
Since they are hypotheses on diachronic changes, the only possible way to test them is to try
and estimate the extent to which changes lead to systematic biases across families.
8 Conclusions
In this chapter I proposed a theory that links types of family distributions to specific historical
scenarios. The key evidence for the theory is that the number of typological changes in known
families is far higher than what would be expected if typological distributions had persisted
over deep time, going back to the origins of the human language faculty. This casts doubt on
any attempt to use typological data for assessing the kind of language that the first
representatives of our species spoke. At the same time, the theory proposed here suggests that typological
distributions are systematically driven by the interaction of high-fidelity replication with
various kinds of external pressure, such as universal principles and areal diffusion trends.
Therefore, distributional biases in families do not allow a direct and generalizing estimation
of specifically genealogical stability (pace Parkvall 2008, Wichmann & Holman 2009 or Bakker
et al. 2009). Any such estimation needs to factor in the possible effects of external pressure, and
this means that any progress in estimating stability indices depends on our knowledge of such
pressure, including the effects of universals. Rather than leading to sweeping hypotheses like
"pronominal gender is stable", the analyses presented above suggest that pronominal gender
is significantly more stable inside than outside hotbeds, but that in both situations there is in
fact a trend for families not to develop gender in the first place. Similarly, instead of
determining whether the position of relative clauses is generally stable or unstable, we found that
this depends on word order conditions: under verb-final conditions, relative clause position is
genealogically fairly stable, i.e. it seems to just follow whatever the ancestor language had. But
under non-verb-final conditions, there is strong universal pressure for biasing families towards
post-nominal position.
This confirms Nichols's (2003) insight that the overall stability of a typological property
is a matter of competing forces and typically results from the combined effects of faithful
replication and external pressure. A full understanding of language change and typological
distributions must simultaneously engage in research on universals, language contact effects,
and the extent to which patterns are faithfully replicated. There is no shortcut.
These findings also have a practical consequence: because all estimates of inheritance and
external pressure, including any extrapolation to isolates, are based on distributions in known
families, typological databases need to sample families as densely as possible. This suggests
a radical move away from the classical one-language-per-family sampling strategy that has
dominated database development in the past.
References
Aikhenvald, Alexandra Y., 2003. A grammar of Tariana. Cambridge: Cambridge University Press.
Bakker, Dik, André Müller, Viveka Velupillai, Søren Wichmann, Cecil H. Brown, Pamela Brown,
Dmitry Egorov, Robert Mailhammer, Anthony Grant, & Eric W. Holman, 2009. Adding typology to
lexicostatistics: a combined approach to language classification. Linguistic Typology 13, 169–181.
Bickel, Balthasar, 2007. Typology in the 21st century: major current developments. Linguistic Typology
11, 239–251.
Bickel, Balthasar, 2008a. A general method for the statistical evaluation of typological distributions.
Ms. University of Leipzig, [http://www.uni-leipzig.de/~bickel/research/papers/
testing_universals_bickel2008.pdf].
Bickel, Balthasar, 2008b. A refined sampling procedure for genealogical control. Language Typology and
Universals 61, 221–233.
Bickel, Balthasar & Johanna Nichols, 2005. Exponence of selected inflectional formatives. In
Haspelmath, Martin, Matthew S. Dryer, David Gil, & Bernard Comrie (eds.) The world atlas of
language structures, 90–93. Oxford: Oxford University Press.
Blevins, Juliette, 2004. Evolutionary phonology: the emergence of sound patterns. New York: Cambridge
University Press.
Bybee, Joan, 1988. The diachronic dimension in explanation. In Hawkins, John A. (ed.) Explaining
language universals, 350–379. Oxford: Blackwell.
Cysouw, Michael, 2010a. Dealing with diversity: towards an explanation of NP-internal word order
frequencies. Linguistic Typology 14, 253–286.
Cysouw, Michael, 2010b. On the probability distribution of typological frequencies. In Ebert, Christian,
Gerhard Jäger, & Jens Michaelis (eds.) The Mathematics of Language, 29–35. Springer.
Dahl, Östen & Viveka Velupillai, 2005. Tense and aspect. In Haspelmath, Martin, Matthew S. Dryer,
David Gil, & Bernard Comrie (eds.) The world atlas of language structures, 266–282. Oxford: Oxford
University Press.
Daniel, Michael, 2005. Plurality in Independent Personal Pronouns. In Haspelmath, Martin, Matthew S.
Dryer, David Gil, & Bernard Comrie (eds.) The world atlas of language structures, 146–149. Oxford:
Oxford University Press.
Dryer, Matthew S., 1989. Large linguistic areas and language sampling. Studies in Language 13,
257–292.
Dryer, Matthew S., 1992. The Greenbergian word order correlations. Language 68, 81–138.
Dryer, Matthew S., 2005a. Order of relative clause and noun. In Haspelmath, Martin, Matthew S. Dryer,
David Gil, & Bernard Comrie (eds.) The world atlas of language structures, 366–369. Oxford: Oxford
University Press.
Dryer, Matthew S., 2005b. Order of subject, object, and verb. In Haspelmath, Martin, Matthew S. Dryer,
David Gil, & Bernard Comrie (eds.) The world atlas of language structures, 330–341. Oxford: Oxford
University Press.
Dryer, Matthew S., 2005c. Polar questions. In Haspelmath, Martin, Matthew S. Dryer, David Gil, &
Bernard Comrie (eds.) The world atlas of language structures, 470–473. Oxford: Oxford University
Press.
Dryer, Matthew S., 2005d. Position of case affixes. In Haspelmath, Martin, Matthew S. Dryer, David Gil,
& Bernard Comrie (eds.) The world atlas of language structures, 210–213. Oxford: Oxford University
Press.
Greenberg, Joseph H., 1963. Some universals of grammar with particular reference to the order of
meaningful elements. In Greenberg, Joseph H. (ed.) Universals of Language, 73–113. Cambridge,
Mass.: MIT Press.
Greenberg, Joseph H., 1995. The diachronic typological approach to language. In Shibatani, Masayoshi
& Theodora Bynon (eds.) Approaches to language typology, 143–166. Oxford: Clarendon.
Hall, Christopher J., 1988. Integrating diachronic and processing principles in explaining the suffixing
preference. In Hawkins, John A. (ed.) Explaining language universals, 321–349. Oxford: Blackwell.
Haspelmath, Martin, 1999. Optimality and diachronic adaptation. Zeitschrift für Sprachwissenschaft 18,
180–205.
Haspelmath, Martin, Matthew S. Dryer, David Gil, & Bernard Comrie (eds.), 2005. The world atlas of
language structures. Oxford: Oxford University Press.
Hawkins, John A., 1994. A performance theory of order and constituency. Cambridge: Cambridge
University Press.
Maslova, Elena, 2000a. A dynamic approach to the verification of distributional universals. Linguistic
Typology 4, 307–333.
Maslova, Elena, 2000b. Stochastic models in typology: obstacle or prerequisite? Linguistic Typology 4,
357–364.
Maslova, Elena & Tatiana Nikitina, 2007. Stochastic universals and dynamics of cross-linguistic
distributions: the case of alignment types. Ms. Stanford University,
http://anothersumma.net/Publications/Ergativity.pdf.
McBrearty, Sally & Alison S. Brooks, 2000. The revolution that wasn't: a new interpretation of the
origin of modern human behavior. Journal of Human Evolution 39, 453–563.
Mous, Maarten, 1997. The position of the object in Tunen. In Déchaine, Rose-Marie & Victor Manfredi
(eds.) Object Positions in Benue-Kwa, 123–137. The Hague: Holland Academic Graphics.
Nichols, Johanna, 1992. Linguistic diversity in space and time. Chicago: The University of Chicago Press.
Nichols, Johanna, 1997a. Modeling ancient population structures and population movement in
linguistics and archeology. Annual Review of Anthropology 26, 359–384.
Nichols, Johanna, 1997b. Sprung from two common sources: Sahul as a linguistic area. In McConvell,
Patrick (ed.) Archeology and linguistics: global perspectives on Ancient Australia. Melbourne.
Nichols, Johanna, 2002. Monogenesis or polygenesis? Typological perspectives on language origins.
Plenary lecture at the Annual Meeting of the Linguistic Society of America, January 3, 2002.
Nichols, Johanna, 2003. Diversity and stability in language. In Janda, Richard D. & Brian D. Joseph
(eds.) Handbook of Historical Linguistics, 283–310. London: Blackwell.
Nichols, Johanna & Balthasar Bickel, 2009. The AUTOTYP genealogy and geography database: 2009
release. Electronic database, http://www.uni-leipzig.de/~autotyp.
Parkvall, Mikael, 2008. Which parts of language are most stable? Language Typology and Universals 61,
234–250.
R Development Core Team, 2011. R: a language and environment for statistical computing. Vienna: R
Foundation for Statistical Computing, http://www.r-project.org.
Siewierska, Anna, 2005. Gender distinctions in independent personal pronouns. In Haspelmath,
Martin, Matthew S. Dryer, David Gil, & Bernard Comrie (eds.) The world atlas of language structures,
182–185. Oxford: Oxford University Press.
Stassen, Leon, 2005. Predicative adjectives. In Haspelmath, Martin, Matthew S. Dryer, David Gil, &
Bernard Comrie (eds.) The world atlas of language structures, 478–481. Oxford: Oxford University
Press.
Thomason, Sarah Grey & Terrence Kaufman, 1988. Language contact, creolization, and genetic
linguistics. Berkeley: University of California Press.
Wichmann, Søren & Eric W. Holman, 2009. Temporal stability for linguistic typological features.
Munich: LINCOM Europa.
Wiersma, Grace, 2003. Yunnan Bai. In Thurgood, Graham & Randy J. LaPolla (eds.) The Sino-Tibetan
languages, 651–673. London: Routledge.
Yue, Anne O., 2003. Chinese dialects: grammar. In Thurgood, Graham & Randy J. LaPolla (eds.) The
Sino-Tibetan languages, 84–125. London: Routledge.