Probability Distributions
The concept of probability is the key to making
statistical inferences by sampling a population
What we are doing is trying to ascertain the probability
of an event having a given outcome, e.g.
We summarize a sample statistically and want to
make some inferences about it, such as what
proportion of the population has values within a
given range; we could do this by finding the area
under the curve in a frequency distribution
This requires us to be able to specify the distribution of
a variable before we can make inferences
David Tenenbaum GEOG 090 UNC-CH Spring 2005
Source: Earickson, RJ, and Harlin, JM. 1994. Geographic Measurement and Quantitative
Analysis. USA: Macmillan College Publishing Co., p. 100.
Probability: An Example
For example, suppose we have a data set in which we
count the number of malls located in each of six cities:
City    # of Malls
1       1
2       4
3       4
4       4
5       2
6       3
Each count of the # of malls in a city is an event
Sample space: Outcome #1, Outcome #2, Outcome #3, Outcome #4
We might wonder if we randomly pick one of these six
cities, what is the chance that it will have n malls?
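The question above can be answered by counting. The following is a minimal Python sketch, assuming each of the six cities is equally likely to be picked; the names `malls_per_city` and `outcome_probabilities` are illustrative and not part of the original slides.

```python
from collections import Counter
from fractions import Fraction

# Mall counts for cities 1..6, as listed in the example above.
malls_per_city = [1, 4, 4, 4, 2, 3]

def outcome_probabilities(values):
    """Return {outcome: probability} for one uniformly random pick from values."""
    counts = Counter(values)
    total = len(values)
    return {x: Fraction(n, total) for x, n in sorted(counts.items())}

pmf = outcome_probabilities(malls_per_city)
for x, p in pmf.items():
    print(f"P({x} malls) = {p} = {float(p):.2f}")
```

Under that assumption, three of the six cities have 4 malls, so that outcome carries half of the total probability, and the four probabilities sum to 1.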
[Plot: probability P(xi) against xi for xi = 1, 2, 3, 4; vertical axis marked at 0.25 and 0.50]
This plot uses thin lines to denote that the probabilities are
massed at discrete values of this random variable
$$\sum_{i=1}^{k} P(x_i) = 1$$
$$\mu = \sum_{i=1}^{k} x_i \, P(x_i)$$
$$\sigma^2 = \sum_{i=1}^{k} (x_i - \mu)^2 \, P(x_i)$$
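The three sums on this slide (total probability, mean, and variance of a discrete random variable) can be checked numerically. This is a sketch only, assuming the probabilities from the mall example (1/6 for 1, 2, and 3 malls; 3/6 for 4 malls); the names `mean`, `variance`, `mu`, and `sigma2` are illustrative.

```python
from fractions import Fraction

# Assumed PMF from the mall example: P(1) = P(2) = P(3) = 1/6, P(4) = 3/6.
pmf = {1: Fraction(1, 6), 2: Fraction(1, 6), 3: Fraction(1, 6), 4: Fraction(3, 6)}

def mean(pmf):
    """Sum of x_i * P(x_i) over all outcomes."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Sum of (x_i - mean)^2 * P(x_i) over all outcomes."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

total = sum(pmf.values())  # should equal 1 for a valid distribution
mu = mean(pmf)
sigma2 = variance(pmf)
print(total, mu, sigma2)
```

Exact fractions are used instead of floats so that the check that the probabilities sum to 1 is not affected by rounding.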
[Figure: continuous probability curve against x; total area under the curve = 1]
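[Figure: density curve f(x) against x between a and b, area=1, with the interval from c to d marked]
$$P(c \le x \le d) = \int_{c}^{d} f(x)\,dx$$
[Figure: density curve f(x) against x, with c approaching d]
$$P(c \le x \le d) \to 0 \text{ as } c \to d$$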
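For a continuous variable, the probability of falling between c and d is the area under f(x) over that interval, and it shrinks toward 0 as c approaches d. The sketch below illustrates this with a simple trapezoid-rule integration. The triangular density on [0, 2] is a hypothetical choice made only for illustration, and the function names are not from the slides.

```python
def triangular_density(x, a=0.0, b=2.0):
    """Hypothetical density on [a, b]: rises linearly to a peak at the midpoint."""
    if x < a or x > b:
        return 0.0
    mid = (a + b) / 2.0
    peak = 2.0 / (b - a)  # height that makes the total area equal 1
    if x <= mid:
        return peak * (x - a) / (mid - a)
    return peak * (b - x) / (b - mid)

def probability_between(f, c, d, steps=1000):
    """Approximate the integral of f from c to d with the trapezoid rule."""
    if d == c:
        return 0.0
    h = (d - c) / steps
    total = 0.5 * (f(c) + f(d))
    for i in range(1, steps):
        total += f(c + i * h)
    return total * h

total_area = probability_between(triangular_density, 0.0, 2.0)
p_mid = probability_between(triangular_density, 0.5, 1.5)
p_narrow = probability_between(triangular_density, 0.999, 1.001)
print(total_area, p_mid, p_narrow)
```

The narrow interval around x = 1 has a very small probability even though the density is highest there, which is the sense in which the probability of any single exact value is 0.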
Probability Rules
Now that we have described how to apply individual
unconditional probabilities, we can move on to looking
at the rules for combining multiple probabilities
A useful aid when discussing probabilities is the Venn
diagram, which depicts multiple probabilities and their
relationships as a graphical arrangement of sets:
The rectangle that forms the area
of the Venn Diagram represents the
sample (or probability) space,
which we have defined above
Probability Rules
We can use a Venn diagram to describe the
relationships between two sets or events, and the
corresponding probabilities:
The union of sets A and B
(written symbolically as A ∪ B) is
represented by the areas enclosed
by set A and B together, and can be
expressed by OR (i.e. the union of
the two sets includes any location
in A or B, i.e. blue OR red)
The intersection of sets A and B
(written symbolically as A ∩ B) is
the area that is overlapped by both
the A and B sets, and can be
expressed by AND (i.e. the
intersection of the two sets includes
locations in A AND B, i.e. purple)
Probability Rules
If the sets A and B do not overlap in the Venn diagram,
the sets are disjoint, and this represents a case of two
mutually exclusive events
The union of sets A and B here uses the
addition rule, where
P(A ∪ B) = P(A) + P(B)
Probability Rules
For example, suppose set A represents a roll of 1 or 2 on
a 6-sided die, so P(A)=2/6, and set B represents a roll of
3 or 4, so P(B)=2/6:
The union of sets A and B here uses the
addition rule, where
P(A ∪ B) = P(A) + P(B) = 2/6 + 2/6 = 4/6
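The die example can be verified by listing outcomes. This is a minimal sketch, assuming a fair 6-sided die so that every face has probability 1/6; the helper name `prob` is illustrative.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # faces of the die, assumed equally likely

def prob(event):
    """Probability of an event (a set of faces) on a fair die."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {1, 2}  # roll of 1 or 2
B = {3, 4}  # roll of 3 or 4

# The addition rule in this simple form requires disjoint sets.
disjoint = A.isdisjoint(B)
p_union = prob(A | B)
p_sum = prob(A) + prob(B)
print(disjoint, p_union, p_sum)
```

Because A and B share no faces, counting the faces in the union gives the same result as adding the two probabilities.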
Probability Rules
If the sets A and B do overlap in the Venn diagram, the
sets are not mutually exclusive, and this represents a
case of independent, but not exclusive events
The union of sets A and B here is
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
because we do not wish to count the
intersection area twice, thus we need to
subtract it from the sum of the areas of A
and B when taking the union of a pair of
overlapping sets
The intersection of sets A and B here is
calculated by taking the product of the two
probabilities, a.k.a. the multiplication rule:
P(A ∩ B) = P(A) * P(B)
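Both rules for overlapping sets can be checked by enumeration. The sketch below assumes a fair 6-sided die and uses two hypothetical events chosen so that they overlap and are independent: A is an even roll and B is a roll of 4 or less. These events are illustrative and not taken from the slides.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # fair die, each face assumed equally likely

def prob(event):
    """Probability of an event (a set of faces) on a fair die."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}     # hypothetical event: even roll
B = {1, 2, 3, 4}  # hypothetical event: roll of 4 or less

p_intersection = prob(A & B)        # counted directly from the overlap {2, 4}
p_product = prob(A) * prob(B)       # multiplication rule
p_union = prob(A | B)               # counted directly from {1, 2, 3, 4, 6}
p_rule = prob(A) + prob(B) - p_intersection  # union with overlap subtracted once
print(p_intersection, p_product, p_union, p_rule)
```

The product matches the directly counted intersection only because these two events were picked to be independent; for dependent events the intersection has to be counted or obtained another way.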
Probability Rules
Consider set A to give the chance of precipitation at
P(A)=0.4 and set B to give the chance of below freezing
temperatures at P(B)=0.7
The intersection of sets A and B here is
P(A ∩ B) = P(A) * P(B) = 0.4 * 0.7 = 0.28
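The precipitation example can be worked through in a few lines. This sketch assumes, as the multiplication rule does, that precipitation and below-freezing temperatures are independent; the variable names are illustrative. Exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

p_precip = Fraction(4, 10)    # P(A) = 0.4, chance of precipitation
p_freezing = Fraction(7, 10)  # P(B) = 0.7, chance of below-freezing temperatures

# Multiplication rule (assumes A and B are independent).
p_both = p_precip * p_freezing

# Union of overlapping events: subtract the intersection once.
p_either = p_precip + p_freezing - p_both

print(float(p_both), float(p_either))
```

Without the subtraction, the sum 0.4 + 0.7 would exceed 1, which shows why the overlap must not be counted twice.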
Probability Rules
Consider set A to give the chance of precipitation at
P(A)=0.4 and set B to give the chance of below freezing
temperatures at P(B)=0.7
The complement of set A is
P(Ā) = 1 - P(A)
P(Ā) = 1 - 0.4 = 0.6
This expresses the chance of it not raining
or snowing at P(Ā) = 0.6
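The complement rule is a one-line calculation. A small sketch follows; the function name `complement` and the range check are illustrative additions.

```python
from fractions import Fraction

def complement(p):
    """Probability that an event with probability p does not occur."""
    if not 0 <= p <= 1:
        raise ValueError("a probability must lie between 0 and 1")
    return 1 - p

p_precip = Fraction(4, 10)             # P(A) = 0.4
p_no_precip = complement(p_precip)     # chance of no rain or snow
print(float(p_no_precip))
```

An event and its complement together cover the whole sample space, so their probabilities always add up to 1.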
Probability Rules
We can also encounter the situation where set A is fully
contained within set B, which is equivalent to saying
that set A is a subset of set B:
In probability terms, this situation
occurs when outcome B is a
necessary precondition for
outcome A to occur, although not
vice-versa (in which case set B
would be contained in set A
instead)
[Venn diagram: set A drawn inside set B]
For example, set A might represent rolling a 5 using a 6-sided die, while set B denotes any roll greater than 3
A is contained within B because any time A occurs, B
occurs as well