
Probability - Some Definitions

Probability: Refers to the likelihood that something (an event) will have a certain outcome
An Event: Any phenomenon you can observe that can have more than one outcome (e.g. flipping a coin)
An Outcome: Any unique condition that can be the result of an event (e.g. the available outcomes when flipping a coin are heads and tails), a.k.a. simple events or sample points
Sample Space: The set of all possible outcomes associated with an event (e.g. the sample space for flipping a coin includes heads and tails)

Probability - An Example

For example, suppose we have a data set where, in six cities, we count the number of malls located in each city:

City   # of Malls
1      1            Outcome #1
2      4            Outcome #2
3      4            Outcome #2
4      4            Outcome #2
5      2            Outcome #3
6      3            Outcome #4

Each count of the # of malls in a city is an event, and together these outcomes form the sample space.

We might wonder: if we randomly pick one of these six cities, what is the chance that it will have n malls?

Random Variables and Probability Functions

What we have here is a random variable, defined as a variable X whose values xi are sampled randomly from a population
To put this another way, a random variable is a function defined on the sample space; this means that we are interested in all the possible outcomes
The question was: if we randomly pick one of the six cities, what is the chance that it will have n malls?

Random Variables and Probability Functions

To answer this question, we need to form a probability function (a.k.a. probability distribution) from the sample space that gives all values of a random variable and their probabilities
A probability distribution expresses the relative number of times we expect a random variable to assume each and every possible value
We base a probability function either on a very large empirically-gathered set of outcomes, or else we determine the shape of the probability function mathematically

Probability - An Example, Part II

Here, the values of xi are drawn from the four outcomes, and their probabilities are the number of events with each outcome divided by the total number of events:

City   # of Malls        xi    P(xi)
1      1                 1     1/6 = 0.167
2      4                 2     1/6 = 0.167
3      4                 3     1/6 = 0.167
4      4                 4     3/6 = 0.5
5      2
6      3

The probability of an outcome is P(xi) = (# of times an outcome occurred) / (total number of events)
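A minimal Python sketch of this calculation (the mall counts are taken from the table above; the variable names are only for illustration):

```python
from collections import Counter

# Mall counts for the six cities, from the table above
malls_per_city = [1, 4, 4, 4, 2, 3]

# P(xi) = (# of times an outcome occurred) / (total number of events)
counts = Counter(malls_per_city)
n_events = len(malls_per_city)
pmf = {x: count / n_events for x, count in sorted(counts.items())}

print(pmf)  # {1: 0.167, 2: 0.167, 3: 0.167, 4: 0.5} (approximately)
```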

Probability - An Example, Part III

We can plot this probability distribution as a probability mass function:

[Plot: probability mass function for the mall example, with P(xi) = 0.167 at xi = 1, 2, and 3, and P(xi) = 0.5 at xi = 4]

This plot uses thin lines to denote that the probabilities are massed at discrete values of this random variable

Discrete Random Variables

The random variable from our example is a discrete random variable, because it can take only a countable set of distinct values (i.e. a city can have 1 or 2 malls, but it cannot have 1.5 malls)
Any variable that is generated by counting a whole number of things is likely to be a discrete variable (e.g. # of coin tosses in a row with heads, questionnaire responses where one of a set of ordinal categories must be chosen, etc.)
A discrete random variable can be described by a probability mass function

Probability Mass Functions

Probability mass functions have the following rules that dictate their possible values:
1. The probability of any outcome must be greater than or equal to zero and must also be less than or equal to one, i.e.
   0 <= P(xi) <= 1 for i = 1, 2, 3, ..., k-1, k
2. The sum of all probabilities in the sample space must total one, i.e.
   P(x1) + P(x2) + ... + P(xk) = 1
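Expressed as a tiny Python sketch (reusing the mall-example probabilities from earlier), both rules can be checked directly:

```python
# Probability mass function from the mall example
pmf = {1: 1/6, 2: 1/6, 3: 1/6, 4: 3/6}

# Rule 1: every probability lies between 0 and 1
assert all(0 <= p <= 1 for p in pmf.values())

# Rule 2: the probabilities over the whole sample space sum to one
assert abs(sum(pmf.values()) - 1.0) < 1e-9
```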

Discrete Probability Distributions


The remainder of today's lecture will introduce
three kinds of discrete probability distributions
that are useful for us to examine when learning
about statistics:
1. The Uniform Distribution
2. The Binomial Distribution
3. The Poisson Distribution
Each of these probability distributions is
appropriately applied in certain situations and to
particular phenomena

The Uniform Distribution

The uniform distribution describes the situation where the probability of all outcomes is the same
Expressed as an equation, given n possible outcomes for an event, the probability of any outcome is P(xi) = 1/n, e.g. if we are flipping a coin and we find that heads and tails are equally likely outcomes, then:
P(xheads) = 1/2 = P(xtails)

[Plot: a uniform probability mass function, with P(xi) = 0.5 at xi = heads and xi = tails]
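A minimal sketch of P(xi) = 1/n (uniform_pmf is just an illustrative helper name):

```python
def uniform_pmf(outcomes):
    """Assign the same probability, 1/n, to each of n possible outcomes."""
    n = len(outcomes)
    return {outcome: 1 / n for outcome in outcomes}

print(uniform_pmf(["heads", "tails"]))      # {'heads': 0.5, 'tails': 0.5}
print(uniform_pmf(["N", "E", "S", "W"]))    # 0.25 for each cardinal direction
```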

The Uniform Distribution

While the idea of a uniform distribution might seem a little simplistic and perhaps useless, it actually is well applied in two situations:
1. When the probabilities of each and every possible outcome are truly equal (e.g. the coin toss)
2. When we have no prior knowledge of how a variable is distributed (i.e. when we are dealing with complete uncertainty), the first distribution we should use is the uniform, because it makes no assumptions about the distribution

The Uniform Distribution

While truly uniformly distributed geographic phenomena are somewhat rare (remember Tobler's law, which implies variation from the uniform if it is true), we often encounter the situation of not knowing how something is distributed until we sample it
It is in the latter case, when we are resisting making assumptions about the distribution of some geographic phenomenon, that we usually apply the uniform distribution as a sort of null hypothesis of distribution

The Uniform Distribution

For example, supposing we wanted to predict the direction of the prevailing wind at some location (expressing it in terms of a cardinal direction), and we had no prior knowledge of the weather system's tendencies in the area, we would have to begin with the idea that
P(xNorth) = P(xEast) = P(xSouth) = P(xWest) = 1/4 = 0.25
until we had an opportunity to sample and establish some tendency in the wind pattern based on those observations

[Plot: a uniform probability mass function over the four cardinal directions, with P(xi) = 0.25 for each]

The Binomial Distribution

The binomial distribution provides information about the probability of the repetition of events when there are only two possible outcomes, e.g. heads or tails, left or right, success or failure, rain or no rain: any nominal data situation where only two categories / outcomes are possible
The binomial distribution is useful for describing situations where the same event is repeated over and over, characterizing the probability of a proportion of the events having a certain outcome over a specified number of events

The Binomial Distribution

A binomial distribution is produced by a set of Bernoulli trials, named after Jacques Bernoulli, a 17th Century Swiss mathematician who formulated a version of the law of large numbers for independent trials
The law of large numbers is at the heart of probability theory, and it states that given enough observed events, the observed probability should approach the theoretical values drawn from probability distributions (e.g. enough coin tosses should approach the P = 0.5 value for the outcomes)
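A small simulation sketch of this idea (the choice of p = 0.5, N, and the coin-toss framing are illustrative assumptions):

```python
import random

# Simulate N independent Bernoulli trials (coin tosses) with success probability p;
# by the law of large numbers, the observed proportion of successes approaches p
p = 0.5
N = 100_000
successes = sum(1 for _ in range(N) if random.random() < p)

print(successes / N)  # close to 0.5, and closer still as N grows
```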

Bernoulli Trials

A set of Bernoulli trials is the way to operationally test the law of large numbers using an event that has two possible outcomes:
1. N independent trials of an experiment (i.e. an event like a coin toss) are performed; using the word independent here stipulates that the results of one trial do not influence the result of the next
2. Every trial must have the same set of possible outcomes (heads and tails must be the only available results of coin tosses; using other sorts of experiments, this is a less trivial issue)

Bernoulli Trials

Bernoulli trials cont.:
3. The probability of each outcome must be the same for all trials, i.e. P(xi) must be the same each time for both xi values
4. The resulting random variable is determined by the number of successes in the trials (where we define success to be one of the two available outcomes)
We will use the notation p = the probability of success in a trial and q = (1 - p) as the probability of failure in a trial; p + q = 1

Bernoulli Trials - An Example

Suppose on a series of successive days, we will record whether or not it rains in Chapel Hill
We will denote the 2 outcomes using R when it rains and N when it does not rain

n    Possible Outcomes    # of Rain Days    P(# of Rain Days)
1    R                    1                 p
     N                    0                 (1 - p) = q
2    RR                   2                 p^2
     RN, NR               1                 2[p*(1 - p)] = 2pq
     NN                   0                 (1 - p)^2 = q^2
3    RRR                  3                 p^3
     RRN, RNR, NRR        2                 3[p^2*(1 - p)] = 3p^2q
     NNR, NRN, RNN        1                 3[p*(1 - p)^2] = 3pq^2
     NNN                  0                 (1 - p)^3 = q^3

For >1 successive events, the probability of each sequence is calculated using the multiplicative rule (intersection)

Bernoulli Trials - An Example

If we have a value for P(R) = p, we can substitute it into the above equations to get the probability of each outcome from a series of successive samples, e.g. suppose p = 0.2 (and therefore q = 0.8, since p + q = 1):

n    Possible Outcomes    # of Rain Days    P(# of Rain Days)    Value
1    R                    1                 p                    0.2
     N                    0                 q                    0.8
2    RR                   2                 p^2                  0.04
     RN, NR               1                 2pq                  0.32
     NN                   0                 q^2                  0.64
3    RRR                  3                 p^3                  0.008
     RRN, RNR, NRR        2                 3p^2q                0.096
     NNR, NRN, RNN        1                 3pq^2                0.384
     NNN                  0                 q^3                  0.512

For each value of n, the probabilities sum to 1
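A short Python sketch of the same calculation, enumerating every possible sequence of rain/no-rain days and applying the multiplicative rule (a brute-force check, rather than the compact formula given on the next slides):

```python
from itertools import product

p = {"R": 0.2, "N": 0.8}   # P(rain) and P(no rain) for a single day

for n in (1, 2, 3):
    totals = {}
    # Enumerate every possible sequence of n days and multiply the daily probabilities
    for seq in product("RN", repeat=n):
        prob = 1.0
        for day in seq:
            prob *= p[day]
        rain_days = seq.count("R")
        totals[rain_days] = totals.get(rain_days, 0.0) + prob
    print(n, totals)   # e.g. n = 3 gives {3: 0.008, 2: 0.096, 1: 0.384, 0: 0.512}
```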

Bernoulli Trials

A graphical representation:

[Figure: probability plotted against # of successes for 1 to 4 events]
1 event:  (p + q)^1 = p + q
2 events: (p + q)^2 = p^2 + 2pq + q^2
3 events: (p + q)^3 = p^3 + 3p^2q + 3pq^2 + q^3
4 events: (p + q)^4 = p^4 + 4p^3q + 6p^2q^2 + 4pq^3 + q^4

Source: Earickson, RJ, and Harlin, JM. 1994. Geographic Measurement and Quantitative Analysis. USA: Macmillan College Publishing Co., p. 132.

The sum of the probabilities can be expressed using the binomial expansion of (p + q)^n, where n = # of events

Bernoulli Trials

We can provide a general formula for calculating the probability of x successes, given n trials and a probability p of success:

P(x) = C(n,x) * p^x * (1 - p)^(n - x)

where C(n,x) is the number of possible combinations of x successes and (n - x) failures:

C(n,x) = n! / (x! * (n - x)!)

where n! = n * (n - 1) * (n - 2) * ... * 3 * 2 * 1
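A minimal Python sketch of this formula (binomial_pmf is an illustrative name, not from the slides; math.comb computes C(n,x)):

```python
from math import comb   # comb(n, x) = n! / (x! * (n - x)!)

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n Bernoulli trials with success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

print(binomial_pmf(2, 4, 0.2))   # approximately 0.1536, the worked example on the next slide
```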

Bernoulli Trials - Example

For example, the probability of 2 successes in 4 trials, given p = 0.2, is:

P(x) = [n! / (x! * (n - x)!)] * p^x * (1 - p)^(n - x)
P(2) = [24 / (2 * 2)] * (0.2)^2 * (0.8)^2
P(2) = 6 * 0.04 * 0.64 = 0.1536

Bernoulli Trials - Example

Calculating the probabilities of all possible outcomes of the number of rain days out of four days, given p = 0.2 (each P(x) is the product of the three terms to its left):

x    C(n,x)    p^x        (1 - p)^(n - x)    P(x)
0    1         (0.2)^0    (0.8)^4            0.4096
1    4         (0.2)^1    (0.8)^3            0.4096
2    6         (0.2)^2    (0.8)^2            0.1536
3    4         (0.2)^3    (0.8)^1            0.0256
4    1         (0.2)^4    (0.8)^0            0.0016

We can also calculate the chance of having one or more days of rain out of four by summing P(1) + P(2) + P(3) + P(4) = 0.4096 + 0.1536 + 0.0256 + 0.0016 = 0.5904
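The whole table and the cumulative probability can be reproduced with a short sketch (again assuming Python's math.comb for C(n,x)):

```python
from math import comb

n, p = 4, 0.2

# P(x) for x = 0..4 rain days out of four days
pmf = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}
print(pmf)   # approximately {0: 0.4096, 1: 0.4096, 2: 0.1536, 3: 0.0256, 4: 0.0016}

# Chance of one or more rain days: sum P(1)..P(4), or equivalently 1 - P(0)
print(sum(pmf[x] for x in range(1, n + 1)))   # approximately 0.5904
```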

Bernoulli Trials - Example

Naturally, we can plot the probability mass function produced by this binomial distribution:

[Plot: probability mass function with P(x) = 0.4096 at x = 0, 0.4096 at x = 1, 0.1536 at x = 2, 0.0256 at x = 3, and 0.0016 at x = 4]

The Poisson Distribution

The usual application of probability distributions is to find a theoretical distribution that reflects a process such that it explains what we see in some observed sample of a geographic phenomenon
The theoretical distribution does this by virtue of the fact that the form of the sampled information and the theoretical distribution can be compared and found to be similar through a test of significance
One theoretical concept that we often study in geography concerns discrete random events in space and time (e.g. where will a tornado occur?)

The Poisson Distribution

The discrete random events in question happen rarely (if at all), and the time and place of these events are independent and random
The greatest probability is zero occurrences at a certain time or place, with a small chance of one occurrence, an even smaller chance of two occurrences, etc.
A distribution with these characteristics will be heavily peaked and skewed:

[Plot: a heavily peaked, right-skewed probability mass function, with the largest P(xi) at xi = 0 and rapidly shrinking probabilities at xi = 1, 2, 3, 4]

The Poisson Distribution

In the 1830s, French mathematician S.D. Poisson described a distribution with these characteristics, used to describe the number of events that will occur within a certain area or duration (e.g. # of meteorite impacts per state, # of tornadoes per year in Tornado Alley)
The following characteristics describe the Poisson distribution:
1. It is used to count the number of occurrences of an event within a given unit of time, area, volume, etc., and is therefore a discrete distribution

The Poisson Distribution

Poisson distribution cont.:
2. The probability that an event will occur within a given unit must be the same for all units (i.e. the underlying process governing the phenomenon must be invariant)
3. The number of events occurring per unit must be independent of the number of events occurring in other units (no interactions)
4. The mean or expected number of events per unit is denoted by λ and is found from past experience (observations)

The Poisson Distribution

Poisson formulated his distribution as follows:

P(x) = (e^-λ * λ^x) / x!

where e = 2.71828 (base of the natural logarithm)
      λ = the mean or expected value
      x = 0, 1, 2, ... = # of occurrences
      x! = x * (x - 1) * (x - 2) * ... * 2 * 1
To calculate a Poisson distribution, you must know λ and then plug in the values of x where observations are likely to occur
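A minimal sketch of the formula (poisson_pmf and lam are illustrative names; the value λ = 0.71 anticipates the murder-rate example a few slides below):

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Probability of exactly x occurrences when the mean number of occurrences is lam (λ)."""
    return exp(-lam) * lam**x / factorial(x)

print(poisson_pmf(0, 0.71))   # about 0.49, the most likely count when λ = 0.71
```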

The Poisson Distribution

Poisson formulated his distribution as follows:

P(x) = (e^-λ * λ^x) / x!

The shape of the distribution depends quite strongly upon the value of λ, because as λ increases, the distribution becomes less skewed, eventually approaching a normal-shaped distribution as λ gets quite large
We can evaluate P(x) for any value of x, but large values of x will have very small values of P(x)
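A quick sketch of this effect, comparing a small and a larger mean (the two λ values are illustrative choices, not from the slides):

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

# λ = 0.7 is heavily peaked at 0 and skewed; λ = 8 is much more symmetric,
# approaching a normal-shaped distribution
for lam in (0.7, 8):
    print(lam, [round(poisson_pmf(x, lam), 3) for x in range(13)])
```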

The Poisson Distribution

The Poisson distribution is sometimes known as the Law of Small Numbers, because it describes the behavior of events that are rare, despite there being many opportunities for them to occur
We can observe the frequency of some rare phenomenon, find its mean occurrence, and then construct a Poisson distribution and compare our observed values to those from the distribution (effectively expected values) to see the degree to which our observed phenomenon is obeying the Law of Small Numbers:

The Poisson Distribution - Example

Fitting a Poisson distribution to the 24-hour murder rates in Fayetteville in a 31-day month (to ask the question: do murders randomly occur in time?)

         Observed Values                    Expected Values
x        Obs. Frequency (Fobs)    x*Fobs    P(x)      Fexp
0        17                       0         0.49      15.2
1        -                        -         0.35      10.9
2        -                        -         0.12      3.7
3        -                        -         0.03      0.9
4        -                        -         0.005     0.2
Total    31 days                  22 murders 1.000    30.9

λ = mean murders per day = 22 / 31 = 0.71

P(x) = (e^-λ * λ^x) / x! = (e^-0.71 * 0.71^x) / x!
Fexp = P(x) * 31

We can compare Fobs to Fexp using a χ² (chi-square) test to see if the observations do match the Poisson distribution
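A sketch that reproduces the expected-value side of this table (the observed frequencies come from the data itself, so they are not computed here):

```python
from math import exp, factorial

days, murders = 31, 22
lam = murders / days            # λ = mean murders per day ≈ 0.71

# Reproduce the expected-value columns: P(x) and Fexp = P(x) * 31
for x in range(5):
    p_x = exp(-lam) * lam**x / factorial(x)
    print(x, round(p_x, 3), round(p_x * days, 1))   # e.g. x = 0 gives about 0.49 and 15.2
```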

The Poisson Distribution

Procedure for finding Poisson probabilities and expected frequencies:
1. Set up a table with five columns as on the previous slide
2. Multiply the values of x by their observed frequencies (x * Fobs)
3. Sum the columns of Fobs (observed frequency) and x * Fobs
4. Compute λ = Σ(x * Fobs) / ΣFobs
5. Compute the P(x) values using the equation or a table
6. Compute the values of Fexp = P(x) * ΣFobs

The Poisson Distribution

One characteristic of the Poisson distribution is that we expect the variance to be approximately equal to the mean (i.e. the two should have approximately the same values)
When we apply the Poisson distribution to geographic patterns, we can see how a variance to mean ratio (σ²/x̄) of about 1 corresponds to a random pattern that is distributed according to Poisson probabilities
Suppose we have a point pattern in an (x,y) coordinate space and we lay down quadrats and count the number of points per quadrat
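A minimal sketch of the variance-to-mean ratio for a set of quadrat counts (the counts below are made-up illustrative data, not from the lecture):

```python
# Points counted in each quadrat (hypothetical data)
counts = [0, 2, 1, 3, 0, 1, 2, 0, 1, 2]

mean = sum(counts) / len(counts)
variance = sum((c - mean) ** 2 for c in counts) / len(counts)

# A ratio near 1 suggests a random (Poisson-like) pattern; well below 1 suggests
# a regular pattern, and well above 1 suggests a clustered pattern
print(variance / mean)
```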

The Poisson Distribution

Here, the counts of points per quadrat form the frequencies we use to check Poisson probabilities:

[Figure: three example point patterns and their quadrat counts]
Regular: variance is low relative to the mean, so σ²/x̄ is well below 1
Random: variance ≈ mean, so σ²/x̄ ~ 1
Clustered: variance is high relative to the mean, so σ²/x̄ is well above 1
