
Random Variables and Probability Distributions
The concept of probability is the key to making statistical inferences by sampling a population
What we are doing is trying to ascertain the probability of an event having a given outcome, e.g.:
We summarize a sample statistically and want to make some inferences about the underlying population, such as what proportion of the population has values within a given range; we could do this by finding the area under the curve in a frequency distribution
This requires us to be able to specify the distribution of a variable before we can make inferences

Random Variables and Probability Distributions
Previously, we looked at some proportions of area under
the normal curve:

Source: Earickson, RJ, and Harlin, JM. 1994. Geographic Measurement and Quantitative
Analysis. USA: Macmillan College Publishing Co., p. 100.


Random Variables and Probability Distributions
BUT before we can use the normal curve to draw inferences about some sample, we have to find out whether this is the right distribution for our variable
While many natural phenomena are normally distributed,
there are other phenomena that are best described using
other distributions
This section of the course will begin with some
background on probabilities (terminology & rules),
and then we will examine a few useful distributions:
Discrete distributions: Binomial and Poisson
Continuous distributions: Normal and its relatives

Probability: Some Definitions
Probability: Refers to the likelihood that something (an event) will have a certain outcome
An Event: Any phenomenon you can observe that can have more than one outcome (e.g. flipping a coin)
An Outcome: Any unique condition that can be the result of an event (e.g. the available outcomes when flipping a coin are heads and tails), a.k.a. simple events or sample points
Sample Space: The set of all possible outcomes associated with an event (e.g. the sample space for flipping a coin includes heads and tails)

Probability: An Example
For example, suppose we have a data set where, in six cities, we count the number of malls present in each city:

City   # of Malls
1      1
2      4
3      4
4      4
5      2
6      3

Each count of the # of malls in a city is an event, and the sample space consists of four distinct outcomes: 1, 2, 3, and 4 malls
We might wonder: if we randomly pick one of these six cities, what is the chance that it will have n malls?

Random Variables and Probability Functions
What we have here is a random variable, defined as a variable X whose values xi are sampled randomly from a population
To put this another way, a random variable is a function defined on the sample space; this means that we are interested in all the possible outcomes
The question was: If we randomly pick one of
the six cities, what is the chance that it will have
n malls?

Random Variables and Probability Functions
To answer this question, we need to form a
probability function (a.k.a. probability
distribution) from the sample space that gives all
values of a random variable and their probabilities
A probability distribution expresses the relative
number of times we expect a random variable to
assume each and every possible value
We base a probability function either on a very large, empirically gathered set of outcomes, or else we determine the shape of the probability function mathematically


Probability: An Example, Part II
Here, the values of xi are drawn from the four outcomes in the city/mall data from the previous slide, and their probabilities are the number of events with each outcome divided by the total number of events:

xi    P(xi)
1     1/6 = 0.167
2     1/6 = 0.167
3     1/6 = 0.167
4     3/6 = 0.5

The probability of an outcome:
P(xi) = (# of times an outcome occurred) / (total number of events)
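As an illustrative aside (not part of the original slides), a short Python sketch can tabulate these probabilities directly from the six mall counts; the variable names are hypothetical:

from collections import Counter

# Number of malls counted in each of the six cities (from the table above)
mall_counts = [1, 4, 4, 4, 2, 3]

# P(xi) = (# of times an outcome occurred) / (total number of events)
counts = Counter(mall_counts)
pmf = {x: n / len(mall_counts) for x, n in sorted(counts.items())}

for x, p in pmf.items():
    print(f"P(X = {x}) = {p:.3f}")
# P(X = 1) = 0.167, P(X = 2) = 0.167, P(X = 3) = 0.167, P(X = 4) = 0.500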

Probability: An Example, Part III
We can plot this probability distribution as a probability mass function:

[Plot: probability mass function, with xi = 1, 2, 3, 4 on the horizontal axis and P(xi) on the vertical axis; vertical lines of height 1/6 = 0.167 at xi = 1, 2, and 3, and of height 3/6 = 0.5 at xi = 4]

This plot uses thin lines to denote that the probabilities are massed at discrete values of this random variable

Discrete Random Variables


The random variable from our example is a discrete random variable, because it has a finite number of possible values (e.g. a city can have 1 or 2 malls, but it cannot have 1.5 malls)
Any variable that is generated by counting a
whole number of things is likely to be a discrete
variable (e.g. # of coin tosses in a row with heads,
questionnaire responses where one of a set of
ordinal categories must be chosen, etc.)
A discrete random variable can be described by a
probability mass function


Probability Mass Functions


Probability mass functions have the following rules that dictate their possible values:
1. The probability of any outcome must be greater than or equal to zero and must also be less than or equal to one, i.e.
   0 ≤ P(xi) ≤ 1 for i = {1, 2, 3, ..., k-1, k}
2. The sum of all probabilities in the sample space must total one, i.e.
   Σ (from i=1 to k) P(xi) = 1

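As a minimal Python sketch (assuming the mall probability mass function from the earlier example), both rules can be checked directly:

# PMF from the mall example: four outcomes and their probabilities
pmf = {1: 1/6, 2: 1/6, 3: 1/6, 4: 3/6}

# Rule 1: every probability lies between 0 and 1
assert all(0 <= p <= 1 for p in pmf.values())

# Rule 2: the probabilities over the whole sample space sum to 1
assert abs(sum(pmf.values()) - 1.0) < 1e-12

print("Both probability mass function rules are satisfied")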

Discrete Random Variables


We can calculate the mean of a discrete probability distribution by taking all possible values of the variable, multiplying them by their probability, and summing them over the values:
   μ = Σ (from i=1 to k) xi * P(xi)
The symbol μ is used here rather than x̄ because the basic idea of a probability distribution is to use a large number of values to approach a stable estimate of the parameter
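A brief Python sketch of this calculation, again using the mall example's probability mass function (an assumption carried over from the earlier slides):

pmf = {1: 1/6, 2: 1/6, 3: 1/6, 4: 3/6}

# mu = sum over i of xi * P(xi)
mu = sum(x * p for x, p in pmf.items())
print(f"mu = {mu:.3f}")   # (1 + 2 + 3)/6 + 4 * 0.5 = 3.000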

Discrete Random Variables


We can also calculate the variance of a discrete probability distribution by taking the squared deviation of each possible value of the variable from the mean, multiplying it by the probability of that value, and summing over the values:
   σ² = Σ (from i=1 to k) (xi - μ)² * P(xi)
These formulae are only useful for discrete probability distributions; for continuous probability distributions a different method is required
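The same mall example can illustrate the variance formula; this sketch simply reuses the mean computed as on the previous slide:

pmf = {1: 1/6, 2: 1/6, 3: 1/6, 4: 3/6}

mu = sum(x * p for x, p in pmf.items())                  # mean of the distribution
var = sum((x - mu) ** 2 * p for x, p in pmf.items())     # sigma^2
print(f"sigma^2 = {var:.3f}")   # (4 + 1 + 0)/6 + 1 * 0.5 = 1.333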

Continuous Random Variables


A continuous random variable can assume all real-number values within an interval, for example: measurements of precipitation, pH, etc.
Some random variables that are technically discrete exhibit such a tremendous range of values that it is desirable to treat them as if they were continuous variables, e.g. population
Discrete random variables are described by
probability mass functions, and continuous
random variables are described by probability
density functions


Probability Density Functions


Probability density functions are defined using
the same rules required of probability mass
functions, with some additional requirements:
1. The function must have a non-negative value
throughout the interval a to b, i.e.
   f(x) ≥ 0 for a ≤ x ≤ b
2. The area under the curve defined by f(x), within the interval a to b, must equal 1:

[Plot: a density curve f(x) over the interval a to b, with the area under the curve equal to 1]
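As a hedged illustration of rule 2 (the density and interval below are arbitrary choices, not from the slides), numerical integration confirms that a valid density encloses an area of 1:

from scipy.integrate import quad

# A simple illustrative density on the interval [a, b] = [0, 2]:
# f(x) = x / 2, which is non-negative throughout the interval
a, b = 0.0, 2.0
f = lambda x: x / 2.0

area, _ = quad(f, a, b)          # numerically integrate f from a to b
print(f"area under f(x) on [{a}, {b}] = {area:.6f}")   # 1.0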

Probability Density Functions


Theoretically, a continuous variable's range can extend from negative infinity to infinity, e.g. the normal distribution:

[Plot: the normal curve, with the total area under the curve equal to 1]

The tails of the normal distribution's curve extend infinitely in each direction, but the value of f(x) approaches zero asymptotically, getting closer and closer but never reaching zero

Probability Density Functions


Suppose we are interested in computing the probability of a continuous random variable falling within a range of values bounded by lower limit c and upper limit d, within the interval a to b
How can we find the probability of a value occurring between c and d?

[Plot: a density curve f(x) over the interval a to b, with the region between c and d shaded]

We need to calculate the shaded area; if we know the density function, we could use calculus:
   P(c ≤ x ≤ d) = ∫ from c to d of f(x) dx
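A small Python sketch of this integral, using the same illustrative density as above (the limits c and d are arbitrary values chosen for the example):

from scipy.integrate import quad

f = lambda x: x / 2.0            # illustrative density on [0, 2]
c, d = 0.5, 1.5

prob, _ = quad(f, c, d)          # P(c <= X <= d) = integral of f(x) from c to d
print(f"P({c} <= X <= {d}) = {prob:.4f}")   # (1.5**2 - 0.5**2) / 4 = 0.5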

Probability Density Functions


Fortunately, we do not need to solve the integral ourselves to practice statistics; instead, if we can match the f(x) up to some known distribution, we can use a table of probabilities that someone else has developed
Tables A.2 through A.6 in the epilogue of the
Rogerson text (pp. 214-221) give probability
values for several distributions, including the
normal distribution and some related distributions
used by various inferential statistics (you can find
tables like these at the end of most statistics texts)
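In practice, software can play the role of those printed tables; for example, a sketch using scipy's standard normal distribution:

from scipy.stats import norm

# Probability that a standard normal variable falls between -1.96 and 1.96,
# the kind of value one would otherwise read from a z-table
p = norm.cdf(1.96) - norm.cdf(-1.96)
print(f"P(-1.96 <= Z <= 1.96) = {p:.4f}")   # about 0.95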

Probability Density Functions


Suppose we are interested in computing the probability of a continuous random variable at a certain single value of x (e.g. at d):
Can we find the probability of a value occurring exactly at d? P(d) = ?
No, P(d) = 0; why?

[Plot: a density curve f(x) with a single vertical line marked at x = d]

The reason is: P(c ≤ x ≤ d) → 0 as c → d
To put this another way, as the interval from c to d becomes vanishingly narrow, the area below the curve within it becomes vanishingly small
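A quick numerical illustration of this limit (using a standard normal density as an arbitrary example): as the interval shrinks toward the single point d, its probability shrinks toward zero:

from scipy.stats import norm

d = 1.0
for width in (0.1, 0.01, 0.001, 0.0001):
    c = d - width
    p = norm.cdf(d) - norm.cdf(c)     # P(c <= X <= d) under a standard normal
    print(f"interval width {width:>7}: probability = {p:.6f}")
# The probability approaches 0 as c approaches d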

Probability Rules
Now that we have described how to apply individual unconditional probabilities, we can move on to looking at the rules for combining multiple probabilities
A useful aid when discussing probabilities is the Venn diagram, which depicts multiple probabilities and their relationships using a graphical depiction of sets:
The rectangle that forms the area
of the Venn Diagram represents the
sample (or probability) space,
which we have defined above

Figures that appear within the


sample space are sets that represent
events in the probability context, &
their area is proportional to their
probability (full sample space = 1)

Probability Rules
We can use a Venn diagram to describe the relationships between two sets or events, and the corresponding probabilities:
The union of sets A and B (written symbolically as A ∪ B) is represented by the areas enclosed by sets A and B together, and can be expressed by OR (i.e. the union of the two sets includes any location in A or B, i.e. blue OR red)
The intersection of sets A and B (written symbolically as A ∩ B) is the area that is overlapped by both the A and B sets, and can be expressed by AND (i.e. the intersection of the two sets includes locations in A AND B, i.e. purple)


Probability Rules
If the sets A and B do not overlap in the Venn diagram, the sets are disjoint, and this represents a case of two independent, mutually exclusive events
The union of sets A and B here uses the addition rule, where
   P(A ∪ B) = P(A) + P(B)
You can think of this in terms of the areas of the events, where the union in this case is simply the sum of the areas
The intersection of sets A and B here results in the empty set (symbolized by ∅), because at no point do the circles overlap (no purple area as there was in the previous Venn diagram), thus there is no intersection:
   A ∩ B = ∅
Unconditional probabilities ~ the outcome of one event does not affect the other

Probability Rules
For example, suppose set A represents a roll of 1 or 2 on a 6-sided die, so P(A) = 2/6, and set B represents a roll of 3 or 4, so P(B) = 2/6:
The union of sets A and B here uses the addition rule, where
   P(A ∪ B) = P(A) + P(B)
   P(A ∪ B) = 2/6 + 2/6
   P(A ∪ B) = 4/6 = 2/3 ≈ 0.67
The outcomes represented here are mutually exclusive, thus there is no intersection between sets A and B, thus
   A ∩ B = ∅
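A short simulation sketch (hypothetical, not from the slides) that estimates this union probability by rolling a fair die many times:

import random

random.seed(0)
trials = 100_000
hits = 0
for _ in range(trials):
    roll = random.randint(1, 6)              # fair 6-sided die
    if roll in (1, 2) or roll in (3, 4):     # event A or event B
        hits += 1

print(f"estimated P(A or B) = {hits / trials:.3f}")   # close to 4/6 = 0.667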

Probability Rules
If the sets A and B do overlap in the Venn diagram, the sets are not mutually exclusive, and this represents a case of independent, but not mutually exclusive events
The union of sets A and B here is
   P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
because we do not wish to count the intersection area twice, thus we need to subtract it from the sum of the areas of A and B when taking the union of a pair of overlapping sets
The intersection of sets A and B here is calculated by taking the product of the two probabilities, a.k.a. the multiplication rule:
   P(A ∩ B) = P(A) * P(B)

Probability Rules
Consider set A to give the chance of precipitation at P(A) = 0.4 and set B to give the chance of below-freezing temperatures at P(B) = 0.7
The intersection of sets A and B here is
   P(A ∩ B) = P(A) * P(B)
   P(A ∩ B) = 0.4 * 0.7 = 0.28
This expresses the chance of snow at P(A ∩ B) = 0.28
The union of sets A and B here is
   P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
   P(A ∪ B) = 0.4 + 0.7 - 0.28 = 0.82
This expresses the chance of below-freezing temperatures or precipitation occurring at P(A ∪ B) = 0.82
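These two rules reduce to a couple of lines of arithmetic; a minimal sketch using the slide's numbers:

# Independent events: A = precipitation, B = below-freezing temperatures
p_a, p_b = 0.4, 0.7

p_and = p_a * p_b               # multiplication rule: P(A and B), the chance of snow
p_or = p_a + p_b - p_and        # addition rule for non-exclusive events
print(f"P(A and B) = {p_and:.2f}")   # 0.28
print(f"P(A or B)  = {p_or:.2f}")    # 0.82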

Probability Rules
Consider set A to give the chance of precipitation at P(A) = 0.4 and set B to give the chance of below-freezing temperatures at P(B) = 0.7
The complement of set A is
   P(A′) = 1 - P(A)
   P(A′) = 1 - 0.4 = 0.6
This expresses the chance of it not raining or snowing at P(A′) = 0.6
The complement of the union of sets A and B is
   P((A ∪ B)′) = 1 - [P(A) + P(B) - P(A ∩ B)]
   P((A ∪ B)′) = 1 - [0.4 + 0.7 - 0.28] = 0.18
This expresses the chance of it neither raining nor being below freezing at P((A ∪ B)′) = 0.18
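Continuing the same sketch, the two complements follow directly from the slide's numbers:

p_a, p_b = 0.4, 0.7
p_and = p_a * p_b                        # P(A and B) = 0.28

p_not_a = 1 - p_a                        # complement of A: no precipitation
p_neither = 1 - (p_a + p_b - p_and)      # complement of the union of A and B
print(f"P(not A)           = {p_not_a:.2f}")    # 0.60
print(f"P(neither A nor B) = {p_neither:.2f}")  # 0.18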

Probability Rules
We can also encounter the situation where set A is fully contained within set B, which is equivalent to saying that set A is a subset of set B:
In probability terms, this situation occurs when outcome B is a necessary precondition for outcome A to occur, although not vice-versa (in which case set B would be contained in set A instead)

[Venn diagram: circle A drawn entirely inside circle B]

For example, set A might represent rolling a 5 using a 6-sided die, where set B denotes any roll greater than 3
A is contained within B because anytime A occurs, B occurs as well

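A tiny Python sketch of the subset relationship using the die example (the set contents are taken from the slide):

A = {5}            # rolling a 5
B = {4, 5, 6}      # rolling anything greater than 3

print(A <= B)                          # True: A is a subset of B
print(len(A & B) / 6 == len(A) / 6)    # True: P(A and B) equals P(A) when A is contained in B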
