Probability - Probability Distributions

Probability and Probability Distributions
Instructor : Basesh Gala
Indian Institute of Quantitative Finance
Conditional probability and independence
If we know that one event has occurred it may change our
view of the probability of another event. Let
A = {rain today}, B = {rain tomorrow}, C = {rain in 90 days time}
It is likely that knowledge that A has occurred will change
your view of the probability that B will occur, but not of
the probability that C will occur.
We write P(B|A) ≠ P(B), P(C|A) = P(C). P(B|A) denotes the
conditional probability of B, given A.
We say that A and C are independent, but A and B are not.
Note that for independent events P(AC) = P(A)P(C), where AC
denotes the event that both A and C occur.
Conditional probability - tornado forecasting
Consider the classic data set on the next Slide
consisting of forecasts and observations of
tornados (Finley, 1884).
Let
F = {Tornado forecast}
T = {Tornado observed}
Use the frequencies in the table to estimate
probabilities - it's a large sample, so the estimates
should not be too bad.
Forecasts of tornados

                    Tornado forecast   No T forecast   Total
Tornado observed                  28              23      51
No T observed                     72            2680    2752
Total                            100            2703    2803

Conditional probability - tornado forecasting
P(T) = 51/2803 = 0.0182
P(T|F) = 28/100 = 0.2800
P(T|Fc) = 23/2703 = 0.0085, where Fc is the complement of F (no tornado forecast)
Knowledge of the forecast changes P(T). F and T are
not independent.
P(F|T) = 28/51 = 0.5490
P(T|F), P(F|T) are often confused but are different
quantities, and can take very different values.
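
The estimates above can be reproduced directly from the four cell counts of the Finley table. The following is a minimal Python sketch; the variable names are chosen here for illustration and are not part of the original slides.

```python
# Cell counts from the Finley (1884) tornado table.
tornado_and_forecast = 28        # T and F
tornado_no_forecast = 23         # T and Fc
no_tornado_forecast = 72         # Tc and F
no_tornado_no_forecast = 2680    # Tc and Fc

total = (tornado_and_forecast + tornado_no_forecast
         + no_tornado_forecast + no_tornado_no_forecast)
n_tornado = tornado_and_forecast + tornado_no_forecast     # row total for T
n_forecast = tornado_and_forecast + no_tornado_forecast    # column total for F

# Relative frequencies used as probability estimates.
p_t = n_tornado / total
p_t_given_f = tornado_and_forecast / n_forecast
p_t_given_not_f = tornado_no_forecast / (total - n_forecast)
p_f_given_t = tornado_and_forecast / n_tornado

print(f"P(T)    = {p_t:.4f}")
print(f"P(T|F)  = {p_t_given_f:.4f}")
print(f"P(T|Fc) = {p_t_given_not_f:.4f}")
print(f"P(F|T)  = {p_f_given_t:.4f}")
```

Note that P(T|F) divides by the number of forecasts, while P(F|T) divides by the number of observed tornados, which is why the two quantities differ.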

Conditional probability - tornado forecasting
P(TF) = 28/2803 = P(T)P(F|T) = P(F)P(T|F)
≠ P(F)P(T).
The two formulae for the probability of an
intersection always hold.
If A, B are independent, then P(A|B) = P(A), P(B|A)
= P(B), so P(AB) = P(A)P(B).
P(B|A) = P(B)P(A|B)/P(A)
This is Bayes' Theorem, though in the usual statement
of the theorem P(A) is expanded in a more
complicated-looking fashion.
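
As a numerical check, the tornado estimates can be pushed through these formulae. The sketch below assumes that the expanded form of the denominator is the law of total probability, P(T) = P(T|F)P(F) + P(T|Fc)P(Fc), which is the form usually quoted; the slides do not spell it out. Variable names are illustrative.

```python
# Estimated probabilities from the tornado table (counts out of 2803).
p_f = 100 / 2803               # P(F)
p_t_given_f = 28 / 100         # P(T|F)
p_t_given_not_f = 23 / 2703    # P(T|Fc)

# Law of total probability: P(T) = P(T|F)P(F) + P(T|Fc)P(Fc).
p_t = p_t_given_f * p_f + p_t_given_not_f * (1 - p_f)

# Probability of the intersection, and what it would be under independence.
p_tf = p_f * p_t_given_f       # P(TF) = P(F)P(T|F)
p_tf_if_independent = p_f * p_t

# Bayes' Theorem: P(F|T) = P(F)P(T|F) / P(T).
p_f_given_t = p_f * p_t_given_f / p_t

print(f"P(T)       = {p_t:.4f}")
print(f"P(TF)      = {p_tf:.4f}")
print(f"P(F)P(T)   = {p_tf_if_independent:.4f}")
print(f"P(F|T)     = {p_f_given_t:.4f}")
```

The value of P(F|T) obtained this way agrees with the direct estimate 28/51, and P(TF) is far from P(F)P(T), consistent with F and T not being independent.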
Random variables
Often we take measurements which have different
values on different occasions. Furthermore, the
values are subject to random or stochastic
variation - they are not completely predictable,
and so are not deterministic. They are random
variables.
Examples are crop yield, maximum temperature,
number of cyclones in a season, rain/no rain.
Continuous and discrete random variables
A continuous random variable is one which can (in
theory) take any value in some range, for example
crop yield, maximum temperature.
A discrete variable has a countable set of values.
They may be:
  counts, such as numbers of cyclones;
  categories, such as much above average, above average, near average, below average, much below average;
  binary variables, such as rain/no rain.
Probability distributions
If we measure a random variable many times, we can
build up a distribution of the values it can take.
Imagine an underlying distribution of values which
we would get if it were possible to take more and
more measurements under the same conditions.
This gives the probability distribution for the
variable.
Discrete probability distributions

A discrete probability distribution associates a
probability with each value of a discrete random
variable.
Example 1. Random variable has two values Rain/No
Rain. P(Rain) = 0.2, P(No Rain) = 0.8 gives a
probability distribution.
Example 2. Let X = Number of wet days in a 10 day
period. P(X=0) = 0.1074, P(X=1) = 0.2684, P(X=2) =
0.3020, P(X=6) = 0.0055, ... (see Slide 24 for more
on this example).
Note that P(Rain) + P(No Rain) = 1; P(X=0) + P(X=1) +
P(X=2) + ... + P(X=6) + ... + P(X=10) = 1.
Continuous probability distributions

Because continuous random variables can take all
values in a range, it is not possible to assign
probabilities to individual values.
Instead we have a continuous curve, called a
probability density function, which allows us to
calculate the probability of a value falling within any
interval.
This probability is calculated as the area under the
curve between the values of interest. The total area
under the curve must equal 1.
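
To make the area interpretation concrete, the sketch below integrates a density numerically. The exponential density with rate 1 is an assumption chosen purely for illustration (it is not an example from the slides), and `area` is a hypothetical helper implementing the trapezoidal rule.

```python
import math

def density(x):
    """Exponential density with rate 1 (an illustrative choice only)."""
    return math.exp(-x) if x >= 0 else 0.0

def area(f, a, b, steps=100_000):
    """Approximate the area under f between a and b by the trapezoidal rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

p_interval = area(density, 1.0, 2.0)   # P(1 <= X <= 2)
p_point = area(density, 1.5, 1.5)      # zero-width interval: a single value
p_total = area(density, 0.0, 50.0)     # effectively the whole range

print(f"P(1 <= X <= 2) = {p_interval:.4f}")
print(f"P(X = 1.5)     = {p_point:.4f}")
print(f"Total area     = {p_total:.4f}")
```

The zero-width interval has zero area, which is the sense in which an individual value of a continuous variable cannot be given a positive probability.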
Families of probability distributions

The number of different probability distributions is
unlimited. However, certain families of
distributions give good approximations to the
distributions of many random variables.
Important families of discrete distributions include
binomial, multinomial, Poisson, hypergeometric,
negative binomial.
Important families of continuous distributions
include normal (Gaussian), exponential, gamma,
lognormal, Weibull, extreme value.
Families of discrete distributions

We consider only two, binomial and Poisson.
There are many more.
Do not use a particular distribution unless you
are satisfied that the assumptions which
underlie it are (at least approximately)
satisfied.
Binomial distributions
1. The data arise from a sequence of n independent trials.
2. At each trial there are only two possible outcomes,
conventionally called success and failure.
3. The probability of success, p, is the same in each trial.
4. The random variable of interest is the number of successes,
X, in the n trials.
The assumptions of independence and constant p
in 1, 3 are important. If they are invalid, so is
the binomial distribution.
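
Under these assumptions the probability of k successes is C(n, k) p^k (1 - p)^(n - k). The sketch below evaluates this formula; the choice n = 10, p = 0.2 is an assumption made here because it happens to reproduce the values quoted earlier for the number of wet days in a 10 day period, although the slides do not state how those values were obtained.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial distribution with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.2   # assumed values, chosen to match the earlier wet-days figures
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(pmf):
    print(f"P(X={k:2d}) = {prob:.4f}")
print(f"Sum of probabilities = {sum(pmf):.4f}")
```

The probabilities over k = 0, ..., n sum to 1, as any discrete probability distribution must.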
Binomial distributions - examples

It is unlikely that the binomial distribution would be
appropriate for the number of wet days in a period of 10
consecutive days, because of non-independence of rain on
consecutive days.
It might be appropriate for the number of frost-free Januarys,
or the number of crop failures, in a 10-year period, if we
can assume no inter-annual dependence and no trend in p,
the frost-free probability, or crop failure probability.
Poisson distributions
Poisson distributions are often used to describe the number of
occurrences of a rare event. For example:
  the number of tropical cyclones in a season;
  the number of occasions in a season when river levels exceed a certain value.
The main assumptions are that events occur:
  at random (the occurrence of an event doesn't change the probability of it happening again);
  at a constant rate.
Poisson distributions also arise as approximations to
binomials when n is large and p is small.
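
The quality of the Poisson approximation can be checked numerically. In the sketch below, n = 1000 and p = 0.003 are illustrative assumptions (not values from the slides), giving a Poisson mean of np = 3.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial distribution with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, mean):
    """P(X = k) for a Poisson distribution with the given mean."""
    return exp(-mean) * mean**k / factorial(k)

n, p = 1000, 0.003   # assumed: large n, small p
mean = n * p

for k in range(6):
    print(f"k={k}: binomial {binomial_pmf(k, n, p):.4f}, "
          f"Poisson {poisson_pmf(k, mean):.4f}")

# Largest pointwise difference over k = 0, ..., 20.
max_diff = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, mean))
               for k in range(21))
print(f"Largest difference: {max_diff:.6f}")
```

With these values the two sets of probabilities agree to about three decimal places; the agreement worsens as p grows or n shrinks.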
Poisson distributions - an example

Suppose that we can assume that the number of
cyclones, X, in a particular area in a season has a
Poisson distribution with a mean (average) of 3.
Then P(X=0) = 0.05, P(X=1) = 0.15, P(X=2) =
0.22, P(X=3) = 0.22, P(X=4) = 0.17, P(X=5) =
0.10, ...
Note:
There is no upper limit to X, unlike the binomial where
the upper limit is n.
Assuming a constant rate of occurrence, the number of
cyclones in 2 seasons would also have a Poisson
distribution, but with mean 6.
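
These figures follow from the Poisson formula P(X = k) = e^(-m) m^k / k!, where m is the mean. The sketch below recomputes the one-season probabilities and, as a check on the two-season claim, combines two seasons by convolution; this assumes the counts in the two seasons are independent, which is in the spirit of the "at random" assumption but is stated here explicitly. Function names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(k, mean):
    """P(X = k) for a Poisson distribution with the given mean."""
    return exp(-mean) * mean**k / factorial(k)

# One season, mean 3.
one_season = [round(poisson_pmf(k, 3), 2) for k in range(6)]
print(one_season)

def two_season_pmf(k):
    """P(total over two seasons = k), assuming independent Poisson(3) seasons."""
    return sum(poisson_pmf(j, 3) * poisson_pmf(k - j, 3) for j in range(k + 1))

# Compare with a Poisson distribution of mean 6.
for k in range(6):
    print(f"k={k}: convolution {two_season_pmf(k):.4f}, "
          f"Poisson(6) {poisson_pmf(k, 6):.4f}")
```

The convolution and the Poisson distribution with mean 6 give the same probabilities, up to rounding error.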
Normal (Gaussian) distributions

Normal (also known as Gaussian) distributions are
by far the most commonly used family of
continuous distributions.
They are bell-shaped - see Slide 20 - and are
indexed by two parameters:
  The mean - the distribution is symmetric about this value.
  The standard deviation - this determines the spread
  of the distribution. Roughly 2/3 of the distribution lies
  within 1 standard deviation of the mean, and about 95%
  within 2 standard deviations.
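
The "roughly 2/3" and "95%" figures can be checked with the error function from Python's standard library. For any normal distribution, the probability of lying within k standard deviations of the mean is erf(k / sqrt(2)); the helper name `prob_within` is chosen here for illustration.

```python
from math import erf, sqrt

def prob_within(k):
    """P(|X - mean| <= k standard deviations) for any normal distribution."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"Within {k} standard deviation(s): {prob_within(k):.4f}")
```

The result does not depend on the particular mean or standard deviation, because both are removed when the variable is standardised.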
Deviations from normality - skewness

Some variables deviate from normality because their
distributions are symmetric but too flat or too long-tailed.
A more common type of deviation is skewness, where one tail
of the distribution is much longer than the other.
Positive skewness, as illustrated in the next Slide, is most
common - it occurs for windspeeds, and for rainfall
amounts.
Negatively-skewed distributions, with longer tails to the left,
sometimes occur, for example surface pressure.
A positively-skewed Weibull distribution
[Figure: probability density f(x), on a scale from 0.0 to 0.3, plotted against x from 0 to 10]
Families of skewed distributions

There are several families of skewed distributions,
including Weibull, gamma and lognormal. Each
family has 2 or more parameters which can be
varied to fit a variety of shapes.
One particular family (strictly 3 families) consists of
so-called extreme value distributions. As the name
suggests, these can be used to model extremes
over a period, for example, maximum windspeed,
minimum temperature, greatest 24-hr. rainfall,
highest flood.
Other probability distributions

We have sketched a few of the main probability
distributions, but there are many others. Examples
which don't fit standard patterns include:
  Proportion of sky covered by cloud may have large
  probability values near 0 and 1, with lower probabilities
  in between - U-shaped rather than bell-shaped.
  Daily rainfall is neither (purely) discrete, nor
  continuous. Positive values are continuous, but there is
  also a non-zero (discrete) probability of taking the
  value zero.
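
The mixed nature of daily rainfall can be sketched as a cumulative distribution with a jump at zero. Everything numerical below is hypothetical: the dry-day probability, the mean wet-day amount and the choice of an exponential distribution for wet-day amounts are assumptions made only to illustrate the idea, not values or models taken from the slides.

```python
import math

P_DRY = 0.7       # hypothetical probability of a dry day (rainfall exactly zero)
MEAN_WET = 8.0    # hypothetical mean rainfall in mm on wet days

def rainfall_cdf(x):
    """P(daily rainfall <= x) for a mixed discrete/continuous model."""
    if x < 0:
        return 0.0
    # Continuous part: exponential distribution for wet-day amounts (assumed).
    wet_part = 1.0 - math.exp(-x / MEAN_WET)
    # Discrete mass at zero plus the continuous part, weighted by P(wet).
    return P_DRY + (1.0 - P_DRY) * wet_part

for x in (0, 1, 5, 20, 50):
    print(f"P(rainfall <= {x:2d} mm) = {rainfall_cdf(x):.4f}")
```

The jump from 0 to P_DRY at x = 0 is the discrete part of the distribution; the smooth rise afterwards is the continuous part.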
