
Introduction to Probability and Distributions


by
Hrishikesh Khaladkar
Department of Mathematics
Fergusson College, Pune

May 20, 2018



Classical Definition of Probability

If a random experiment results in N exhaustive, mutually exclusive
and equally likely outcomes, of which m are favorable to the
happening of the event A, then the probability of occurrence of A
is given by P(A) = m/N
exhaustive cases: the total possible outcomes of a random
experiment
mutually exclusive cases: the happening of any one of them
excludes the happening of all the others in the same experiment
equally likely outcomes: none of them is expected to occur
in preference to another
favorable cases: the outcomes which entail (result
in) the happening of the event
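As an illustrative sketch of the classical definition (not part of the original slides), the following Python snippet counts exhaustive and favorable cases for two fair dice; the event A ("the sum is 7") is an assumed example:

```python
from fractions import Fraction
from itertools import product

# Exhaustive cases: all 36 mutually exclusive, equally likely outcomes
outcomes = list(product(range(1, 7), repeat=2))

# Favorable cases for the event A: "the sum of the two faces is 7"
favorable = [o for o in outcomes if sum(o) == 7]

# Classical definition: P(A) = m / N
p_A = Fraction(len(favorable), len(outcomes))
print(p_A)  # 1/6
```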

Axiomatic Definition of Probability

Given a sample space S of a random experiment, the probability of
occurrence of an event A is defined as a function P from the events
of S to [0, 1] satisfying the following axioms
P(A) is real and non-negative, that is P(A) ≥ 0 (non-negativity)
P(S) = 1 (axiom of certainty)
For mutually exclusive events A₁, A₂, ..., Aₙ,
P(A₁ ∪ A₂ ∪ ... ∪ Aₙ) = P(A₁) + P(A₂) + ... + P(Aₙ) (axiom of additivity)

Related Results
0 ≤ P(A) ≤ 1 for any event A
Complement Principle
P(A) + P(Ā) = 1, where Ā denotes the complement of A
Inclusion-Exclusion Principle
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Mutually exclusive events:
If A and B are mutually exclusive then A ∩ B = ∅, hence
P(A ∪ B) = P(A) + P(B)
Conditional Probability
If the event A occurs followed by the event B, the conditional
probability of B given A is
P(B|A) = P(A ∩ B)/P(A). Similarly, P(A|B) = P(A ∩ B)/P(B)
Independent Events
If A does not depend on B then P(A|B) = P(A) and
P(B|A) = P(B). Therefore P(A ∩ B) = P(A)P(B)
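These results can be checked mechanically on a small finite sample space. A minimal sketch, again using two fair dice (the events A and B are assumed examples):

```python
from fractions import Fraction
from itertools import product

outcomes = set(product(range(1, 7), repeat=2))

def P(event):
    # classical probability on a finite, equally likely sample space
    return Fraction(len(event), len(outcomes))

A = {o for o in outcomes if o[0] == 6}      # first die shows 6
B = {o for o in outcomes if sum(o) >= 10}   # the sum is at least 10

assert P(outcomes - A) == 1 - P(A)           # complement principle
assert P(A | B) == P(A) + P(B) - P(A & B)    # inclusion-exclusion
print(P(A & B) / P(A))  # conditional probability P(B|A) = 1/2
```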

Bayes’ Theorem

If an event A can occur only in conjunction with one of the mutually
exclusive and exhaustive events E₁, E₂, ..., Eₙ, and if A actually
happens, then the probability that it was preceded by the particular
event Eᵢ (i = 1, 2, ..., n) is given by
P(Eᵢ|A) = P(A ∩ Eᵢ)/P(A) = P(Eᵢ)P(A|Eᵢ) / Σᵢ₌₁ⁿ P(Eᵢ)P(A|Eᵢ)
Special case
P(B|A) = P(A ∩ B)/P(A)
This theorem has given rise to a new branch called Bayesian
statistics, which has immense applications, including the Bayes
classifier in machine learning.
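A numerical sketch of Bayes’ theorem. The three machines E1, E2, E3 and all the probabilities below are hypothetical numbers chosen purely for illustration:

```python
from fractions import Fraction

# Hypothetical prior: three mutually exclusive, exhaustive machines
# producing 50%, 30% and 20% of all items.
prior = {"E1": Fraction(1, 2), "E2": Fraction(3, 10), "E3": Fraction(1, 5)}
# Hypothetical likelihoods P(A|Ei): probability an item is defective.
likelihood = {"E1": Fraction(1, 50), "E2": Fraction(1, 25), "E3": Fraction(1, 10)}

# Total probability: P(A) = sum over i of P(Ei) P(A|Ei)
p_A = sum(prior[e] * likelihood[e] for e in prior)

# Bayes' theorem: P(Ei|A) = P(Ei) P(A|Ei) / P(A)
posterior = {e: prior[e] * likelihood[e] / p_A for e in prior}
print(posterior["E3"])  # 10/21
```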

Random Variables

Random Variable A random variable may be defined as a
real-valued function on the sample space of a random
experiment, taking values on the real line R
Discrete Random Variable If the random variable assumes
only a finite or countably infinite set of values, it is called
a discrete random variable
Examples include: the number of heads appearing when a coin is
tossed, the number of accidents occurring on a road, etc.
Continuous Random Variable If the random variable
assumes an uncountably infinite set of values, it is
called a continuous random variable
Examples include: height, weight, temperature fluctuations,
etc.

Probability Mass Function

Let X be a discrete random variable.
With each value xᵢ of X we associate a number pᵢ = P(X = xᵢ), which is
known as the probability of xᵢ.
If it satisfies the following conditions
pᵢ = P(X = xᵢ) ≥ 0
Σᵢ pᵢ = 1
then the function pᵢ = P(X = xᵢ) is called the probability mass
function of the variable X
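A minimal sketch of these two conditions, using the pmf of a fair six-sided die as an assumed example:

```python
from fractions import Fraction

# pmf of a fair die: p_i = P(X = x_i) = 1/6 for each face
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Both defining conditions of a probability mass function:
assert all(p >= 0 for p in pmf.values())   # p_i >= 0
assert sum(pmf.values()) == 1              # sum of p_i = 1
```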

Probability Density Function

Let X be a continuous random variable taking values over an
interval [a, b].
With each value x of X we associate a number p(x), known as
the probability density of X at x.
If it satisfies the following conditions
p(x) ≥ 0
For any two distinct points c and d in [a, b],
P(c ≤ X ≤ d) = area under the curve between x = c and x = d
P(a ≤ X ≤ b) = 1
then the function p(x) is called the probability density
function of the variable X

Distribution Function

If X is a discrete random variable with probability mass
function p(xᵢ), then the distribution function F is
defined as
F(x) = P(X ≤ x) = Σ p(xᵢ) taken over all values xᵢ ≤ x
If X is a continuous random variable over [a, b] with probability
density function p(x), then the distribution function F is
defined as
F(x) = P(X ≤ x) = ∫ₐˣ p(t) dt
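A sketch of the discrete case: accumulating the pmf of a fair die (an assumed example) into its distribution function.

```python
from fractions import Fraction

# pmf of a fair die; F(x) accumulates p_i over all x_i <= x
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def F(x):
    return sum(p for xi, p in pmf.items() if xi <= x)

print(F(3))  # 1/2
assert F(0) == 0 and F(6) == 1   # F rises from 0 to 1
```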

Expected Value or the Mean

Let X be a random variable. The expected value (or mean) of X
is defined as
E(X) = Σ x P(X = x) for discrete distributions
E(X) = ∫ₐᵇ x p(x) dx for continuous distributions
Properties
E(c) = c for a constant c
E(aX + b) = aE(X) + b (linearity)
E(XY) = E(X)E(Y) when X and Y are independent (multiplicative property)
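The discrete formula and the linearity property can be sketched with the fair-die pmf (an assumed example):

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def E(g):
    # E[g(X)] = sum of g(x) * P(X = x)
    return sum(g(x) * p for x, p in pmf.items())

mean = E(lambda x: x)
print(mean)  # 7/2

# Linearity: E(aX + b) = a E(X) + b
a, b = 3, 5
assert E(lambda x: a * x + b) == a * mean + b
```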

Variance and Standard Deviation

Let X be a random variable. The variance of X is defined as
σ² = E[X − E(X)]² = E(X²) − [E(X)]²
Properties
Var(c) = 0 for a constant c
Var(aX + b) = a²Var(X)
Var(aX − b) = a²Var(X)
The standard deviation is defined as the positive square root of
the variance
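Both forms of the definition, and the Var(aX + b) = a²Var(X) property, can be checked on the fair-die pmf (an assumed example):

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}
E = lambda g: sum(g(x) * p for x, p in pmf.items())

mean = E(lambda x: x)
var = E(lambda x: (x - mean) ** 2)          # E[X - E(X)]^2
assert var == E(lambda x: x * x) - mean**2  # = E(X^2) - E(X)^2
print(var)  # 35/12

# Var(aX + b) = a^2 Var(X): the shift b does not change the spread
a, b = 3, 5
mean2 = E(lambda x: a * x + b)
assert E(lambda x: (a * x + b - mean2) ** 2) == a**2 * var
```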

Moments

Let X be a random variable (continuous or discrete). The r-th
moment about any point A is given by
µᵣ = E[(X − A)ʳ]
In particular, if the point A = X̄ = E(X), then µᵣ is called
the r-th central moment.
Similarly, if the point A = 0, then µ′ᵣ is called the r-th raw
moment.
Significance of Moments
µ′₁ = X̄ is the mean of the distribution
µ₂ = σ² gives the variance of the distribution
µ₃ helps us define the skewness of the distribution
µ₄ helps us define the kurtosis of the distribution.
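A sketch of moments about a point, again on the fair-die pmf (an assumed example); note how the raw and central moments recover the mean and variance:

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}
E = lambda g: sum(g(x) * p for x, p in pmf.items())

def moment(r, A=0):
    # r-th moment about the point A: mu_r = E[(X - A)^r]
    return E(lambda x: (x - A) ** r)

mean = moment(1)           # first raw moment = the mean
var = moment(2, A=mean)    # second central moment = the variance
print(mean, var)  # 7/2 35/12

# the die is symmetric about its mean, so the third central moment
# (which drives skewness) vanishes
assert moment(3, A=mean) == 0
```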

Binomial Distribution

What is the binomial distribution? If X is a random variable which
denotes the number of successes in n trials satisfying the
conditions
n: the number of trials is finite
each trial results in one of two mutually exclusive outcomes, termed
success and failure
trials are independent
p: the probability of success is constant in every trial, and so is
q = 1 − p
then the probability of r successes is given by
P(X = r) = C(n, r) pʳ qⁿ⁻ʳ where r = 0, 1, 2, ..., n

We say that X → B(n, p)
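A sketch of the binomial pmf; n = 10 and p = 0.3 are assumed example values:

```python
from math import comb

def binom_pmf(r, n, p):
    # P(X = r) = C(n, r) p^r q^(n-r)
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 10, 0.3
probs = [binom_pmf(r, n, p) for r in range(n + 1)]
assert abs(sum(probs) - 1) < 1e-9              # a valid pmf
mean = sum(r * pr for r, pr in enumerate(probs))
assert abs(mean - n * p) < 1e-9                # mean = np
```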

Various parameters for Binomial Distribution

Mean of the distribution: np
Variance: npq
Standard deviation: √(npq)
Mode
1 X = k and X = k − 1 if (n + 1)p is an integer k (bimodal)
2 X = k, the integer part of (n + 1)p, when (n + 1)p is not an integer

Applications of Binomial Distribution

The distribution arises when the underlying events have two
possible outcomes, the chances of which remain constant.
The number of defectives found in samples of size n from a
stable production process is a binomial variable.
Its use in genetic engineering arises because the inheritance
of biological characteristics depends on genes, which occur
in pairs

Poisson Distribution

What is the Poisson distribution?

A random variable X is said to follow a Poisson distribution under
the following conditions
n: the number of trials is indefinitely large, n → ∞
p: the constant probability of success in each trial is
indefinitely small, p → 0
np = λ is finite
Then the probability distribution of X is given by
P(X = r) = e⁻λ λʳ / r!  where r = 0, 1, 2, ...
We say X → P(λ)
Note that the Poisson distribution is the limiting case of the
binomial distribution
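The limiting relationship can be sketched numerically; λ = 2, n = 10 000 and r = 3 are assumed example values:

```python
from math import exp, factorial, comb

def poisson_pmf(r, lam):
    # P(X = r) = e^(-lam) lam^r / r!
    return exp(-lam) * lam**r / factorial(r)

# Poisson as the limiting case of the binomial: n large, p small, np = lam
lam, n = 2.0, 10_000
p = lam / n
r = 3
binom = comb(n, r) * p**r * (1 - p)**(n - r)
assert abs(binom - poisson_pmf(r, lam)) < 1e-3

# a valid pmf: the probabilities sum (numerically) to 1
assert abs(sum(poisson_pmf(k, lam) for k in range(100)) - 1) < 1e-12
```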

Various Parameters of the Poisson Distribution

Mean of the distribution: λ
Variance: λ
Standard deviation: √λ
Mode: X = λ and X = λ − 1 if λ is an integer (bimodal);
X = ⌊λ⌋, the integer part of λ, when λ is not an integer

Applications of the Poisson Distribution

This distribution applies to rare events, where the probability of
occurrence in any small interval of time is very small.
The number of fatal automobile accidents per month.
The number of typing errors per page.
The number of atoms disintegrating per second from a
radioactive material.
The number of bomb hits on a square mile of London in 1944.
The number of defects in a manufactured article.

Hypergeometric Distribution

Let X be a discrete random variable which denotes the number of
successes obtained out of a population of size N, where
the population has size N
a simple random sample of size n is drawn without replacement
k members of the population possess a certain characteristic
(the k successes)
x denotes the number of successes within the sample of size n
Then X follows the hypergeometric distribution if its probability
mass function is given by
P(X = x) = C(k, x) C(N − k, n − x) / C(N, n)  where x = 0, 1, ..., n
We say that X → H(N, n, k)
This distribution mostly arises in sampling without replacement
from a finite population whose elements are classified into two
categories
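A sketch of the hypergeometric pmf; counting aces (k = 4) in a 5-card hand from a 52-card deck is an assumed example:

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, N, n, k):
    # P(X = x) = C(k, x) C(N - k, n - x) / C(N, n)
    return Fraction(comb(k, x) * comb(N - k, n - x), comb(N, n))

# the number of aces (k = 4) in a 5-card hand from a 52-card deck
N, n, k = 52, 5, 4
probs = [hypergeom_pmf(x, N, n, k) for x in range(min(n, k) + 1)]
assert sum(probs) == 1  # a valid pmf
print(probs[1])  # probability of exactly one ace
```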

Introduction to Normal Distribution

If X is a continuous random variable following the normal
distribution with mean µ and standard deviation σ, then its
probability density function is
p(x) = (1/(σ√(2π))) e^(−(x − µ)²/(2σ²))  where −∞ < x < ∞
Discovered by the mathematician De Moivre, who obtained the
equation while dealing with problems arising in games of
chance.
But heavily used by Gauss to describe the theory of accidental
errors of measurement involved in the calculation of the orbits of
planets and asteroids. Hence it is also known as the Gaussian
distribution.
We say that X → N(µ, σ²)

Standard Normal Variate

If X is a continuous random variable following the normal distribution
with mean µ and standard deviation σ, then define a new variable
Z = (X − µ)/σ. This is called a standard normal variate.
Special features
E(Z) = E[(X − µ)/σ] = (1/σ)E(X − µ) = (1/σ)(E(X) − µ) = 0
Var(Z) = Var[(X − µ)/σ] = (1/σ²)Var(X − µ) = (1/σ²)Var(X) = 1
Hence the standard normal variate is a normal variate with mean 0
and standard deviation 1, that is, Z → N(0, 1)
Hence the pdf is p(z) = (1/√(2π)) e^(−z²/2)

Probability tables are available for the standard normal
variate at various levels of significance
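A simulation sketch (not a proof) of standardisation: the values µ = 100 and σ = 15 are assumed examples, and `random.gauss` supplies the normal samples.

```python
import random

random.seed(42)
mu, sigma = 100.0, 15.0
xs = [random.gauss(mu, sigma) for _ in range(100_000)]

# standardise every observation: Z = (X - mu) / sigma
zs = [(x - mu) / sigma for x in xs]

z_mean = sum(zs) / len(zs)
z_var = sum(z * z for z in zs) / len(zs) - z_mean**2
print(round(z_mean, 2), round(z_var, 2))  # close to 0 and 1
```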


Features of Normal distribution

The graph of p(x) is a bell shaped curve symmetrical about


the value X = µ or Z = 0
Since the distribution is symmetric mean, mode and median
coincide
As x increases numerically (on either sides) the value of p(x)
decreases rapidly.
1
The maximum value of [p(x)]max = √ which means that
σ 2π
it is inversely proportional to standard deviation.For large σ ,
p(x) decreases (which means curve tends to flatten) and for
small values of σ, p(x) increases(which means that the curve
has a sharp peak).
The X axis is an asymptote to the curve
Introduction to Probability and Distributions

Applications of Normal Distribution

If X is a normal variate with mean µ and variance σ², then
P(µ − 3σ < X < µ + 3σ) = P(−3 < Z < +3) = 0.9973
P(|Z| > 3) = 1 − 0.9973 = 0.0027
That means the probability of the standard normal variate going
outside the limits ±3 is practically zero. This forms the basis for
the entire theory of large samples.
Most of the discrete probability distributions tend to the normal
as the number of trials increases indefinitely.
The entire theory of small-sample tests is based on the
assumption that the parent population from which the sample
has been drawn follows the normal distribution.
The Central Limit Theorem
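The three-sigma figure above can be recomputed from the standard normal cdf, since P(−3 < Z < 3) = Φ(3) − Φ(−3) = erf(3/√2):

```python
from math import erf, sqrt

# P(mu - 3*sigma < X < mu + 3*sigma) = P(-3 < Z < 3) = erf(3 / sqrt(2))
p = erf(3 / sqrt(2))
print(round(p, 4))  # 0.9973
```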

Exponential Distribution

A continuous random variable X is said to follow the exponential
distribution (or negative exponential distribution) if its probability
density function is of the form p(x) = λe^(−λx) where x ≥ 0, λ > 0
This is a special case of the gamma distribution
If λ = 1, the distribution p(x) = e⁻ˣ is called the standard
exponential distribution.
The mathematics of the exponential distribution is often simple,
so it is possible to obtain explicit formulas in terms
of elementary functions without troublesome quadratures.
Hence models constructed from exponential variables are used
as approximate representations of other models
We say X → Exp(λ)


Properties of the Exponential Distribution

The probability density curve is a J-shaped curve
Mean of the distribution: 1/λ
Standard deviation of the distribution: 1/λ
If F(x) denotes the distribution function P(X ≤ x),
then P(X > x) = 1 − F(x) is called the survival function in
biomedical applications and the reliability function in
industrial applications.
If X → Exp(λ) then P(X < t₁ + t₂ | X > t₁) = P(X < t₂)
where t₁ > 0, t₂ > 0. This is called the lack-of-memory
property. Conversely, if X is a continuous random variable
satisfying this property, then X follows the exponential distribution.
In reliability theory, p(x)/(1 − F(x)) is called the hazard rate.
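The lack-of-memory property can be sketched analytically from the survival function S(x) = e^(−λx); λ = 0.5 and the points t₁, t₂ are assumed example values:

```python
from math import exp

lam = 0.5
S = lambda x: exp(-lam * x)   # survival function P(X > x) = e^(-lam*x)

# lack of memory: P(X > t1 + t2 | X > t1) = P(X > t2)
t1, t2 = 1.3, 2.7
lhs = S(t1 + t2) / S(t1)      # conditional survival
rhs = S(t2)
assert abs(lhs - rhs) < 1e-12
```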

Applications of the Exponential Distribution

Random processes taking place over time, under certain
assumptions, follow the Poisson distribution. In such cases the
waiting time between occurrences follows the exponential
distribution. For example:
Trucks arriving per hour at a warehouse follow a Poisson process
under certain assumptions; the waiting time between two
successive arrivals then follows the exponential distribution.
Similarly, user log-ons in a large computer network can
follow a Poisson distribution; the waiting time between two
successive log-ons then follows the exponential distribution.
Some kinds of electrical components, such as fuses, safety
valves, glassware and transistors, do not experience an aging
process, hence their lifetime is reasonably assumed to be
exponential. One of the interesting facts is that the probability
that a component will survive t more units, given that it has
already survived s units, is the same as the probability that a
newly installed component will survive more than t units. This
reflects the lack-of-memory property

Chi-Square Distribution

Given a standard normal variate Z = (X − µ)/σ, the square Z² is a
chi-square (χ²) variate with 1 degree of freedom.
Hence if X₁, X₂, ..., Xₖ are k independent normal variates with
means µ₁, µ₂, ..., µₖ and standard deviations σ₁, σ₂, ..., σₖ, then the
variate
χ² = ((X₁ − µ₁)/σ₁)² + ((X₂ − µ₂)/σ₂)² + ... + ((Xₖ − µₖ)/σₖ)²
is called a chi-square variate with k degrees of freedom.
Its pdf is given by
p(χ²) = (1/(2^(k/2) Γ(k/2))) e^(−χ²/2) (χ²)^(k/2 − 1)  where 0 < χ² < ∞
We say that X → χ²(k)
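A simulation sketch of the definition: summing k squared standard normal variates and checking that the sample mean is near k (the chi-square mean equals its degrees of freedom); k = 4 is an assumed example.

```python
import random

random.seed(7)
k = 4  # degrees of freedom

# a chi-square variate with k d.f. is the sum of k squared N(0,1) variates
samples = [sum(random.gauss(0, 1) ** 2 for _ in range(k))
           for _ in range(100_000)]

mean = sum(samples) / len(samples)
assert abs(mean - k) < 0.1  # the chi-square mean equals its d.f.
```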

Chi-Square Distribution

Note that for higher degrees of freedom this distribution
tends to the normal distribution

t Distribution

Given that X → N(0, 1) and Y → χ² with n degrees of freedom
(X and Y independent), the variable t defined by t = X/√(Y/n)
is said to follow the t distribution with parameter n (t → tₙ)
The pdf of the distribution is given by
p(t) = (1/(√n B(1/2, n/2))) (1 + t²/n)^(−(n+1)/2)  where −∞ < t < ∞
We say t → tₙ

t Distribution

Note that for higher degrees of freedom this distribution
tends to the normal distribution

F Distribution

Suppose X and Y are independent χ² variates with n₁ and n₂
degrees of freedom respectively. Then F = (X/n₁)/(Y/n₂) is said
to follow the F distribution with n₁ and n₂ degrees of freedom,
F → F(n₁, n₂)
The pdf of the distribution is given by
p(F) = (n₁/n₂)^(n₁/2) F^(n₁/2 − 1) / [B(n₁/2, n₂/2) (1 + (n₁/n₂)F)^((n₁+n₂)/2)]
where 0 < F < ∞
We say that F → F(n₁, n₂)

F Distribution

Note that for higher degrees of freedom n₁ and n₂ this
distribution tends to the normal distribution

A short note

The three distributions t, χ² and F are interrelated.
All three are used as sampling distributions to
infer various population parameters from samples
of size less than 30 (small samples)

Thank You!!
