Theory of Probability
Uncertainty and Inductive Reasoning
Degree of Uncertainty
Uncertainty: Objective Approach
Example:
Uncertainty: Subjective Approach
In the subjective approach there is no question of repetition
of observations.
Here uncertainty only means absence of knowledge about
the evidence and extended evidence, before the generating
operations are performed.
Because of this, the scope for induction is somewhat wider
in the subjective than in the objective approach.
Example:
1. The number of working hours that might be lost to a
contract labour strike in the next six months.
2. The exchange rate tomorrow morning.
Meaning of Probability
Various Aspects
Meaning of probability
For any set of interest the probability that the uncertain evidence
will belong to it is identified with the corresponding idealized
long-term relative frequency.
Gambling and Games of Chances
A Fascinating History of Development of
Theory of Probability
Cardano: an unrecognized pioneer
Gerolamo Cardano
He wrote a book entitled Liber de Ludo Aleae (The Book on Games
of Chance) around 1564
Cardano was jailed briefly for heresy (in part for casting the
horoscope of Jesus).
Basic Ideas and Rules of Probability Theory:
Conceptualized by Cardano
5. Cardano also correctly formulates the product rule for computing
the chance of the simultaneous occurrence of events defined on
independent trials. Details will be discussed later.
6. In the case of throwing two dice the odds on getting at least one ace,
deuce, or trey are 3:1.
Cardano states that if the player who wants an ace, deuce, or trey
wagers three ducats [a standard unit of currency at that time] and
the other player one, then the former would win three times and
would gain three ducats and the other once and would win three
ducats; therefore in the circuit of four throws [impliedly in the long
run] they would always be equal.
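The 3:1 odds quoted above can be checked by direct enumeration. The following is a minimal sketch, assuming two fair six-sided dice; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

# "Success": at least one die shows an ace (1), deuce (2), or trey (3).
favourable = sum(1 for a, b in outcomes if a <= 3 or b <= 3)

p = Fraction(favourable, len(outcomes))  # probability of success
odds = p / (1 - p)                       # odds in favour of success
print(p, odds)
```

With a stake of three ducats against one, the expected gain of either player is zero at these odds, which is the sense in which the players "would always be equal".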
Galileo Galilei: Sought to resolve a
puzzle about a dice game
Galileo Galilei
15 February 1564 to 8 January 1642
Basic Ideas and Rules of Probability Theory:
Conceptualized by Galileo
Galileo pointed out that there is a very simple explanation, namely that
some numbers are more easily and more frequently made than others,
which depends on their being able to be made up with more variety of
numbers.
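The slide does not spell out the puzzle. Assuming it is the classic one usually attributed to Galileo (why a total of 10 turns up more often than a total of 9 when three dice are thrown), his explanation can be sketched by counting ordered outcomes:

```python
from collections import Counter
from itertools import product

# Count how many of the 6**3 = 216 ordered outcomes give each total.
counts = Counter(sum(roll) for roll in product(range(1, 7), repeat=3))

ways_9, ways_10 = counts[9], counts[10]
print(ways_9, ways_10, 6**3)
```

Both totals can be written as six unordered triples, but the triples differ in how many orderings each admits, which is the "variety of numbers" Galileo refers to.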
Probability is officially born: Pascal
and Fermat
Blaise Pascal
19 June 1623 to 19 August 1662
Pierre de Fermat
17 August 1601 (or 1607) to 12 January 1665
First Published Book on Probability
Christiaan Huygens
14 April 1629 to 8 July 1695
Applications: Probability in Finance:
Consider a game with only two players: they alternate moves, each is
immediately informed of the other's moves, and one or the
other wins.
Consider a straightforward but rigorous framework for elaboration, with no
extraneous mathematical or philosophical baggage, of two ideas that are
fundamental to both probability and finance:
Probability in Marketing :
Elementary Calculus
Probability:
Classical and Frequentist
Approaches
Calculus of Probability :
Connections with Set Theory
Set Theory                      Probability Theory                    Notation
Intersection of sets A and B    Joint occurrence of events A and B    $A \cap B$
Classical Definition of Probability:
As in Théorie analytique des probabilités
by Pierre-Simon Laplace
Simple Examples
Frequentist Definition of Probability :
For any given event, only one of two possibilities may hold: it
occurs or it does not.
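$$P(A) = \lim_{n \to \infty} \frac{n_A}{n},$$
where $n_A$ is the number of occurrences of the event $A$ in $n$ repetitions of the experiment.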
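The limiting relative frequency can be illustrated by simulation. This is only a sketch, assuming a fair die and the event "a 1 is thrown"; the function name and seed are hypothetical, and a finite simulation can suggest the limit but not prove it.

```python
import random

def relative_frequency(n_trials, seed=0):
    """Relative frequency of the face 1 in n_trials throws of a fair die."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(1 for _ in range(n_trials) if rng.randint(1, 6) == 1)
    return hits / n_trials

# The relative frequency settles near 1/6 as the number of trials grows.
for n in (100, 10_000, 100_000):
    print(n, relative_frequency(n))
```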
Traveler's Choices
Other Applications
Combinatorics :
Arrangement of r balls in n cells
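Number of arrangements of r balls in n cells:

Balls are distinguishable, exclusion principle followed:
$(n)_r = \frac{n!}{(n-r)!}$ if $r \le n$, 0 otherwise
Balls are distinguishable, exclusion principle not followed (Maxwell-Boltzmann statistics):
$n^r$
Balls are indistinguishable, exclusion principle followed (Fermi-Dirac statistics):
$\binom{n}{r}$
Balls are indistinguishable, exclusion principle not followed (Bose-Einstein statistics):
$\binom{n+r-1}{r}$
Special case: no cell remains empty:
$\binom{r-1}{n-1}$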
Application
Industrial Implications
If 12 accidents occur in a coal mine each year, then
practically every year will contain months with two or more
accidents. The probability that each of the 12 months has exactly one
accident is only 0.0000537.
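The figure 0.0000537 can be reproduced with the occupancy model above. The sketch below assumes the 12 accidents fall independently and with equal probability into the 12 months (differences in month length are ignored), so there are 12! favourable assignments out of 12^12.

```python
from math import factorial

# 12 distinguishable accidents, 12 months: every month gets exactly one
# accident in 12! of the 12**12 equally likely assignments.
p_one_each = factorial(12) / 12**12
print(p_one_each)
```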
Extensions
More Examples
Application In Industrial Quality Control
Items are sampled from a collection of items and inspected for
defects. Assume that there are $n$ defective items in a lot of $N$
items. What is the probability of finding $k$ defective items
in a sample of $m$ items?
Problems of this type lead to the hypergeometric
distribution.
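A minimal sketch of the resulting hypergeometric probability follows. Only the symbol n (number of defectives) comes from the slide; the lot size N, sample size m, count k, and the numerical values used are assumptions introduced for illustration.

```python
from math import comb

def hypergeom_pmf(k, N, n, m):
    """P(k defectives in a sample of m drawn without replacement
    from a lot of N items that contains n defectives)."""
    if k < max(0, m - (N - n)) or k > min(m, n):
        return 0.0  # outside the support
    return comb(n, k) * comb(N - n, m - k) / comb(N, m)

# Hypothetical lot: 50 items, 5 defective, sample of 10.
pmf = [hypergeom_pmf(k, 50, 5, 10) for k in range(11)]
print([round(p, 4) for p in pmf])
```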
Estimating Population Size of Fish in a
Lake [capture-recapture ]
Consider the following experiment in an attempt to estimate
the number of fish in a lake. First, $n_1$ fish are captured,
marked, and released. At a later time, $r$ fish are caught,
$k$ of them bearing the mark of the original capture. Assuming
the size of the fish population is $n$, the probability of
getting $k$ marked fish in the second capture is
$$q_k(n) = \frac{\binom{n_1}{k}\binom{n-n_1}{r-k}}{\binom{n}{r}}.$$
In this case, $(n_1, r, k)$ are known but $n$ is unknown. We can estimate $n$ or
construct confidence intervals using the likelihood (the probability of the
observed data as a function of the unknown parameter $n$).
For example, if $k = 100$ and $n_1 = r = 1000$, then $n$ belongs to
(8500; 12000) with approximately 93% confidence.
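The likelihood-based estimate can be sketched numerically. The symbols n1 (first catch), r (second catch), k (marked fish recaught), and n (population size) are naming assumptions, since the slide's own symbols were lost; the search range is also an arbitrary choice.

```python
from math import lgamma

def log_comb(a, b):
    """Natural log of the binomial coefficient C(a, b)."""
    return lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)

def log_likelihood(n, n1, r, k):
    """Log of the hypergeometric probability of k marked fish
    in a second catch of r, for population size n."""
    return log_comb(n1, k) + log_comb(n - n1, r - k) - log_comb(n, r)

n1, r, k = 1000, 1000, 100
candidates = range(n1 + r - k, 30001)  # n cannot be smaller than n1 + r - k
n_hat = max(candidates, key=lambda n: log_likelihood(n, n1, r, k))
print(n_hat, n1 * r // k)
```

The likelihood is essentially flat between 9999 and 10000 for these data, so either value may be reported as the maximiser; both agree with the intuitive estimate n1 * r / k.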
Criticism of
Classical Definition of Probability :
Mathematicians find the definition to be circular.
The probability for a "fair" coin is... A "fair" coin is defined by a
probability of...
More on $\sigma$-algebra
Axiomatic Definition
of Probability
and
Probability Laws
Axiomatic Definition of Probability
[By Andrey Kolmogorov]
The probability of an event $A$, denoted by $P(A)$, is a set function (also
called a measure) defined on a sample space $\Omega$ with $\sigma$-field $\mathcal{F}$ (also called the event
space), satisfying the following axioms:
Axiom of nonnegativity: The probability of an event is a non-negative
real number:
$$\forall A \in \mathcal{F}, \quad P(A) \ge 0$$
Axiom of unity: The probability that at least one of the elementary
events in the entire sample space will occur is 1. More specifically,
there are no elementary events outside the sample space.
$$P(\Omega) = 1$$
Axiom of countable additivity: For any countable sequence of
disjoint (synonymous with mutually exclusive) events $A_1, A_2, \ldots$,
$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$$
Andrey Kolmogorov
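Important Results That Follow From
The Probability Axioms
Result-1: For the impossible event $\emptyset$, we necessarily have $P(\emptyset) = 0$.
For mutually exclusive events $A_1, A_2, \ldots, A_n$:
$$P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i)$$
Result-2.A. If $A_i$, $i = 1, 2, \ldots, n$, are exhaustive and mutually
exclusive events in $\Omega$, then
$$\sum_{i=1}^{n} P(A_i) = 1$$
Result-2.B. Rule for the complementary probability of any event $A$:
For any event $A$, $P(A^c) = 1 - P(A)$.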
Important Results - Continued
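Check:
$$\frac{\binom{10-4+4-1-1}{10-4}}{\binom{10+4-1}{10}} = \frac{\binom{8}{6}}{\binom{13}{10}} = \frac{8!\,10!\,3!}{6!\,2!\,13!} = \frac{8 \cdot 7 \cdot 3}{11 \cdot 12 \cdot 13} = \frac{14}{143} = 0.0979$$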
Example
A large shopping complex has 15 entry gates. Usually, one security
guard is deployed at each of these gates. Guards can
usually chat with their colleagues deployed at the adjacent (both right
and left) gates. The guards at the 1st and 15th gates are able to talk
with only one colleague each. It was observed from past CCTV
footage that two guards, say A and B, gossip more and do not take
the job seriously whenever they are deployed at adjacent gates.
Management has ordered the chief security officer that A and B
should be deployed so that there are 10 other guards
between them. One day, the chief security officer was absent and
another person, who had no idea about the order, allotted the 15 guards
to the 15 gates at random. What is the probability that the requirement
is met even in that case?
Hint / Answer
$$\text{Required Probability} = \frac{2! \cdot 4 \cdot \binom{13}{10} \cdot 10! \cdot 3!}{15!} = \frac{4}{105} \approx 0.0381$$
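The answer can be cross-checked without the factorials. Since only the gates assigned to the two named guards matter, it suffices to count ordered pairs of gates with exactly 10 gates between them. The function below is an illustrative sketch, not part of the original solution.

```python
from fractions import Fraction
from itertools import permutations

def prob_gap(n_gates, gap):
    """Probability that two particular guards, placed at random among
    n_gates gates, have exactly `gap` gates between them."""
    pairs = list(permutations(range(1, n_gates + 1), 2))  # ordered gate pairs
    favourable = sum(1 for a, b in pairs if abs(a - b) == gap + 1)
    return Fraction(favourable, len(pairs))

p_requirement = prob_gap(15, 10)
print(p_requirement, float(p_requirement))
```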
When Exact Probability is Untraceable
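$$P\left(\bigcup_{i=1}^{n} A_i\right) \le \sum_{i=1}^{n} P(A_i).$$
$$P\left(\bigcap_{i=1}^{n} A_i\right) \ge 1 - \sum_{i=1}^{n} P(A_i^c)$$
$$P\left(\bigcap_{i=1}^{n} A_i\right) \ge \sum_{i=1}^{n} P(A_i) - (n - 1)$$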
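Example
$$P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i) - \sum_{i<j} P(A_i \cap A_j) + \sum_{i<j<k} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n-1} P(A_1 \cap A_2 \cap \cdots \cap A_n)$$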
Definition of Conditional Probability
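Consider the probability space $(\Omega, \mathcal{F}, P)$. The conditional
probability of an event $A$, given that another event $B$ also
belonging to the same $\mathcal{F}$ has occurred, denoted as $P(A \mid B)$, is defined
by
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
provided $P(B) > 0$.
If $P(B) = 0$, $P(A \mid B) = 0$.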
The above definition satisfies all the probability axioms discussed
earlier. [Please check for yourself.]
Justification of the Definition of Conditional
Probability in the Light of Three Axioms
(i) $P(B) > 0$ by definition and $P(A \cap B) \ge 0$ by the axiom of nonnegativity.
Therefore, $P(A \mid B) = \frac{P(A \cap B)}{P(B)} \ge 0$.
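(ii) If $B \subseteq A$, $P(A \mid B) = 1$.
If $B \subseteq A$, $A \cap B = B$; therefore $P(A \cap B) = P(B)$.
Hence,
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(B)}{P(B)} = 1.$$
$$P(A \cap B) = P(B) \cdot P(A \mid B).$$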
Theorem of Compound Probability
$$P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1) \cdot P(A_2 \mid A_1) \cdots P(A_n \mid A_1 \cap A_2 \cap \cdots \cap A_{n-1})$$
Probability of Complementary Event
with Conditioning Event
So we have
$$P(A^c \mid B) = 1 - P(A \mid B).$$
Law of independence
How to interpret the equation
$$P(B \mid A) = P(B)?$$
It shows that A's occurrence has had no impact on B. We then say
that B is independent of A.
Testing Independence of Two Events
Then $P(A) = \frac{1}{2} = P(B)$ and $P(A \cap B) = \frac{1}{4}$. So,
$$P(A \cap B) = P(A) \cdot P(B).$$
Difference between Mutually Exclusive
Events and Independent Events
Note carefully that, if A and B are mutually exclusive, then
$P(A \cap B) = 0$. From the definition of conditional probability,
we see that
$$P(A \mid B) = P(B \mid A) = 0$$
Difference between Pairwise Independence
and Complete Independence of Events
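If A, B and C are three events in $\mathcal{F}$, they will be pairwise
independent if
$$P(A \cap B) = P(A) \cdot P(B)$$
$$P(B \cap C) = P(B) \cdot P(C)$$
$$P(A \cap C) = P(A) \cdot P(C)$$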
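A small example shows that the three pairwise conditions do not by themselves give complete independence, which additionally requires P(A ∩ B ∩ C) = P(A) P(B) P(C). The events below (two fair coin tosses) are a standard textbook illustration and are not taken from the slides.

```python
from fractions import Fraction
from itertools import product

omega = list(product("HT", repeat=2))  # two fair coin tosses, 4 outcomes

def P(event):
    """Classical probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

A = lambda w: w[0] == "H"   # first toss is a head
B = lambda w: w[1] == "H"   # second toss is a head
C = lambda w: w[0] == w[1]  # both tosses show the same face

pairwise = all(
    P(lambda w: E(w) and F(w)) == P(E) * P(F)
    for E, F in ((A, B), (B, C), (A, C))
)
mutual = P(lambda w: A(w) and B(w) and C(w)) == P(A) * P(B) * P(C)
print(pairwise, mutual)
```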
Some Final Remarks on
Achievements and Failures of
Gerolamo Cardano
Cardano: an unrecognized pioneer
If the three dice are thrown thrice, Cardano correctly finds that the
odds for getting the event at least once are a little less than 1 to 12.
Two problems Cardano discussed but
failed to solve correctly
1. Problem of minimum number of trials: What should be the
minimum value of r, the number of throws of two dice,
which would ensure at least an even chance for the
appearance of one or more double sixes?
For Brainstorming
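For readers who want to check their answer to the first problem, here is a short sketch. It assumes fair dice and independent throws, so the chance of at least one double six in r throws is 1 - (35/36)^r; the function name is illustrative.

```python
def min_throws(p_success, target=0.5):
    """Smallest r with P(at least one success in r independent trials) >= target."""
    r = 1
    while 1 - (1 - p_success) ** r < target:
        r += 1
    return r

r_min = min_throws(1 / 36)  # a double six has probability 1/36 per throw
print(r_min, 1 - (35 / 36) ** r_min)
```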
Some Important Theorems
Theorem of Total Probability
Example: Theorem of Total Probability
Bayes' Theorem
Laplace form of Bayes' Theorem
Rev. Thomas Bayes
Application of Bayes' Theorem
Bayesian (or epistemological) interpretation
of the Theorem
where
$$S_k = \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}).$$
Example-2: Both the bus and you get to the bus stop at random
times between 12noon and 1pm. When the bus arrives, it waits
for 5 minutes before leaving. When you arrive, you wait for 20
minutes before hiring a cab if the bus doesn't come. What is the
probability that you catch the bus?
An Extension of Classical Definition to
Geometrical Probability
$$P = \frac{60^2 - \frac{55^2}{2} - \frac{40^2}{2}}{60^2} = \frac{103}{288}$$
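The geometric argument can be sketched as an area computation in the 60-by-60 square of (bus arrival, your arrival) times in minutes. You miss the bus if you arrive more than 5 minutes after it or more than 20 minutes before it; each miss region is a right triangle. Variable names are illustrative.

```python
from fractions import Fraction

T, bus_wait, my_wait = 60, 5, 20  # minutes

# Two right triangles in the T x T square where the bus is missed:
# you arrive > bus_wait after the bus, or > my_wait before it.
miss_area = Fraction((T - bus_wait) ** 2, 2) + Fraction((T - my_wait) ** 2, 2)
p_catch = 1 - miss_area / T**2
print(p_catch, float(p_catch))
```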
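An Example from the Book:
Bayesian Methods in Finance
[Authors: S. T. Rachev, J. S. J. Hsu, B. S. Bagasheva, and F. J. Fabozzi]
$$P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B \mid A) \cdot P(A) + P(B \mid A^c) \cdot P(A^c)} = \frac{0.75 \times 0.4}{0.75 \times 0.4 + 0.35 \times 0.6} = \frac{0.3}{0.3 + 0.21} = \frac{0.3}{0.51} = 0.5882.$$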
1. Model Selection
2. Bayes Classification
Bayesian Method in Information Systems
(For Network Security and Spam Filtering)
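$$P(\text{spam} \mid \text{free}) = \frac{P(\text{spam}) \cdot P(\text{free} \mid \text{spam})}{P(\text{free})} = \frac{\frac{600}{1000} \cdot \frac{300}{600}}{\frac{400}{1000}} = \frac{300}{400} = 0.75$$
$$P(\text{spam} \mid \text{Viagra}) = \frac{P(\text{spam}) \cdot P(\text{Viagra} \mid \text{spam})}{P(\text{Viagra})} = \frac{\frac{600}{1000} \cdot \frac{90}{600}}{\frac{100}{1000}} = \frac{90}{100} = 0.90$$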
Our prior probability of spam (given the training data) is 0.6; if
we see a message containing the word "free" we bump that up to
0.75, and if we see "Viagra" we bump it up to 0.90.
There are several problems with this equation. The first is the
denominator: usually one is not going to record and train on
all subsets of words (let's stipulate that), so the probability of
"Viagra" co-occurring with "free" is unknown. The same problem
arises in the numerator, where one would need to know the
probability of that pair of terms co-occurring in a spam message.
You may or may not agree with the assumption, but that's what
it means.
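$$P(\text{spam} \mid \text{free} \cap \text{Viagra}) + P(\text{ham} \mid \text{free} \cap \text{Viagra}) = 1.$$
Therefore,
$$\frac{P(\text{spam}) \cdot P(\text{free} \cap \text{Viagra} \mid \text{spam})}{P(\text{free} \cap \text{Viagra})} + \frac{P(\text{ham}) \cdot P(\text{free} \cap \text{Viagra} \mid \text{ham})}{P(\text{free} \cap \text{Viagra})} = 1.$$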
This gives:
$$P(\text{free} \cap \text{Viagra}) = P(\text{spam}) \cdot P(\text{free} \cap \text{Viagra} \mid \text{spam}) + P(\text{ham}) \cdot P(\text{free} \cap \text{Viagra} \mid \text{ham}).$$
This replaces the calculation of the joint probability (
This, then, is the desired denominator for our probability
calculation. Note that the 1st term is the same as our
numerator, the other term is the analogous calculation
conditioned on ham rather than spam. The final formula, then,
for two pieces of evidence is:
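$$P(\text{spam} \mid \text{free} \cap \text{Viagra}) = \frac{P(\text{spam}) \cdot P(\text{free} \mid \text{spam}) \cdot P(\text{Viagra} \mid \text{spam})}{P(\text{spam}) \cdot P(\text{free} \mid \text{spam}) \cdot P(\text{Viagra} \mid \text{spam}) + P(\text{ham}) \cdot P(\text{free} \mid \text{ham}) \cdot P(\text{Viagra} \mid \text{ham})}$$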
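The two-word formula can be evaluated with the counts from the worked example (1000 training messages, 600 spam; "free" in 400 messages, 300 of them spam; "Viagra" in 100 messages, 90 of them spam). The ham counts are not stated on the slides; the sketch below derives them by subtraction, which is an assumption, as is the conditional independence of the two words within each class.

```python
# Counts from the worked example; ham counts are derived by subtraction.
n_total, n_spam = 1000, 600
n_ham = n_total - n_spam
free_total, free_spam = 400, 300
viagra_total, viagra_spam = 100, 90

p_spam, p_ham = n_spam / n_total, n_ham / n_total
p_free_spam = free_spam / n_spam
p_free_ham = (free_total - free_spam) / n_ham
p_viagra_spam = viagra_spam / n_spam
p_viagra_ham = (viagra_total - viagra_spam) / n_ham

# Single-word posteriors (Bayes' theorem).
post_free = p_spam * p_free_spam / (free_total / n_total)
post_viagra = p_spam * p_viagra_spam / (viagra_total / n_total)

# Two-word posterior under the conditional-independence assumption.
numerator = p_spam * p_free_spam * p_viagra_spam
denominator = numerator + p_ham * p_free_ham * p_viagra_ham
post_both = numerator / denominator
print(post_free, post_viagra, post_both)
```

Seeing both words pushes the posterior above either single-word value, which is what one would expect when both words are individually more common in spam than in ham.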
In Orange County, 51% of the adults are males. (and assume that
the other 49% are females) One adult is randomly selected for a
survey involving credit card usage.
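With $M$ = male, $F$ = female, and $C$ the observed characteristic:
$$P(M \mid C) = \frac{P(M) \cdot P(C \mid M)}{P(M) \cdot P(C \mid M) + P(F) \cdot P(C \mid F)} = \frac{0.51 \times 0.095}{0.51 \times 0.095 + 0.49 \times 0.017} = 0.85329.$$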
Application of Bayes Rule in Market Survey
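$$P(A_1 \mid B) = \frac{P(A_1) \cdot P(B \mid A_1)}{P(A_1) \cdot P(B \mid A_1) + P(A_2) \cdot P(B \mid A_2) + P(A_3) \cdot P(B \mid A_3)} = \frac{0.8 \times 0.04}{0.8 \times 0.04 + 0.15 \times 0.06 + 0.05 \times 0.09} = 0.7033$$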
Application of Bayes Rule in Traffic
Management and Crime Investigation
A certain town has two taxi companies: Blue Birds, whose cabs
are blue, and Night Owls, whose cabs are black. Blue Birds
has 125 taxis in its fleet, and Night Owls has 375. Late one
night, there is a hit-and-run accident involving a taxi. The
town's 500 taxis were all on the streets at the time of the
accident. A witness saw the accident and claims that a blue taxi
was involved. At the request of the police, the witness
undergoes a vision test under conditions similar to those on the
night in question. Presented repeatedly with a blue taxi and a
black taxi, in random order, he shows he can successfully
identify the color of the taxi 9 times out of 10. Which company
is more likely to have been involved in the accident?
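A sketch of the Bayes computation for this problem follows. It assumes that, before the testimony, every one of the 500 taxis is equally likely to have been involved, and that the witness is equally accurate (9 out of 10) for both colours; neither assumption is stated explicitly in the problem.

```python
from fractions import Fraction

prior_blue = Fraction(125, 500)   # Blue Birds share of the fleet
prior_black = Fraction(375, 500)  # Night Owls share of the fleet
accuracy = Fraction(9, 10)        # P(witness reports the true colour)

# P(taxi was blue | witness says "blue") by Bayes' theorem.
says_blue = prior_blue * accuracy + prior_black * (1 - accuracy)
post_blue = prior_blue * accuracy / says_blue
post_black = 1 - post_blue
print(post_blue, post_black)
```

Under these assumptions the testimony outweighs the smaller size of the blue fleet, so Blue Birds is the more likely company, though far from certain.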
Two Problems of Theoretical Nature
+ +
2 + + + 3( )
= 1 ,, = 1
11 <<
By Poincaré's theorem, the probability that at least one cell is
empty is given by $S_1 - S_2 + S_3 - \cdots + (-1)^{n-1} S_n$.
More On The Classical Occupancy Problem
, = 1 0 ,
+
= (1) 1
0
Urn Models for Aftereffect.
Examples:
The return on an investment in a span (period) of one-year;
The closing price of a stock in NSE;
The number of customers entering a shopping complex
The sales volume of a store on a particular day
The turnover rate at your organization next year
Types of Random Variables
For Example,
You may win a bid or lose
After flipping, coin may show head or a tail
A customer can be male or female
We often assign numbers such as 0 and 1 to the
possible outcomes in such cases.
Probability Distribution
i. $0 \le p(x) \le 1$ for all $x$
ii. $\sum_{x \in \mathcal{X}} p(x) = 1$; $\mathcal{X}$ being the set of all possible values of $X$.
= , = 0,1,2, , .
0 1, = 1 ,
In practice,
min(, ) and max(0, ).
The Hypergeometric Distribution -
from Urn Problem
=
, = 0,1,2, , .
0 1, = 1 ,
Special case: $n = 1$.
$P[X = x] = p^x (1 - p)^{1 - x}$, $x = 0, 1$; $0 \le p \le 1$.
What is the probability that he will find the ideal candidate in $x$ trials?
$$p(x) = P[X = x] = p(1 - p)^{x-1} \text{ for } x = 1, 2, 3, \ldots$$
Suppose a fair die is thrown repeatedly until the first time a "1"
appears. The probability distribution of the number of times it is
thrown is supported on the infinite set { 1, 2, 3, ... } and is a
geometric distribution with $p = 1/6$.
Problem-5
$$f(x; r, p) = P[X = x] = \binom{x-1}{r-1} p^r (1 - p)^{x - r} \text{ for } x = r, r + 1, r + 2, \ldots$$
Discrete Uniform Distribution
II. The probability that on a particular day not more than 3 CSR will
remain absent:
$$P[X \le 3] = P[X = 0] + P[X = 1] + P[X = 2] + P[X = 3]$$
$$= \binom{25}{0}(0.1)^0(0.9)^{25} + \binom{25}{1}(0.1)^1(0.9)^{24} + \binom{25}{2}(0.1)^2(0.9)^{23} + \binom{25}{3}(0.1)^3(0.9)^{22}$$
$$= 0.0717898 + 0.1994161 + 0.2658881 + 0.2264973 = 0.7635913.$$
Solution To Part III and IV.
III. The probability that on a particular day 3 or more CSR will remain
absent:
$$P[X \ge 3] = 1 - P[X \le 2] = 1 - P[X = 0] - P[X = 1] - P[X = 2]$$
$$= 1 - (0.0717898 + 0.1994161 + 0.2658881) = 0.462906.$$
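The two binomial probabilities above can be recomputed in a few lines. The sketch uses the parameters that appear in the worked solution (25 CSR, each absent with probability 0.1); the helper name is illustrative.

```python
from math import comb

def binom_pmf(x, n, p):
    """P[X = x] for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 25, 0.1
p_at_most_3 = sum(binom_pmf(x, n, p) for x in range(4))       # P[X <= 3]
p_at_least_3 = 1 - sum(binom_pmf(x, n, p) for x in range(3))  # P[X >= 3]
print(p_at_most_3, p_at_least_3)
```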
Much of the older Soviet literature uses $F(x) = P[X < x]$, so that the fourth property
becomes left continuity instead of right continuity.
By definition
$$h(x) = \frac{f(x)}{1 - F(x)}.$$
Inverse distribution function
(quantile function)
We shall see rigorous use of this when we shall study Testing of
Hypothesis in QT-II.
Also note that we shall often use 0.5th , 1st , 2.5th , 5th , 95th, 97.5th,
99th and 99.5th percentile points in Statistical Inference in QT-II.
Solution to Part VII
Quartile Deviation is a measure of dispersion or variability in the
probability distribution.
These concepts can easily be used when we have a set of raw data
and/or frequency distribution.
Quartile Based Skewness Measure
Skewness measures the degree of asymmetry in the probability
distribution or data as the case may be.
As a result, if there are some outliers in the tails, these measures
are highly robust, in the sense that they are not influenced
by the presence of outliers.
This reminds us that we must also study measures that are based
on the entire probability distribution or data, as the case may be.
How to find Average Absenteeism
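$$E(X) = \sum_{i=1}^{k} x_i\, p(x_i)$$
For the binomial distribution, it is:
$$E(X) = \sum_{x=1}^{n} x \binom{n}{x} p^x (1 - p)^{n - x} = np \quad \text{[See Board for Proof]}$$
$$\bar{x} = \sum_{i=1}^{k} x_i \frac{f_i}{n}$$
Note that, according to the relative frequency approach to probability, $f_i / n$ tends to the
probability of $x_i$, which may be given by $p(x_i)$.
$$E(X) = \sum_{i=1}^{k} x_i\, p(x_i)$$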
Problems of Absenteeism (Contd.)
Suppose, there are 10 more call center agents (CSA) who handle the
responsibilities of sales promotion and marketing through outbound
calls. On a particular day their absenteeism follows a binomial
distribution with parameter 0.08.
The probability that on a particular day fewer than 5 CSA will remain
absent is $P[Y < 5] = P[Y \le 4]$.
In this context, we have assumed that at least one CSA will remain
present.
They directed the manager that 25 CSR must be used 24x7 even
at the cost of sales promotion, if necessary.
There will be no one for sales promotion and marketing on a particular
day if and only if the sum of the numbers of absent CSR ($X$) and
absent CSA ($Y$) on that day is 10 or more.
To evaluate: $P[X + Y \ge 10]$.
$$P[X + Y \ge 10] = P[X \ge 0, Y = 10] + P[X \ge 1, Y = 9] + P[X \ge 2, Y = 8] + \cdots + P[X \ge 10, Y = 0]$$
$$= P[X \ge 0]\,P[Y = 10] + P[X \ge 1]\,P[Y = 9] + P[X \ge 2]\,P[Y = 8] + \cdots + P[X \ge 10]\,P[Y = 0]$$
(We assume that X and Y are independent random variables, so that
the multiplication rule of probability applies.)
We can easily evaluate this using a calculator. You can check that the
probability is 0.001103982.
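The sum can also be evaluated programmatically. The sketch below assumes X ~ Binomial(25, 0.1) for absent CSR and Y ~ Binomial(10, 0.08) for absent CSA, independent of each other, as in the problem statements above; the helper names are illustrative.

```python
from math import comb

def binom_pmf(x, n, p):
    """P[X = x] for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binom_upper_tail(x, n, p):
    """P[X >= x] for X ~ Binomial(n, p)."""
    return sum(binom_pmf(j, n, p) for j in range(max(x, 0), n + 1))

# Condition on the number y of absent CSA, then require X >= 10 - y.
p_no_sales_staff = sum(
    binom_pmf(y, 10, 0.08) * binom_upper_tail(10 - y, 25, 0.1)
    for y in range(11)
)
print(p_no_sales_staff)
```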
More Problems with Call Center
Management
Further suppose that at any particular moment number of incoming
calls for the CSR follows a Poisson distribution with rate (average) 8.
Customers do not have to wait if at least one of the 25 (assuming no
absence) CSR is free as the call will automatically go to a free CSR.
What is the probability that at any point of time just one customer
has to wait?
What is the probability that at any point of time more than half of the
CSR will remain free?
Average (Expectation) in Context of
Poisson Distribution
$$E(X) = \sum_{x=0}^{\infty} x\, p(x)$$
provided the sum is finite. In fact, the sum exists iff
$$\sum_{x=0}^{\infty} |x|\, p(x) < \infty.$$
Examples
XVI. What is the probability that at any point of time just one customer
has to wait?
XVIII.What is the probability that at any point of time more than half of
the CSR will remain free?
$$E[g(X)] = \sum_{x=0}^{\infty} g(x)\, P[X = x] = \sum_{x=0}^{\infty} g(x)\, p(x)$$
provided $\sum_{x=0}^{\infty} |g(x)|\, p(x) < \infty$.
Required Probability:
$$P[\text{exactly one customer waits}] = P[X = 26] = 2.513997 \times 10^{-7}$$
Solution to Parts XVII and XVIII
of Call Center Problems
At any point of time at least three customers have to wait iff
$X - 25 \ge 3$.
Required Probability:
$$P[X - 25 \ge 3] = P[X \ge 28] = 1 - P[X \le 27] = 2.925614 \times 10^{-8}$$
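These Poisson probabilities can be reproduced as follows. The sketch assumes X ~ Poisson(8) incoming calls and 25 available CSR, as stated in the problem; the tail is summed directly rather than as 1 minus a cumulative sum to avoid cancellation, and the cut-off of 200 terms is an arbitrary but ample choice.

```python
from math import exp, lgamma, log

def poisson_pmf(x, lam):
    """P[X = x] for X ~ Poisson(lam), computed on the log scale."""
    return exp(-lam + x * log(lam) - lgamma(x + 1))

lam, n_csr = 8, 25
p_one_waits = poisson_pmf(n_csr + 1, lam)                                 # P[X = 26]
p_three_or_more_wait = sum(poisson_pmf(x, lam) for x in range(n_csr + 3, 200))  # P[X >= 28]
p_more_than_half_free = sum(poisson_pmf(x, lam) for x in range(13))       # P[X <= 12]
print(p_one_waits, p_three_or_more_wait, p_more_than_half_free)
```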
Further at any point of time more than half of the CSR will
remain free if customers availing CSR services () at that time
is not more than 12.
$X \sim \mathrm{Bin}(n, p)$, $\mathrm{Var}(X) = np(1 - p)$
$X \sim \mathrm{Poisson}(\lambda)$, $\mathrm{Var}(X) = \lambda$
See Board for proofs.
Examples- Problems XIX and XX
XX. The standard deviation (SD) of the random variable $X$, denoting the
number of customers trying to avail CSR service at a certain time,
where $X \sim \mathrm{Poisson}(8)$, is:
$$\sqrt{8} = 2.828.$$
More Problems : XXI and XXIII
Suppose $X \sim \mathrm{Bin}(n, p)$.
XXV. On a given day, it was found that the total number of absentees in the
two groups taken together is 5. What is the probability that 4 of
them are CSRs?
1
Check that = .
2
1 1
= 1 = 1= 1= .
= 1 =
Some General rules for
Expectation and Variance
The expectation of any constant $c$ is the same constant: $E(c) = c$.
For any two random variables $X$ and $Y$, the sum law of expectation states:
+ = + = + , .
Always = 0. Similarly = 0.
Let $X_i$ denote the number of candidates who need to be interviewed to fill the $i$-th
vacancy.
As a consequence:
$$E(X) = E\left(\sum_{i=1}^{r} X_i\right) = \sum_{i=1}^{r} E(X_i) = \frac{r}{p}.$$
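$E(X) = 25$, $\mathrm{Var}(X) = 4^2 = 16$
Further, $Y = 0.3X - 6$.
Therefore,
$$E(Y) = 0.3\,E(X) - 6 = 0.3 \times 25 - 6 = 1.5$$
and
$$\mathrm{Var}(Y) = 0.3^2\, \mathrm{Var}(X) = 0.09 \times 16 = 1.44$$
Consequently, $\mathrm{SD}(Y) = \sqrt{\mathrm{Var}(Y)} = 1.2$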
The most common interpretation is that you are willing to risk Rs. 4
against someone else's Rs. 1 on the outcome of the GST bill vote.
Specifically, if the GST bill passes the floor test in the upper house, you get Rs.
1, and if not, you pay Rs. 4.
When you give odds on something happening, we will call these odds for
($O(f)$).
How does this relate to your belief about the underlying probability of the
event?
If we let $p$ be the probability of the event happening, then the situation can
be diagrammed as in the next slide:
Application in Fair Betting
$$p = \frac{O(f)}{O(f) + 1}; \qquad O(f) = \frac{p}{1 - p}$$
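The conversion between odds for and probability can be sketched in two small functions. The break-even reasoning is that a fair bet has zero expected gain: winning Rs. 1 with probability p and losing Rs. O(f) with probability 1 - p. The function names are illustrative.

```python
def prob_from_odds_for(odds_for):
    """Break-even probability implied by odds of `odds_for` to 1 in favour."""
    return odds_for / (odds_for + 1)

def odds_for_from_prob(p):
    """Odds in favour (to 1) implied by probability p, for 0 <= p < 1."""
    return p / (1 - p)

# Risking Rs. 4 against Rs. 1 corresponds to odds for of 4 to 1.
p_gst = prob_from_odds_for(4)
print(p_gst, odds_for_from_prob(p_gst))
```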
[Figure: probability as a function of odds for. Vertical axis: Probability (0.5 to 1); horizontal axis: Odds For, to 1 (0 to 20).]
Note on odds against
Let O(a) equal the amount the person will lose if the event happens, then
the table becomes
Event Probability Gain(Loss)
Happens 1
Doesn't Happen (1 ) ()
Probability as a Function of Odds
Against
[Figure: vertical axis: Probability (0 to 0.5); horizontal axis: Odds Against, to 1 (0 to 20).]
Application in determination of
Insurance Premium
Solving we get:
= . 100
Overhead Cost per Rs. 100 of sales = Rs. 75 and Desired Profit
= 10% of revenues. Therefore,
Outcome    Subjective Probability    Company Gain        P x Gain
Live       0.999                     Rs. 192.50          Rs. 192.31
Die        0.001                     -Rs. 99,807.50      -Rs. 99.81
Expected Gain   = Rs. 92.50
Overhead        = Rs. 75.00
Expected Profit = Rs. 17.50
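The company's table can be recomputed as an expected value. The premium of Rs. 192.50 is read off the table; the payout of Rs. 100,000 is inferred from the loss of Rs. 99,807.50 (payout minus premium) and is therefore an assumption.

```python
premium = 192.50       # collected in either outcome (from the table)
payout = 100_000.00    # inferred: 99,807.50 loss + 192.50 premium
overhead = 75.00
p_live, p_die = 0.999, 0.001

gain_if_live = premium
gain_if_die = premium - payout

expected_gain = p_live * gain_if_live + p_die * gain_if_die
expected_profit = expected_gain - overhead
print(round(expected_gain, 2), round(expected_profit, 2))
```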
Person's Perspective
Outcome    Person's Probability    Gain        Break-even Probability    P x Gain
Live       (1 - p)                 -192.5      0.998075                  -192.129
Die        p                       99807.5     0.001925                  192.129
$$(1 - p)(-192.5) + p\,(99807.5) = 0$$
$$p = 0.001925$$
Joint Probability Distribution
Marginal probabilities of X        Marginal probabilities of Y
x        P(X = x)                  y        P(Y = y)
0        0.40                      0        0.49
1        0.40                      1        0.32
2        0.13                      2        0.12
3        0.07                      3        0.07
Total    1.00                      Total    1.00
Independence of Random Variables
$\mu_X = E(X) = 0.87$; $\mu_Y = E(Y) = 0.77$
$\sigma_X^2 = \mathrm{Var}(X) = 0.7931$; $\sigma_Y^2 = \mathrm{Var}(Y) = 0.8371$
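The means and variances above follow directly from the two marginal tables. A minimal sketch of that check (the helper name is illustrative):

```python
values = [0, 1, 2, 3]
p_x = [0.40, 0.40, 0.13, 0.07]  # marginal probabilities of X
p_y = [0.49, 0.32, 0.12, 0.07]  # marginal probabilities of Y

def mean_and_variance(values, probs):
    """Mean and variance of a discrete distribution."""
    mean = sum(v * p for v, p in zip(values, probs))
    second_moment = sum(v * v * p for v, p in zip(values, probs))
    return mean, second_moment - mean**2

mean_x, var_x = mean_and_variance(values, p_x)
mean_y, var_y = mean_and_variance(values, p_y)
print(mean_x, var_x, mean_y, var_y)
```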
Is this surprising?
[Figure: E(Y|x) plotted against x.]
Sum of Two Variables...
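I. $E(X + Y) = E(X) + E(Y)$
II. $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$
$\sigma_{X+Y} = \sqrt{\mathrm{Var}(X + Y)} = 1.162$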
Let
$X$ = total number of successes (denoted by S) out of 4 sales calls
$Y$ = number of successes before the first failure (denoted by F) in
the same 4 sales calls
Assumptions:
I. The success probability for a call is 0.5 (or 1/2).
II. The outcomes of different calls are independent.
SSSS FFFF
SSSF FFFS
SSFS FFSF
SFSS FSFF
FSSS SFFF
SSFF FFSS
SFSF FSFS
SFFS FSSF
Since the success probability is 0.5, each possible outcome has a
probability of $\frac{1}{16} = 0.0625$.
Application: Mutual Fund Sales
Outcome X Outcome X
SSSS 4 FFFF 0
SSSF 3 FFFS 1
SSFS 3 FFSF 1
SFSS 3 FSFF 1
FSSS 3 SFFF 1
SSFF 2 FFSS 2
SFSF 2 FSFS 2
SFFS 2 FSSF 2
x P(x)
0 0.0625
1 0.25
2 0.375
3 0.25
4 0.0625
Recall that X follows a Binomial distribution with parameters $n = 4$
and $p = 0.5$.
Please check that $E(X) = 2$; $\mathrm{Var}(X) = 1$.
[Figure: probability distribution of X. Vertical axis: Probability; horizontal axis: Values of X (0 to 4).]
Outcome X Y Outcome X Y
SSSS 4 4 FFFF 0 0
SSSF 3 3 FFFS 1 0
SSFS 3 2 FFSF 1 0
SFSS 3 1 FSFF 1 0
FSSS 3 0 SFFF 1 1
SSFF 2 2 FFSS 2 0
SFSF 2 1 FSFS 2 0
SFFS 2 1 FSSF 2 0
y P(y)
0 0.5
1 0.25
2 0.125
3 0.0625
4 0.0625
Y actually follows a right-truncated geometric distribution,
truncated at 4, with $p = 0.5$.
Please check that $E(Y) = 0.9375$; $\mathrm{Var}(Y) = 1.433594$.
[Figure: probability distribution of Y. Vertical axis: Probability; horizontal axis: Values of Y (0 to 4).]
Application: Mutual Fund Sales:
Bivariate Distribution of X and Y:
[Figure: joint probabilities of X and Y. Vertical axis: Probability; horizontal axes: X Values and Y Values (0 to 4 each).]
Please check that $E(XY) = 2.6875$.
$\mathrm{Cov}(X, Y) = 0.8125$
y                  0       1         2         3       4
P(Y = y | X = 0)   1       0         0         0       0
P(Y = y | X = 1)   0.75    0.25      0         0       0
P(Y = y | X = 2)   0.5     0.33333   0.16667   0       0
P(Y = y | X = 3)   0.25    0.25      0.25      0.25    0
P(Y = y | X = 4)   0       0         0         0       1
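All the moments quoted for the mutual fund example can be checked by enumerating the 16 equally likely outcomes. The sketch below uses exact fractions; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

outcomes = ["".join(o) for o in product("SF", repeat=4)]  # 16 outcomes
prob = Fraction(1, 16)                                    # each equally likely

def X(o):
    """Total number of successes in the 4 calls."""
    return o.count("S")

def Y(o):
    """Number of successes before the first failure."""
    return o.index("F") if "F" in o else len(o)

def expect(f):
    """Expected value of f over the 16 outcomes."""
    return sum(prob * f(o) for o in outcomes)

e_x, e_y = expect(X), expect(Y)
var_x = expect(lambda o: X(o) ** 2) - e_x**2
var_y = expect(lambda o: Y(o) ** 2) - e_y**2
e_xy = expect(lambda o: X(o) * Y(o))
cov_xy = e_xy - e_x * e_y
print(e_x, var_x, e_y, float(var_y), float(e_xy), float(cov_xy))
```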
Application: Mutual Fund Sales:
Conditional Expectation of Y given X:
For each given value of $X$, we can now compute the expected
value of $Y$.
[Figure: E(Y|X=x) plotted against the given X values (0 to 4).]
Observe that these conditional expected values vary depending on
the given value of $X$.
In the context of our problem, this means that the greater the total
number of successes, the longer the run of successes before first
failure.
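The conditional expectations can be computed by restricting the 16 outcomes to those with a given number of successes. This is a sketch using exact fractions; the function name is illustrative.

```python
from fractions import Fraction
from itertools import product

outcomes = ["".join(o) for o in product("SF", repeat=4)]  # 16 equally likely outcomes

def cond_expectation_y_given_x(x):
    """E(Y | X = x): mean run of successes before the first failure,
    among outcomes with exactly x successes."""
    matching = [o for o in outcomes if o.count("S") == x]
    runs = [o.index("F") if "F" in o else len(o) for o in matching]
    return Fraction(sum(runs), len(matching))

cond_means = {x: cond_expectation_y_given_x(x) for x in range(5)}
for x, m in cond_means.items():
    print(x, m, float(m))
```

The values increase with x, in line with the observation that a larger total number of successes goes with a longer initial run of successes.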