Apratim Guha

apratim@iima.ac.in

Sessions 1-5

2

Basic Concepts-I

An experiment is an occurrence whose result, or outcome,

is uncertain.

The set of all possible outcomes is called the sample space

for the experiment.

3

Example

4

One Die

Two Dice
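A minimal sketch of the two sample spaces named above, assuming ordinary six-sided dice (the variable names are illustrative):

```python
from itertools import product

# Sample space for one die: the six faces.
one_die = list(range(1, 7))

# Sample space for two dice: all ordered pairs (first die, second die).
two_dice = list(product(one_die, repeat=2))

print(len(one_die), len(two_dice))
```

Treating the pairs as ordered keeps every outcome equally likely when both dice are fair.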

5

Example 1

A readership survey conducted among the adult

population showed that 35% read Times, 15% read

Express and 25% read Herald; 10% read both Times and

Express, 8% read both Express and Herald, 5% read both

Times and Herald; 4% read all three publications.

Sample space =?

6

Basic Concepts-II

Given a sample space S, an event E is a subset of S.

E occurs in an experiment if the outcome of that experiment is one of the elements of E.

7

Example 2

When a fair die is rolled, probability of each outcome is 1/6.

But a die does not need to be fair, and outcomes are not

always equally likely!

Suppose instead that a die shows 1, 2, 3, 4, 5 and 6 with probabilities 0.25, 0.15,

0.15, 0.15, 0.15 and 0.15 respectively.

Then P(E) = ?

8

Basic Concepts-III

For any event A:

i) $0 \le P(A) \le 1$.

ii) $P(\emptyset) = 0$, where $\emptyset$ is the null event (no outcomes).

iii) P(S) = 1, where S is the sample space (all outcomes).

iv) If A and B are two events with no common outcome (i.e.

mutually exclusive), then

$P(A \cup B) = P(A) + P(B)$

v) If A and B are two mutually exclusive events such that

$P(A \cup B) = 1$, then A and B are called complements of each

other. We denote B as $\bar{A}$ or $A^C$.

9

Example 2 continued

Consider the fair die being rolled twice

P(sum is even) = ?

P(product is odd) =?

P(sum is even or product is odd) =?

10

General Formula
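A sketch, assuming the general formula meant here is the addition rule $P(A \cup B) = P(A) + P(B) - P(A \cap B)$, which extends rule iv) to events that are not mutually exclusive. The check below enumerates two rolls of a fair die; the helper `prob` is an illustrative name.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling a fair die twice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def sum_even(o):
    return (o[0] + o[1]) % 2 == 0

def product_odd(o):
    return (o[0] * o[1]) % 2 == 1

p_a = prob(sum_even)
p_b = prob(product_odd)
p_both = prob(lambda o: sum_even(o) and product_odd(o))
p_either = prob(lambda o: sum_even(o) or product_odd(o))

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B).
print(p_a, p_b, p_both, p_either)
```

Here the product is odd only when both faces are odd, which already forces an even sum, so the two events overlap completely on that part.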

12

Example 1 (continued)

A readership survey conducted among the adult

population showed that 35% read Times, 15% read

Express and 25% read Herald; 10% read both Times and

Express, 8% read both Express and Herald, 5% read both

Times and Herald; 4% read all three publications.

Sample space =?

Probability of reading exactly 2 newspapers =?
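One way to approach the last question is to subtract the three-paper readers from each pairwise overlap. The sketch below works in percentage points and uses only the survey figures quoted above; the variable names are illustrative.

```python
# Survey figures, in percent of the adult population.
times, express, herald = 35, 15, 25
times_express, express_herald, times_herald = 10, 8, 5
all_three = 4

# Each pairwise overlap also contains the people who read all three,
# so remove them to count readers of exactly two papers.
exactly_two = (
    (times_express - all_three)
    + (express_herald - all_three)
    + (times_herald - all_three)
)

# Inclusion-exclusion for readers of at least one paper.
at_least_one = (
    times + express + herald
    - times_express - express_herald - times_herald
    + all_three
)

print(exactly_two, at_least_one)
```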

13

Example 2 continued

Work out the probabilities

i) P(sum is even given the product is odd)

ii) P(product is odd given sum is even)

How to proceed?

14

Conditional Probability

Let A, B be two events such that P(A) > 0. Then

$P(B \mid A) = \frac{P(A \cap B)}{P(A)}$

is the conditional probability of B given A.

Conditional probability given an event A is computed by

restricting consideration only to A.
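The two questions from Example 2 can be worked out by restricting the count to the conditioning event, as described above. This is a sketch for two rolls of a fair die; `cond_prob` is an illustrative helper, not notation from the slides.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))

def cond_prob(event, given):
    """P(event | given) by counting only outcomes inside `given`."""
    restricted = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in restricted if event(o)), len(restricted))

def sum_even(o):
    return (o[0] + o[1]) % 2 == 0

def product_odd(o):
    return (o[0] * o[1]) % 2 == 1

p_sum_even_given_product_odd = cond_prob(sum_even, product_odd)
p_product_odd_given_sum_even = cond_prob(product_odd, sum_even)

print(p_sum_even_given_product_odd, p_product_odd_given_sum_even)
```

The two conditional probabilities differ, which shows that P(B | A) and P(A | B) are not interchangeable.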

15

16

Let A and B be two events such that 0 < P(A) < 1. Then

$P(B) = P(B \mid A)P(A) + P(B \mid A^C)P(A^C).$

Note:

1. $P(A \cap B) = P(B \mid A)P(A)$.

2. $P(A \cap B \cap C) = P(C \mid A \cap B)P(B \mid A)P(A)$: the chain rule.

17

Let A and B be two events such that 0<P(A)<1 and P(B)>0.

Then

$P(A \mid B) = \frac{P(B \mid A)P(A)}{P(B \mid A)P(A) + P(B \mid A^C)P(A^C)} = \frac{P(B \mid A)P(A)}{P(B)}.$

18

Machines A, B and C manufacture 25%, 35% and 40% of

total production respectively.

Of their output 5%, 4% and 2% are defective respectively.

A bolt is drawn at random from the produce and is found

to be defective.

What is the probability that it was manufactured by

machine A?
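A sketch of the calculation for the bolt example, using the total probability formula over the three machines and then Bayes' rule. The dictionary names are illustrative.

```python
from fractions import Fraction

# Share of total production and defect rate for each machine.
share = {"A": Fraction(25, 100), "B": Fraction(35, 100), "C": Fraction(40, 100)}
defect_rate = {"A": Fraction(5, 100), "B": Fraction(4, 100), "C": Fraction(2, 100)}

# Total probability of drawing a defective bolt.
p_defective = sum(share[m] * defect_rate[m] for m in share)

# Bayes' rule: P(machine A | defective).
p_a_given_defective = share["A"] * defect_rate["A"] / p_defective

print(p_defective, p_a_given_defective)
```

Machine A makes only a quarter of the bolts but has the highest defect rate, so its share among defective bolts is larger than its share of production.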

19

20

Disease Screening

A: individual has the disease

$A^C$: individual does not have the disease

Test:

B: test result indicates individual has disease.

P(A|B) = ?

21

Tree Diagram

22

Interpretation

Illustration:

P(A) = 0.0001

P(B|A) = 1

$P(B \mid A^C) = 0.0001$

P(A|B) =?

What if P(A) = 0.01?

Conclusion: Test is not good enough unless the prevalence P(A) is much

higher than the false-positive rate $P(B \mid A^C)$.
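The illustration can be checked with Bayes' rule. The function below is a sketch with hypothetical parameter names (`prior` for P(A), `sensitivity` for P(B | A), `false_positive_rate` for P(B | not A)); the numbers are the ones from the illustration.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(disease | positive test) by Bayes' rule."""
    true_positives = sensitivity * prior
    false_positives = false_positive_rate * (1 - prior)
    return true_positives / (true_positives + false_positives)

# Prevalence equal to the false-positive rate: a positive result is
# only about as good as a coin toss.
rare = posterior(prior=0.0001, sensitivity=1.0, false_positive_rate=0.0001)

# Prevalence one hundred times the false-positive rate.
common = posterior(prior=0.01, sensitivity=1.0, false_positive_rate=0.0001)

print(round(rare, 4), round(common, 4))
```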

23

Independence

Two events A and B are called independent, if

$P(A \cap B) = P(A)P(B)$

Idea comes from the fact that A and B are independent means

$P(A \mid B) = P(A)$

and/or

$P(B \mid A) = P(B)$

What happens to the chain rule in case of independence?

24

Example

Randomly drawing from a pack of cards, getting a king is

independent of getting a spade.
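A quick check of this example, assuming a standard 52-card pack with one card drawn at random (the rank and suit labels are illustrative):

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = list(product(ranks, suits))

def prob(event):
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

p_king = prob(lambda c: c[0] == "K")
p_spade = prob(lambda c: c[1] == "spades")
p_king_of_spades = prob(lambda c: c[0] == "K" and c[1] == "spades")

# Independence: the joint probability equals the product of the marginals.
print(p_king_of_spades == p_king * p_spade)
```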

Note

Mutually exclusive (non-null) events are never independent.

(Non-null) Sub-events are never independent of the

corresponding super-events, and vice-versa, (unless ?)

25

Independence

When are three events independent?

Example

Two fair dice are rolled.

A = getting a six on the first die

B = getting a six on the second die

C = getting equal outcomes on the two dice

Are they independent?
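A sketch that checks both pairwise independence and the three-way product condition for A, B and C by enumerating the 36 outcomes. It assumes the usual definition that three events are independent only if every pair and the triple satisfy the product rule.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def a(o):
    return o[0] == 6        # six on the first die

def b(o):
    return o[1] == 6        # six on the second die

def c(o):
    return o[0] == o[1]     # equal outcomes

pairwise = all(
    prob(lambda o, x=x, y=y: x(o) and y(o)) == prob(x) * prob(y)
    for x, y in [(a, b), (a, c), (b, c)]
)
p_all = prob(lambda o: a(o) and b(o) and c(o))
mutual = pairwise and p_all == prob(a) * prob(b) * prob(c)

print(pairwise, mutual)
```

Any two of the events together force the third, which is why the triple fails even though each pair passes.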

26

Risk

What would you prefer: Rs. 1000 for sure or an

investment that gives a return of Rs. 2000 with probability

0.5 and a loss of Rs. 1000 with probability 0.5?

If you are risk-prone, you'll go for the investment.

If you are risk-averse, you'll prefer the sure amount.

Here we take a risk-neutral point of view.

DISTRIBUTIONS/

RANDOM VARIABLES

28

Random Variables

Random variable: Response of random experiments taking

different numerical values with certain probabilities.

The probability models describe the random variables.

EXAMPLES

Number of car crashes in Nagpur tomorrow

Amount of rainfall in India next month

Salary offered to a PGPN passout

Length of time a cancer patient survives after detection

29

A discrete random variable takes finitely/countably many values: typically integer

counts of something:

Number of earthquakes in Japan in one year

Number of errors in each page of a text book

Number of heads in 1000 coin tosses

30

Formal Description

Value (x)    Probability (p(x))

1            p(1)

2            p(2)

Total        1

Values can, of course, be any other set of integers.

31

Distribution Function

A random variable X can be characterised by its (Cumulative)

Distribution Function $F(x) = P(X \le x)$.

Number of accidents are below a certain margin

Rainfall is below a certain amount

Salary is above a certain threshold

The patient survives more than some stipulated time limit

The last two concern the Survival Function, $1 - F(x) = P(X > x)$.

32

Some Properties

1. $0 \le F(x) \le 1$.

2. F(x) is non-decreasing.

3. $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to \infty} F(x) = 1$.

33

Distribution function

The distribution function for a discrete random variable

looks like a step, and hence is called a step function.

Example

p(1) = 0.1, p(2) = 0.2 and p(5) = 0.7.

What is the distribution function?
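A sketch of the distribution function for this example. `cdf` is an illustrative helper that adds up the probabilities of all values not exceeding x.

```python
from fractions import Fraction

# Probability mass function from the example.
pmf = {1: Fraction(1, 10), 2: Fraction(2, 10), 5: Fraction(7, 10)}

def cdf(x):
    """F(x) = P(X <= x): total mass at values not exceeding x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))

# F is flat between the values 1, 2 and 5 and jumps at each of them.
for x in [0, 1, 1.5, 2, 4.9, 5, 10]:
    print(x, cdf(x))
```

The size of each jump equals the probability of the value at which it occurs, which is what gives the step shape.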

34

Expectation

Expectation of a random variable is the weighted average

of its values, weighted by the corresponding probabilities:

the expectation of a discrete random variable X taking values $x_1, x_2, \ldots, x_n$ with probabilities

$p(x_1), p(x_2), \ldots, p(x_n)$ is given by the weighted average

$E(X) = \sum_{i=1}^{n} x_i p(x_i).$
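As a sketch, the weighted average can be computed directly from a table of values and probabilities. The distribution used is p(1) = 0.1, p(2) = 0.2, p(5) = 0.7 from the distribution function example; `expectation` is an illustrative name.

```python
from fractions import Fraction

pmf = {1: Fraction(1, 10), 2: Fraction(2, 10), 5: Fraction(7, 10)}

def expectation(pmf):
    """E(X): each value weighted by its probability."""
    return sum(x * p for x, p in pmf.items())

mean = expectation(pmf)
print(mean)
```

The result lies close to 5 because that value carries most of the probability.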

35

Examples

1. p(1) = 0.1, p(2) = 0.2 and p(5) = 0.7. E(X) = ?

time I go to the store with probabilities 0.4 and 0.4,

respectively. The packets cost Rs.10 each. However,

sometimes, (with 0.2 probability,) there is a sale when

the packets are sold at Rs.8 per pack, when I buy 10

packs. What is my expected cost?

36

Properties of a Function

A biased coin is tossed. P(head) = p.

Let X = 1 if head, 0 if tail. E(X) = ? V(X)=?

37

1. If X is a random variable, and a and b are two constants,

then

E(aX) = aE(X), E(b+X) = b+E(X).

Eg. If $1 = Rs. a, then if expected salary of a PGP passout

in US$ is $X, in Rupees it is Rs.aX. Then again, if everyone

is paid a joining bonus $b, expected salary is $(b+X).

2. If X and Y are two random variables, then

E(X+Y) = E(X) + E(Y).

Eg. Expected sum of salary of two friends is the sum of

their expectations.

38

Variance

The variance of a random variable measures its spread.

For a discrete random variable X, variance is given by

the weighted average of the squared deviations from

the expectation

$V(X) = \sum_{i=1}^{n} (x_i - E(X))^2 p(x_i) = \sum_{i=1}^{n} x_i^2 p(x_i) - E(X)^2$
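Both forms of the variance can be checked against each other. The sketch below reuses the distribution p(1) = 0.1, p(2) = 0.2, p(5) = 0.7 from the earlier examples; the variable names are illustrative.

```python
from fractions import Fraction

pmf = {1: Fraction(1, 10), 2: Fraction(2, 10), 5: Fraction(7, 10)}

mean = sum(x * p for x, p in pmf.items())

# Definition: weighted average of squared deviations from the mean.
var_definition = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Shortcut: E(X^2) minus the square of E(X).
var_shortcut = sum(x * x * p for x, p in pmf.items()) - mean ** 2

print(var_definition, var_shortcut)
```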

39

1. If X is a random variable, and a and b are two constants,

then

$V(aX) = a^2 V(X)$, V(b+X) = V(X).

2. If X and Y are two independent random variables, then

V(X+Y) = V(X) + V(Y).

Otherwise there will be a cross-term. Let's not bother about

that here.

40

Two dice

41

Two Dice

Y= sum of outcome based on two throws of a fair die

Any easier way to compute it?
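One possible reading of the question: the mean and variance of Y can be found by brute force from its distribution, or more easily from the properties of expectation and variance, assuming the two throws are independent. The sketch below does both and compares.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p_face = Fraction(1, 6)

# Brute force: distribution of the sum over all 36 ordered pairs.
pmf_sum = {}
for d1, d2 in product(faces, repeat=2):
    pmf_sum[d1 + d2] = pmf_sum.get(d1 + d2, Fraction(0)) + p_face * p_face

mean_sum = sum(y * p for y, p in pmf_sum.items())
var_sum = sum(y * y * p for y, p in pmf_sum.items()) - mean_sum ** 2

# Easier route: E(X1 + X2) = E(X1) + E(X2), and for independent throws
# V(X1 + X2) = V(X1) + V(X2).
mean_one = sum(x * p_face for x in faces)
var_one = sum(x * x * p_face for x in faces) - mean_one ** 2

print(mean_sum, var_sum, 2 * mean_one, 2 * var_one)
```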

42

When I repeat an exercise which results in only two possible

outcomes, how do I obtain the probability distribution of

number of occurrences of what I want?

Examples:

Probability distribution of the number of rainy days in Nagpur

this year.

Probability distribution of number of children out of 100

randomly chosen kids of age 10 who have dropped out of

school.

43

Example

10 shots fired at a target.

P(Success) = 0.2

P(at least 2 hits in 10 shots) =?

P(at least 2 hits in 10 shots| at least 1 hit in 10 shots) =?

44

A Bernoulli trial is a random event with only two possible

outcomes, say success and failure, with fixed

probabilities. We get to define success!

A binomial random variable is the number of successes

(or failures) in a fixed number of independent Bernoulli

trials.

45

Ber(p)

1 trial with probability of success p

Y = number of successes

= 0 if failure, 1 if success

P(Y=1) = p, P(Y=0) = 1-p.

E(Y) = p

Var(Y) = p(1-p)

46

Binomial(n,p)

n independent trials with probability of success p

X = number of successes

$P(X=x) = \binom{n}{x} p^x (1-p)^{n-x}$, x = 0, 1, 2, ..., n WHY???

E(X) = np

Var(X) = np(1-p)

Use pbinom(x,n,p) for cdf.

47

P(Success) = 0.2

a)10 shots fired at a target.

P(at least 2 hits in 10 shots) =?

P(at least 2 hits in 10 shots| at least 1 hit in 10 shots) =?

b) 5 more shots are fired.

P(at least 2 hits in 15 shots)=?

What if the shots were fired from a different angle?
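A sketch of parts a) and b), assuming the shots are independent Bernoulli trials with the same success probability 0.2. `binom_pmf` and `binom_cdf` are illustrative helpers written with the standard library; they are not the `pbinom` function mentioned earlier.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Bin(n, p)."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

p = 0.2

# a) At least 2 hits in 10 shots.
p_ge2_in_10 = 1 - binom_cdf(1, 10, p)

# "At least 2" is contained in "at least 1", so the intersection is
# just "at least 2" and the conditional probability is a ratio.
p_ge1_in_10 = 1 - binom_cdf(0, 10, p)
p_ge2_given_ge1 = p_ge2_in_10 / p_ge1_in_10

# b) At least 2 hits in 15 shots.
p_ge2_in_15 = 1 - binom_cdf(1, 15, p)

print(round(p_ge2_in_10, 4), round(p_ge2_given_ge1, 4), round(p_ge2_in_15, 4))
```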

48

Additive Property

If X~Bin(n, p) and Y~Bin(m, p) are independent, then

X+Y~Bin(n+m, p).

Try 5 more times. Number of hits now has distribution

Bin(15, 0.2).

Note: p has to be the same, and independence is

needed.

What happens otherwise?

49

Example

X~Bin(10,0.2), Y~Bin(5,0.4) are independent.

P(X+Y>0) =?
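Since the two success probabilities differ, X+Y is not binomial, but this particular question only needs the complement. A sketch, using the stated independence of X and Y:

```python
# X ~ Bin(10, 0.2) and Y ~ Bin(5, 0.4), independent.
p_x_zero = (1 - 0.2) ** 10   # P(X = 0): all 10 trials fail
p_y_zero = (1 - 0.4) ** 5    # P(Y = 0): all 5 trials fail

# X + Y = 0 only if both are zero; independence lets us multiply.
p_sum_positive = 1 - p_x_zero * p_y_zero

print(round(p_sum_positive, 5))
```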

50

Example

Assume that a particular machine breaks down once

every month on average. Assume that the no. of breakdowns

follows a Poisson distribution.

What is the probability that there are no breakdowns in 3

months?

What is the probability of exactly one breakdown in 3

months?

51

Poisson Distribution

Used to model rare events

Is an approximation of a Bin(n,p) random variable with

large n and small p such that $np = \lambda$ is moderate

$P(X=x) = e^{-\lambda} \frac{\lambda^x}{x!}$, x = 0, 1, 2, ...

$E(X) = \lambda$, $Var(X) = \lambda$
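A sketch applying the Poisson formula to the machine breakdown example. It assumes that a rate of one breakdown per month translates into a Poisson mean of 3 over three months, which holds if the monthly counts are independent Poisson variables. `poisson_pmf` is an illustrative helper.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam ** x / factorial(x)

rate_per_month = 1
months = 3
lam = rate_per_month * months   # assumed mean over the 3-month period

p_no_breakdown = poisson_pmf(0, lam)
p_one_breakdown = poisson_pmf(1, lam)

print(round(p_no_breakdown, 4), round(p_one_breakdown, 4))
```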

52

Additive Property

If X~Poisson($\lambda_1$) and Y~Poisson($\lambda_2$) are independent, then

X+Y~Poisson($\lambda_1+\lambda_2$).

53

Example

What is the probability that there is exactly one breakdown

in the first two months and another one in the third month?

in the first two months and none during the last two months

during the three month period?

CONTINUOUS

DISTRIBUTIONS

55

A Continuous response or random variable, X, is

described in terms of a probability density function

(p.d.f.) f(x):

$P(a < X < b) = \int_a^b f(x)\,dx$ (area under the p.d.f. from a to b)

We need

1. $f(x) \ge 0$,

2. Total area = 1, i.e. $\int_{-\infty}^{\infty} f(x)\,dx = 1$.

56

For a continuous random variable, C.D.F. $F(x) = P(X \le x)$ is given by

$F(x) = \int_{-\infty}^{x} f(y)\,dy.$

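57

Mean and Variance of a continuous random variable

The mean is given by

$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx.$

The variance is given by

$\sigma^2 = E[(X-\mu)^2] = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2.$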

58

X is uniformly distributed on an interval (a,b) if it has the

density function

$f(x) = \frac{1}{b-a}$ for $a \le x \le b$, and 0 otherwise.

Typically used to model small errors, e.g. rounding off errors.

Notation: X~U(a,b).

Mean: $\frac{a+b}{2}$, Variance: $\frac{(b-a)^2}{12}$.

quantile/100pth percentile.

59

Example

The length of a screw is uniformly distributed between 0.99 and 1.02.

What is the probability that a screw is longer than 1.01?

In a pack of 120 screws, how many are expected to be

longer than 1.01?
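A sketch of the screw calculation, assuming the length is uniform on (0.99, 1.02). For a uniform density the probability of an interval is its length divided by the total length, and the expected count in a pack is the pack size times that probability.

```python
from fractions import Fraction

a, b = Fraction("0.99"), Fraction("1.02")
threshold = Fraction("1.01")

# P(X > 1.01): length of (1.01, 1.02) relative to the length of (0.99, 1.02).
p_longer = (b - threshold) / (b - a)

# Expected number of such screws in a pack of 120.
pack_size = 120
expected_longer = pack_size * p_longer

print(p_longer, expected_longer)
```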

60

Normal Distribution

Found almost everywhere: considered to be the natural

distribution for a number of features for large groups: e.g.

Height, Weight, Grades ...

distributions.

61

Normal Distribution

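The p.d.f. of normal distribution with mean $\mu$ and

variance $\sigma^2$ is

$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$

X~N($\mu$,$\sigma^2$).

If $\mu=0$ and $\sigma^2=1$, then X has the standard normal

distribution.

P.D.F. of standard normal: $\phi(\cdot)$

CDF: $\Phi(\cdot)$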

62

Properties

1. If X~N($\mu$,$\sigma^2$), then $(X-\mu)/\sigma$ ~ N(0,1), so it is enough

to tabulate the standard normal distribution values.

2. Normal p.d.f. is symmetric around its mean $\mu$, and its

shape depends on s.d. $\sigma$. The higher the sd, the flatter the

curve.

63

1. The standard normal p.d.f. is symmetric around 0, i.e. $\phi(-z) = \phi(z)$.

2. This means $\Phi(-z) = 1 - \Phi(z)$.

3. We can re-write this as $\Phi(z) + \Phi(-z) = 1$, so $\Phi(0) = 0.5$.

64

Here F(z) (or $\Phi(z)$) is the area under the standard normal pdf

$\phi(\cdot)$ up to z.

Example

65

What is the 80-th percentile of X?

qnorm(p, $\mu$, $\sigma$): pth quantile/100pth percentile.
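A sketch of a percentile calculation with Python's standard library. The distribution of X for this example is not given here, so the mean 50 and standard deviation 10 below are purely hypothetical placeholders.

```python
from statistics import NormalDist

# Hypothetical parameters, chosen only for illustration.
mu, sigma = 50.0, 10.0

standard = NormalDist()          # N(0, 1)
x_dist = NormalDist(mu, sigma)   # N(mu, sigma^2); second argument is the s.d.

# 80th percentile of the standard normal, then of X.
z80 = standard.inv_cdf(0.80)
x80 = x_dist.inv_cdf(0.80)

# Standardisation: the percentile of X is mu + sigma * z.
print(round(z80, 4), round(x80, 2), round(mu + sigma * z80, 2))

# Symmetry of the standard normal c.d.f.: Phi(-z) = 1 - Phi(z).
print(abs(standard.cdf(-z80) - (1 - standard.cdf(z80))) < 1e-12)
```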

66

Result

For two independent normal random variables X~N(a,v)

and Y~N(b,u),

i. X+Y is normal with mean a+b and variance v+u.

ii. X-Y is normal with mean a-b and variance v+u.

67

Central Limit Theorem

Consider a number of random variables: $X_1, X_2, \ldots, X_n$ for

some large value n

Let all of them be independent and identically distributed

(IID)

68

Example

There are 3 lunch specials in a restaurant: A, B and C.

A student eats these specials with prob. 60%, 20% and 20%

resp.

A costs Rs.100, B costs Rs.140. C costs Rs.150.

a) Obtain the distn. of the student's daily expenditure on

lunch.

b) Obtain the distn. of the student's average expenditure on

lunch over two days.

c) Obtain the distn. of the student's average expenditure

on lunch over 30 days.
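A sketch for parts a) to c), assuming the student's choices on different days are independent. Parts a) and b) can be tabulated exactly; for c) the exact table is unwieldy, so the code only reports the mean and variance of the 30-day average, which are the ingredients a normal approximation would need.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# a) Daily expenditure: cost of A, B, C with their probabilities.
daily = {100: Fraction(6, 10), 140: Fraction(2, 10), 150: Fraction(2, 10)}

# b) Average over two independent days: combine every pair of days.
two_day_avg = defaultdict(Fraction)
for (c1, p1), (c2, p2) in product(daily.items(), repeat=2):
    two_day_avg[Fraction(c1 + c2, 2)] += p1 * p2

# Mean and variance of a single day's spend.
mean = sum(c * p for c, p in daily.items())
var = sum(c * c * p for c, p in daily.items()) - mean ** 2

# c) The 30-day average has the same mean and variance var / 30.
mean_30, var_30 = mean, var / 30

for value in sorted(two_day_avg):
    print(value, two_day_avg[value])
print(mean, var, mean_30, var_30)
```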

69

Consider other students who go to the restaurant

and eat there as well.

Assume there are many of them, but all of them

follow the same daily demand distribution of A, B

and C.

All of them have their own $\bar{X}$, hence we can talk of

a distribution of $\bar{X}$.

Sampling Distribution 70

Distribution of $\bar{X}$

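71

$X_1, \ldots, X_n$: IID sample with mean $\mu$ and variance $\sigma^2$.

For large sample size n, $\sqrt{n}(\bar{X} - \mu)/\sigma$ is approximately N(0,1).

Or, equivalently: $\bar{X}$ is approximately $N(\mu, \sigma^2/n)$.

72

An alternative form of the CLT

Sum of samples: $\sum_{i=1}^{n} X_i$ is approximately $N(n\mu, n\sigma^2)$.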

73

Distribution of Proportion

Indicator variables record whether an event occurs, and are typically valued 0 or 1.

Distributed as Bernoulli(p) = Binomial(1,p), where probability of

event = p.

Proportion $\hat{p}$ = Total no. of occurrences/n

Total no. of occurrences can be thought of as sum of indicator

variables

Distribution of total no. of occurrences: Binomial(n, p).

The proportion $\hat{p}$ is therefore the sample mean of the indicator variables.

74

Then the following Central Limit Theorem may be used:

$\hat{p} \sim N\left(p, \frac{p(1-p)}{n}\right)$ approximately for large n.

Alternative form: $n\hat{p} \sim N(np, np(1-p))$ approximately for large n,

i.e. Bin(n,p) can be approximated by N(np,np(1-p)) for large n.

75

For any value of p, a Bin(n,p) distribution can be approximated

by a N(np, np(1-p)) distribution for large enough n.

Need $np \ge 5$ and $n(1-p) \ge 5$ for a reasonable approximation.

76

When can we use it:

If samples are from IID distributions

If there is moderate or no skew

Don't use it if the distributions are not IID.

Errors may be large for small samples from skewed

distributions

77

Example

Binomial(n,p) with p = 0.5, and various n.

78

Example

Binomial(n,p) with p = 0.1, and various n.

79

Depends!

Larger skew: need larger sample size

No skew: even 15 is a reasonable sample size to use the

normal approximation

Moderate skew: need 30 or more

High skew: need 50 or more

Severe skew (Example: binomial with large n and small p

so that Poisson approximation holds): might need very

high sample size, in the range of several hundred or even

higher

80

Example

Binomial(n,p) with p = 0.01, and various n.

81

Work out the mean $\mu$ and variance $\sigma^2$ of X. Hence work out the approximate

distribution of the sample mean for n = 30 using CLT.

What is the probability that over 30 days,

a) average spend is at least Rs.120?

b) between Rs.110 and Rs.130?

c) What is the probability that the total spend is not more than

Rs.4000?
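A sketch of these three probabilities using the normal approximation from the CLT. It takes the lunch prices and probabilities from the restaurant example (A: Rs.100 with 0.6, B: Rs.140 with 0.2, C: Rs.150 with 0.2) and assumes independent days; the answers are approximations, not exact values.

```python
from math import sqrt
from statistics import NormalDist

# Daily spend distribution from the restaurant example.
costs_probs = [(100, 0.6), (140, 0.2), (150, 0.2)]
mu = sum(c * p for c, p in costs_probs)
var = sum((c - mu) ** 2 * p for c, p in costs_probs)

n = 30

# CLT: the sample mean is approximately N(mu, var / n).
mean_dist = NormalDist(mu, sqrt(var / n))
p_avg_at_least_120 = 1 - mean_dist.cdf(120)
p_avg_between_110_130 = mean_dist.cdf(130) - mean_dist.cdf(110)

# CLT for the total: approximately N(n * mu, n * var).
total_dist = NormalDist(n * mu, sqrt(n * var))
p_total_at_most_4000 = total_dist.cdf(4000)

print(round(p_avg_at_least_120, 3),
      round(p_avg_between_110_130, 3),
      round(p_total_at_most_4000, 5))
```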

82

a) What proportion of days is the student expected to spend

more than Rs.110?

b) What is the (sampling) distribution of the number of times

the student spends over Rs.110 for lunch on a day?

c) What is the (sampling) distribution of the number of times

the student spends over Rs.110 for lunch in 5 days?

d) What is the (sampling) distribution of the number of times

the student spends over Rs.110 for lunch in 50 days?

e) What is the probability that the student spends over

Rs.110 at least 30 times in 50 days?
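A sketch for part e), assuming independent days and the prices from the restaurant example, so that a day's spend exceeds Rs.110 when the student picks B or C. It computes the exact binomial tail and the plain normal approximation N(np, np(1-p)) side by side; no continuity correction is applied, so the two numbers are expected to differ somewhat.

```python
from math import comb, sqrt
from statistics import NormalDist

# P(spend > Rs.110 on a day) = P(B) + P(C).
p = 0.2 + 0.2
n = 50

def binom_pmf(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Exact: number of such days is Bin(50, p).
p_exact = sum(binom_pmf(k, n, p) for k in range(30, n + 1))

# Normal approximation to Bin(n, p): N(np, np(1-p)).
approx_dist = NormalDist(n * p, sqrt(n * p * (1 - p)))
p_approx = 1 - approx_dist.cdf(30)

# Rule of thumb from the slides: np >= 5 and n(1-p) >= 5.
rule_ok = n * p >= 5 and n * (1 - p) >= 5

print(p_exact, p_approx, rule_ok)
```

Thirty days out of fifty is far above the expected twenty, so both probabilities are small; tail values this far out are where the approximation is least accurate in relative terms.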

83

Quick questions

I collect a sample, without replacement, of 100 children

of age 6 from Nagpur.

a) Does the CLT say that their height distribution is

approximately normal? No.

b) Does the CLT say that the distribution of their average

height is approximately normal? Yes*

c) I record the heights of 15 students of age 6 and 15

students of age 10. Is the distribution of the average

height approximately normal? No. What if I recorded

50+50? Yes*

I record the heights of 20 students sampled, without

replacement, from this class. Does the CLT say that the

average height is approximately normal? No.
