Sampling Distribution

SJ10303/SM10303 Economics Statistics


Dr. Siti Rahayu Binti Mohd Hashim

Background
Functions of random variables

Moments

Sampling distribution of means

Sampling distribution of the difference between two means

Sampling distribution of variances

Functions of random variables
Goal: find the probability distribution of a function of one or more random variables.
Typical functions: averages, sums, linear combinations, sums of squares, etc.
One-to-one transformation:
Given Y = u(X), what is the probability distribution of Y?
Each x maps to y = u(x), and each y maps back to x = w(y).
Y assumes the value y when X assumes the value w(y).

The probability distribution of Y:
g(y) = P(Y = y) = P[X = w(y)] = f[w(y)]

Theorem 1
Suppose that

X is a discrete random variable with probability distribution f(x).

Let Y = u(X) define a one-to-one transformation between the values of X and Y,

so that y = u(x) can be uniquely solved for x in terms of y, say x = w(y).

Then the probability distribution of Y is

g(y) = f[w(y)]

Example 1
Let X be a random variable with probability distribution

$$f(x) = \begin{cases} \dfrac{1}{3}, & x = 1, 2, 3 \\ 0, & \text{elsewhere} \end{cases}$$

Find the probability distribution of the random variable Y = 2X - 1.

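Solution:
Solving y = 2x - 1 for x gives x = w(y) = (y + 1)/2, so Y takes the values y = 1, 3, 5.
g(y) = f[(y + 1)/2] = 1/3 for y = 1, 3, 5, and g(y) = 0 elsewhere.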
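
The discrete one-to-one rule g(y) = f[w(y)] can be checked mechanically. The following is a minimal sketch, assuming the distribution of Example 1; the helper name `transform_pmf` is hypothetical and not part of the course material.

```python
from fractions import Fraction

def transform_pmf(pmf, u):
    """Return the pmf of Y = u(X) for a map u that is one-to-one on the support of X."""
    g = {}
    for x, p in pmf.items():
        y = u(x)
        if y in g:
            # Two x values landed on the same y, so Theorem 1 does not apply.
            raise ValueError("u is not one-to-one on the support of X")
        g[y] = p  # g(y) = f[w(y)]: the probability simply moves from x to y
    return g

# f(x) = 1/3 for x = 1, 2, 3, as in Example 1
f = {x: Fraction(1, 3) for x in (1, 2, 3)}
g = transform_pmf(f, lambda x: 2 * x - 1)

for y, p in sorted(g.items()):
    print(f"g({y}) = {p}")
```

Exact fractions are used so that the result can be compared with the hand calculation without rounding.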

Theorem 2
Suppose that,

X1 and X2 are discrete random variables with joint probability distribution f(x1,x2).

Let Y1 = u1(X1,X2) and Y2 = u2(X1,X2) define a one-to-one transformation between the points (x1,x2) and (y1,y2),

so that y1 = u1(x1,x2) and y2 = u2(x1,x2) may be uniquely solved for x1 and x2 in terms of y1 and y2, say x1 = w1(y1,y2) and x2 = w2(y1,y2).

Then the joint probability distribution of Y1 and Y2 is

g(y1,y2) = f[w1(y1,y2), w2(y1,y2)]
Example 2
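Let X1 and X2 be discrete random variables with the multinomial distribution

$$f(x_1, x_2) = \binom{2}{x_1,\ x_2,\ 2 - x_1 - x_2} \left(\frac{1}{4}\right)^{x_1} \left(\frac{1}{3}\right)^{x_2} \left(\frac{5}{12}\right)^{2 - x_1 - x_2}$$

for $x_1 = 0, 1, 2$; $x_2 = 0, 1, 2$; $x_1 + x_2 \le 2$; and zero elsewhere. Find the joint probability distribution of $Y_1 = X_1 + X_2$ and $Y_2 = X_1 - X_2$.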

Solution:

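Solving $y_1 = x_1 + x_2$ and $y_2 = x_1 - x_2$ gives $x_1 = (y_1 + y_2)/2$ and $x_2 = (y_1 - y_2)/2$, so

$y_1 = 0, 1, 2$; $y_2 = -2, -1, 0, 1, 2$; $y_2 \le y_1$; $y_1 + y_2 = 0, 2, 4$

$$g(y_1, y_2) = \binom{2}{\frac{y_1 + y_2}{2},\ \frac{y_1 - y_2}{2},\ 2 - y_1} \left(\frac{1}{4}\right)^{(y_1 + y_2)/2} \left(\frac{1}{3}\right)^{(y_1 - y_2)/2} \left(\frac{5}{12}\right)^{2 - y_1}$$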
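
As a check on Example 2, the joint distribution of (Y1, Y2) can be tabulated by pushing every support point (x1, x2) through the transformation. This is a sketch only: it assumes the trinomial probabilities 1/4, 1/3 and 5/12 with two trials from the example, and the function name `f` simply mirrors the slide notation.

```python
from fractions import Fraction
from math import factorial

def f(x1, x2):
    """Joint pmf of (X1, X2): trinomial with 2 trials and probabilities 1/4, 1/3, 5/12."""
    if x1 < 0 or x2 < 0 or x1 + x2 > 2:
        return Fraction(0)
    x3 = 2 - x1 - x2
    coef = factorial(2) // (factorial(x1) * factorial(x2) * factorial(x3))
    return coef * Fraction(1, 4) ** x1 * Fraction(1, 3) ** x2 * Fraction(5, 12) ** x3

# Map each support point through y1 = x1 + x2, y2 = x1 - x2 (a one-to-one map).
g = {}
for x1 in range(3):
    for x2 in range(3 - x1):
        g[(x1 + x2, x1 - x2)] = f(x1, x2)

for (y1, y2), p in sorted(g.items()):
    print(f"g({y1}, {y2}) = {p}")
print("total probability:", sum(g.values()))
```

Every key (y1, y2) produced this way satisfies y1 + y2 = 0, 2 or 4, which matches the support stated in the solution.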
Theorem 3
Suppose that,

X is a continuous random variable with pdf f(x).

Let Y = u(X) define a one-to-one correspondence between the values of X and Y,

so that the equation y = u(x) can be uniquely solved for x in terms of y, say x = w(y).

Then the probability distribution of Y is

g(y) = f[w(y)] |J|

where J = w'(y) is the Jacobian of the transformation.
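
To see the continuous rule g(y) = f[w(y)] |J| at work, here is a small numerical sketch. The pdf f(x) = 2x on (0, 1) and the map y = x² are assumptions chosen for illustration, not taken from the slides; the midpoint-rule helper `integrate` is likewise hypothetical.

```python
import math

def f(x):
    """Assumed pdf of X: f(x) = 2x on (0, 1), zero elsewhere."""
    return 2.0 * x if 0.0 < x < 1.0 else 0.0

def w(y):
    """Inverse of y = u(x) = x**2 on (0, 1)."""
    return math.sqrt(y)

def jacobian(y):
    """J = w'(y) = 1 / (2 * sqrt(y))."""
    return 1.0 / (2.0 * math.sqrt(y))

def g(y):
    """pdf of Y = X**2: g(y) = f[w(y)] |J| on (0, 1), zero elsewhere."""
    if not 0.0 < y < 1.0:
        return 0.0
    return f(w(y)) * abs(jacobian(y))

def integrate(func, a, b, steps=10_000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

total = integrate(g, 0.0, 1.0)      # should be close to 1 for a valid pdf
cdf_y = integrate(g, 0.0, 0.3)      # P(Y <= 0.3)
cdf_x = integrate(f, 0.0, w(0.3))   # P(X <= sqrt(0.3)), the same event
print(total, cdf_y, cdf_x)
```

Under these assumptions g(y) works out to 1 on (0, 1), so Y is uniform; the two probabilities printed last agree because {Y ≤ 0.3} and {X ≤ √0.3} are the same event.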


Theorem 4
Suppose that,

X1 and X2 are continuous random variables with joint probability distribution f(x1,x2).

Let Y1 = u1(X1,X2) and Y2 = u2(X1,X2) define a one-to-one transformation between the points (x1,x2) and (y1,y2),

so that the equations y1 = u1(x1,x2) and y2 = u2(x1,x2) may be uniquely solved for x1 and x2 in terms of y1 and y2, say x1 = w1(y1,y2) and x2 = w2(y1,y2).

Then the joint probability distribution of Y1 and Y2 is

g(y1,y2) = f[w1(y1,y2), w2(y1,y2)] |J|

where the Jacobian is the 2×2 determinant

$$J = \begin{vmatrix} \partial x_1/\partial y_1 & \partial x_1/\partial y_2 \\ \partial x_2/\partial y_1 & \partial x_2/\partial y_2 \end{vmatrix}$$


Theorem 5
Suppose that,

X is a continuous random variable with pdf f(x).

Let Y = u(X) define a transformation between the values of X and Y that is not one-to-one.

If the interval over which X is defined can be partitioned into k mutually disjoint sets such that each of the inverse functions

x1 = w1(y), x2 = w2(y), ..., xk = wk(y)

of y = u(x) defines a one-to-one correspondence,

then the pdf of Y is

$$g(y) = \sum_{i=1}^{k} f[w_i(y)]\,|J_i|$$

where $J_i = w_i'(y)$, i = 1, 2, ..., k.
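
A standard illustration of a transformation that is not one-to-one is Y = X² with X standard normal: the two inverse branches are x = -√y and x = +√y, and summing f[w_i(y)] |J_i| over both gives the chi-squared density with 1 degree of freedom. The sketch below assumes that setup (it is not an example from the slides) and compares the two densities numerically.

```python
import math

def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def g(y):
    """pdf of Y = X**2, summing f[w_i(y)] |J_i| over the two inverse branches."""
    if y <= 0.0:
        return 0.0
    root = math.sqrt(y)
    branches = [
        (-root, -1.0 / (2.0 * root)),  # x1 = w1(y) = -sqrt(y), J1 = w1'(y)
        (root, 1.0 / (2.0 * root)),    # x2 = w2(y) = +sqrt(y), J2 = w2'(y)
    ]
    return sum(phi(x) * abs(j) for x, j in branches)

def chi2_pdf_1df(y):
    """Chi-squared density with 1 degree of freedom, for comparison."""
    return y ** -0.5 * math.exp(-0.5 * y) / (math.sqrt(2.0) * math.gamma(0.5))

for y in (0.1, 1.0, 2.5):
    print(y, g(y), chi2_pdf_1df(y))
```

The printed pairs should agree to floating-point precision, since Γ(1/2) = √π.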


Moments
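First moment about the origin: $\mu'_1 = E(X)$

Second moment about the origin: $\mu'_2 = E(X^2)$

Definition 1:

The r-th moment about the origin of the random variable X is given by

$$\mu'_r = E(X^r) = \begin{cases} \sum_x x^r f(x), & \text{if } X \text{ is discrete} \\ \int_{-\infty}^{\infty} x^r f(x)\,dx, & \text{if } X \text{ is continuous} \end{cases}$$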
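
For a discrete random variable, the r-th moment about the origin is the sum of x^r f(x) over the support. A minimal sketch, reusing the distribution f(x) = 1/3 for x = 1, 2, 3 from Example 1; the helper name `raw_moment` is hypothetical.

```python
from fractions import Fraction

def raw_moment(pmf, r):
    """r-th moment about the origin, E(X**r), of a discrete pmf given as {x: f(x)}."""
    return sum(Fraction(x) ** r * p for x, p in pmf.items())

f = {x: Fraction(1, 3) for x in (1, 2, 3)}

m1 = raw_moment(f, 1)      # first moment, E(X)
m2 = raw_moment(f, 2)      # second moment, E(X^2)
variance = m2 - m1 ** 2    # variance expressed through the first two moments

print(m1, m2, variance)
```

The last line uses the identity σ² = E(X²) - [E(X)]², which is one common reason for computing the first two moments.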

Sampling Distributions
Statistical inference is concerned with generalizations and predictions about a population.

Definition 1:

The probability distribution of a statistic is called a sampling distribution.

Sampling distributions depend on:

a) The size of the population

b) The size of the samples

c) The method of choosing the samples


Sampling Distributions of Means
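Suppose a random sample of n observations is taken from a normal population with mean $\mu$ and variance $\sigma^2$.

Each observation $X_i$, i = 1, 2, ..., n, of the random sample then has the same normal distribution as the population being sampled, $X_i \sim N(\mu, \sigma^2)$, and consequently the sample mean satisfies

$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$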

Central Limit Theorem
If $\bar{X}$ is the mean of a random sample of size n taken from a population with mean $\mu$ and finite variance $\sigma^2$, then the limiting form of the distribution of

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

as $n \to \infty$ is the standard normal distribution.

n ≥ 30: the normal approximation is generally good, largely regardless of the shape of the population
n < 30: the normal approximation is good only if the population is not too different from a normal distribution
If the population is known to be normal, then the sampling distribution of $\bar{X}$ is exactly normal, whatever the sample size.
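
The theorem can be illustrated by simulation. The sketch below assumes a skewed Exponential(1) population (mean 1, standard deviation 1), a sample size of n = 30 and a fixed seed; none of these choices come from the slides, and the function name `standardized_means` is hypothetical.

```python
import random
from statistics import NormalDist

def standardized_means(n, reps, seed=0):
    """Simulate Z = (xbar - mu) / (sigma / sqrt(n)) for samples from Exponential(1)."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0  # mean and standard deviation of Exponential(1)
    zs = []
    for _ in range(reps):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        zs.append((xbar - mu) / (sigma / n ** 0.5))
    return zs

zs = standardized_means(n=30, reps=20_000)

# Fraction of simulated Z values inside (-1.96, 1.96) versus the N(0, 1) value.
inside = sum(abs(z) <= 1.96 for z in zs) / len(zs)
target = NormalDist().cdf(1.96) - NormalDist().cdf(-1.96)
print(f"simulated: {inside:.3f}, standard normal: {target:.3f}")
```

Even though the population is far from normal, the simulated proportion should land close to the standard normal value of about 0.95; rerunning with a smaller n shows the approximation degrading.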
Example
An important manufacturing process produces cylindrical component parts for the
automotive industry. It is important that the process produce parts having a mean
diameter of 5.0 mm. The engineer involved conjectures that the population mean is
5.0 mm. An experiment is conducted in which 100 parts produced by the process are
selected randomly and the diameter measured on each. It is known that the population
standard deviation is $\sigma = 0.1$ mm. The experiment indicates a sample average
diameter of $\bar{x} = 5.027$ mm. Does this sample information appear to support or
refute the engineer's conjecture?

Solution:

The main issue here is: how likely is it that one can obtain $\bar{x} \ge 5.027$ mm
with n = 100 if the population mean is $\mu = 5.0$ mm?

Example: cont.
The conjecture is not refuted if the probability suggests that $\bar{x} = 5.027$ mm is not
unreasonable.

If the probability is quite low, the data do not support the conjecture that $\mu = 5.0$ mm.

If the mean is $\mu = 5.0$ mm, what is the chance that $\bar{X}$ will deviate from it by as much as
0.027 mm?

Example: cont.
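Standardize according to the Central Limit Theorem.

If the conjecture $\mu = 5.0$ is true, $\dfrac{\bar{X} - 5.0}{0.1/\sqrt{100}}$ is approximately N(0,1).

Thus,

$$P\left(|\bar{X} - 5.0| \ge 0.027\right) = 2P\left(\frac{\bar{X} - 5.0}{0.1/\sqrt{100}} \ge 2.7\right) = 2P(Z \ge 2.7) = 2(0.0035) = 0.007$$

Thus one would experience by chance an $\bar{x}$ that is 0.027 mm from the mean in
only 7 in 1000 experiments.

As a result, this experiment with $\bar{x} = 5.027$ certainly does not give supporting
evidence to the conjecture that $\mu = 5.0$ mm.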
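
The arithmetic of this example can be reproduced with the standard library. A minimal sketch using the numbers stated in the example ($\mu = 5.0$, $\sigma = 0.1$, n = 100, $\bar{x} = 5.027$); the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 5.0      # conjectured population mean (mm)
sigma = 0.1    # known population standard deviation (mm)
n = 100        # sample size
xbar = 5.027   # observed sample mean (mm)

# Standardize the sample mean: Z = (xbar - mu0) / (sigma / sqrt(n))
z = (xbar - mu0) / (sigma / sqrt(n))

# Two-sided probability of a deviation at least this large under the conjecture
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, P(|Xbar - mu| >= 0.027) = {p_two_sided:.4f}")
```

The probability comes out at roughly 0.007, in line with the "7 in 1000 experiments" quoted in the example; the small difference from 2(0.0035) is table rounding.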

Sampling Distribution of the Difference
Between Two Averages
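If independent samples of size $n_1$ and $n_2$ are drawn at random from two
populations, discrete or continuous, with means $\mu_1$ and $\mu_2$ and variances
$\sigma_1^2$ and $\sigma_2^2$, respectively, then the sampling distribution of the
difference of means, $\bar{X}_1 - \bar{X}_2$, is approximately normally distributed
with mean and variance given by

$$\mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2 \quad\text{and}\quad \sigma^2_{\bar{X}_1 - \bar{X}_2} = \sigma^2_{\bar{X}_1} + \sigma^2_{\bar{X}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$

Hence,

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$

is approximately a standard normal variable.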
Example
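Two independent experiments are run in which two different types of paint are
compared. Eighteen specimens are painted using type A, and the drying time, in
hours, is recorded for each. The same is done with type B. The population
standard deviations are both known to be 1.0. Assuming that the mean drying time
is equal for the two types of paint, find $P(\bar{X}_A - \bar{X}_B > 1.0)$,
where $\bar{X}_A$ and $\bar{X}_B$ are average drying times for samples of size $n_A = n_B = 18$.

Solution:

From the sampling distribution, we know that $\bar{X}_A - \bar{X}_B$ is approximately normal with mean

$$\mu_{\bar{X}_A - \bar{X}_B} = \mu_A - \mu_B = 0$$

and variance

$$\sigma^2_{\bar{X}_A - \bar{X}_B} = \frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B} = \frac{1}{18} + \frac{1}{18} = \frac{1}{9}$$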

Example: cont.
Corresponding to the value $\bar{X}_A - \bar{X}_B = 1.0$, we have

$$z = \frac{1.0 - (\mu_A - \mu_B)}{\sqrt{1/9}} = \frac{1.0 - 0}{1/3} = 3.0$$

So we have

$$P(Z > 3.0) = 1 - P(Z < 3.0) = 1 - 0.9987 = 0.0013$$

What might we learn from this result?

Initial presumption: $\mu_A = \mu_B$. Suppose the experiment is actually
conducted for the purpose of drawing an inference regarding the equality of
the two parameters.
If the two averages differ by as much as 1 hour (or more), this clearly would
lead one to conclude that the population mean drying time is not equal for
the two types of paint.

Example: cont.
Suppose that the difference in the two sample averages is as small as, say, 15
minutes.

Under the same presumption, $\mu_A = \mu_B$,

$$P(\bar{X}_A - \bar{X}_B > 0.25 \text{ hour}) = P\left(Z > \frac{0.25 - 0}{1/3}\right) = P(Z > 0.75) = 1 - 0.7734 = 0.2266$$

Since this probability is not excessively low, one would conclude that a
difference in sample means of 15 minutes can happen by chance.

As a result, that type of difference in average drying time certainly is not a
clear signal that $\mu_A \ne \mu_B$.
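
Both tail probabilities in the paint example follow from the same standardization. A minimal sketch using the example's values (both standard deviations 1.0, samples of 18, equal means presumed); the helper name `upper_tail` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

sigma_a = sigma_b = 1.0   # known population standard deviations (hours)
n_a = n_b = 18            # sample sizes

# Standard deviation of Xbar_A - Xbar_B: sqrt(1/18 + 1/18) = 1/3
se = sqrt(sigma_a ** 2 / n_a + sigma_b ** 2 / n_b)

def upper_tail(diff, mean_diff=0.0):
    """P(Xbar_A - Xbar_B > diff) when mu_A - mu_B = mean_diff."""
    z = (diff - mean_diff) / se
    return 1 - NormalDist().cdf(z)

p_one_hour = upper_tail(1.0)       # difference of 1 hour, z = 3.0
p_quarter_hour = upper_tail(0.25)  # difference of 15 minutes, z = 0.75

print(f"P(diff > 1.00 h) = {p_one_hour:.4f}")
print(f"P(diff > 0.25 h) = {p_quarter_hour:.4f}")
```

The first probability is on the order of one in a thousand, while the second is above one in five, which is the contrast the example draws between a 1-hour and a 15-minute difference.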

Sampling Distribution of $S^2$
If $S^2$ is the variance of a random sample of size n taken from a normal
population having the variance $\sigma^2$, then the statistic

$$\chi^2 = \frac{(n-1)S^2}{\sigma^2} = \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\sigma^2}$$

has a chi-squared distribution with v = n - 1 degrees of freedom.

Example:
A manufacturer of car batteries guarantees that his batteries will last, on the
average, 3 years with a standard deviation of 1 year. If five of these batteries
have lifetimes of 1.9, 2.4, 3.0, 3.5, and 4.2 years, is the manufacturer still
convinced that his batteries have a standard deviation of 1 year? Assume that
the battery lifetime follows a normal distribution.

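Solution: We first find the sample variance:

$$s^2 = \frac{(5)(48.26) - (15)^2}{(5)(4)} = 0.815$$

Then,

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(4)(0.815)}{1} = 3.26$$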

Example: cont.
$\chi^2 = 3.26$ is a value from a chi-squared distribution with 4 degrees of freedom.

Since 95% of the chi-squared values with 4 degrees of freedom fall between
0.484 and 11.143, the computed value with $\sigma^2 = 1$ is reasonable.

Therefore, the manufacturer has no reason to suspect that the standard
deviation is other than 1 year.
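
The battery calculation can be reproduced directly from the five lifetimes. A minimal sketch; the critical values 0.484 and 11.143 are copied from the example rather than computed, and the variable names are illustrative.

```python
from statistics import variance

lifetimes = [1.9, 2.4, 3.0, 3.5, 4.2]  # battery lifetimes in years
sigma2_claimed = 1.0                   # claimed population variance (sd of 1 year)

n = len(lifetimes)
s2 = variance(lifetimes)               # sample variance with divisor n - 1
chi2 = (n - 1) * s2 / sigma2_claimed   # chi-squared statistic with n - 1 = 4 df

# Central 95% range for 4 degrees of freedom, as quoted in the example
lower, upper = 0.484, 11.143
within = lower <= chi2 <= upper

print(f"s^2 = {s2:.3f}, chi^2 = {chi2:.2f}, inside 95% range: {within}")
```

`statistics.variance` uses the n - 1 divisor, so it matches the sample variance $S^2$ used in the chi-squared statistic.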

Chi Squared Probability Distribution
In probability theory and statistics, the chi-squared distribution (also chi-square
or $\chi^2$-distribution) with k degrees of freedom is the distribution of a
sum of the squares of k independent standard normal random variables.

The continuous random variable X has a chi-squared distribution, with v
degrees of freedom, if its density function is given by

$$f(x) = \begin{cases} \dfrac{1}{2^{v/2}\,\Gamma(v/2)}\, x^{v/2 - 1} e^{-x/2}, & x > 0 \\ 0, & \text{elsewhere} \end{cases}$$

where v is a positive integer.

The mean and the variance of the chi-squared distribution are $\mu = v$ and $\sigma^2 = 2v$.
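
The stated mean and variance can be checked numerically from the standard chi-squared density $x^{v/2-1} e^{-x/2} / (2^{v/2}\,\Gamma(v/2))$ for x > 0. The sketch below assumes v = 4 and truncates the integrals at 200, where the remaining tail is negligible; the helper names `chi2_pdf` and `integrate` are hypothetical.

```python
import math

def chi2_pdf(x, v):
    """Chi-squared density with v degrees of freedom (zero for x <= 0)."""
    if x <= 0.0:
        return 0.0
    return x ** (v / 2 - 1) * math.exp(-x / 2) / (2 ** (v / 2) * math.gamma(v / 2))

def integrate(func, a, b, steps=100_000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

v = 4
upper = 200.0  # truncation point; an assumption that is ample for v = 4

total = integrate(lambda x: chi2_pdf(x, v), 0.0, upper)
mean = integrate(lambda x: x * chi2_pdf(x, v), 0.0, upper)
var = integrate(lambda x: (x - mean) ** 2 * chi2_pdf(x, v), 0.0, upper)

# Probability between the two critical values quoted for 4 degrees of freedom
central = integrate(lambda x: chi2_pdf(x, v), 0.484, 11.143)

print(f"total = {total:.4f}, mean = {mean:.4f}, variance = {var:.4f}")
print(f"P(0.484 < X < 11.143) = {central:.4f}")
```

For v = 4 the mean and variance should come out close to 4 and 8, and the central probability close to 0.95, consistent with the range used in the battery example.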
