
Unit Six

Sampling and Sampling Distribution


• Population: should be defined by the investigator on the basis of the
objective of the study.
• Because of various constraints, it is usually difficult to consider
each and every observation in a population for a given study; we
would rather consider a part of the population. This method of study
is called a sample survey.

Sampling
Reasons for Sampling
 Some specific situations under which we should run
sampling include when:
1. Time, money and other resources are limited.
2. A census survey is not possible.
3. The scope of study is very wide and the population is
not known.
4. The population is too large (e.g. trees in a jungle) or
hypothetical (like tossing a coin).
5. Testing is destructive.
Key Terms

Sampling: the method by which we select a sample from the
population. By studying this part (the sample), we try to generalize
its findings to the population.

Source/target/reference population: the population of interest from
which we select a sample and about which we make conclusions
(generalizations) by studying the sample.

Sample/study population: the population included in the sample
selection.

Key Terms...
• Sampling unit: the unit of selection in the sampling process.
• Study unit: the unit on which the measurement is done (the data are
collected).
• Sampling fraction: the ratio of the number of units in the sample to
the total size of the population in the sampling frame (i.e. n/N).
• Sampling interval: the ratio of the total size of the population to
the size of the sample (i.e. N/n).
• Sampling frame: the list of all the sampling units in the source
population, from which a random sample is to be drawn.
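The two ratios defined above can be illustrated with a short sketch. The population and sample sizes here (N = 1000, n = 50) are made-up values chosen only for illustration, and the function names are hypothetical.

```python
def sampling_fraction(n, N):
    """Ratio of sample size to population size (n/N)."""
    return n / N

def sampling_interval(n, N):
    """Ratio of population size to sample size (N/n)."""
    return N / n

# Hypothetical sizes, not taken from the slides.
N, n = 1000, 50
print(sampling_fraction(n, N))   # 0.05
print(sampling_interval(n, N))   # 20.0
```

With these assumed sizes, the sample covers 5% of the frame, i.e. one unit out of every 20.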

Errors in a Survey
1. Sampling error: random error that arises because a sample, rather
than the whole population, is selected.
• It cannot be avoided or totally eliminated.
• It occurs by chance.
• Increasing the sample size can minimize random error: as n → N,
the sampling error → 0.
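The claim that sampling error shrinks as n approaches N can be checked with a small simulation. This is only a sketch: the population of 200 values is randomly generated (an assumption, not data from the slides), samples are drawn without replacement, and `mean_abs_sampling_error` is a hypothetical helper name.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical finite population of N = 200 measurements (made-up values).
population = [random.gauss(70, 5) for _ in range(200)]
mu = statistics.fmean(population)

def mean_abs_sampling_error(n, repeats=500):
    """Average |sample mean - population mean| over repeated samples
    of size n drawn without replacement."""
    errors = []
    for _ in range(repeats):
        sample = random.sample(population, n)
        errors.append(abs(statistics.fmean(sample) - mu))
    return statistics.fmean(errors)

# The average error falls as n grows and vanishes when n = N.
for n in (10, 50, 100, 200):
    print(n, round(mean_abs_sampling_error(n), 4))
```

When n = N the sample is the whole population, so the sample mean equals the population mean and the sampling error is zero.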

Errors in a Survey cont’d…
2. Non-sampling error (bias)
• It could be introduced through:
– Measurement or counting (i.e. observational error).
– Respondent or non-respondent error.
– Lack of precise definitions.
– Errors in editing and tabulation of data.
– Selection bias (e.g. accessibility bias, volunteer bias, etc.).
• It is a systematic error that cannot be avoided or minimized by
increasing the sample size.

Sampling methods
 There are two basic types of sampling: Probability and
non-probability sampling
A) Probability sampling: a sampling method that ensures that every
member of the population has a known, non-zero probability of being
included in the sample.
B) Non-probability sampling: the probability of selecting a subject
is unknown. We cannot calculate the sampling error, and we cannot
generalize (statistically infer) findings of a sample to the
population.
Sampling Distributions

• The distribution of all possible values of a given statistic,
computed from all possible samples of the same size randomly drawn
from the same population, is called the sampling distribution of that
statistic.

• When sampling is from a population of size N, the possible number of
samples of equal size, n, that can be drawn is given by:
Number of possible samples = NCn = N! / (n! (N − n)!)
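As a quick check of the counting formula, the sketch below compares `math.comb` with an explicit enumeration from `itertools.combinations`. The four unit labels are placeholders and `count_samples` is a hypothetical helper; sampling is assumed to be without replacement and order is ignored.

```python
from itertools import combinations
from math import comb

def count_samples(N, n):
    """Number of distinct samples of size n from a population of size N
    (without replacement, order ignored): N! / (n! (N - n)!)."""
    return comb(N, n)

units = ["A", "B", "C", "D"]           # placeholder population, N = 4
all_samples = list(combinations(units, 2))

print(count_samples(4, 2))             # 6
print(len(all_samples))                # 6, same count by enumeration
print(count_samples(10, 3))            # 120
```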

Sampling Distributions cont’d…

• The three important characteristics of a sampling distribution are:
– Its mean (μx̄)
– Its variance (σ²x̄), and
– Its functional form, which describes how it looks when graphed
(i.e. normally distributed or not).
Sampling distribution of the Sample mean

• The sampling distribution of sample means is one of the fundamental
concepts in statistical inference. Since it is a frequency
distribution, it has its own mean, μx̄, and standard deviation, σx̄.
• This standard deviation of the sampling distribution of means is
called the standard error of the mean and is given by
σx̄ = σ/√n
Sampling distn of the Sample mean cont’d…

• Example: population of heart rate data (N = 4):
67 68 69 72
μ = 69, σ² = 3.5, σ = 1.87

• If we take a random sample of size two (n = 2) from this population
(N = 4), we will have NCn = 4C2 = 6 possible samples.
Sampling distn of the Sample mean cont’d…
• The six possible random samples of size 2:

Sample number   Sample data   x̄      x̄²
1               67, 68        67.5   4556.25
2               67, 69        68     4624
3               67, 72        69.5   4830.25
4               68, 69        68.5   4692.25
5               68, 72        70     4900
6               69, 72        70.5   4970.25
Total                         Σx̄ = 414   Σx̄² = 28573


Sampling distn of the Sample mean cont’d…

• The mean of means, μx̄ = Σx̄/6 = 414/6 = 69 = μ
• The SD of means, σx̄ = √(Σx̄²/6 − μx̄²) = √(28573/6 − 69²) = 1.08

• The SE is multiplied by the finite population correction (fpc),
√((N − n)/(N − 1)), when sampling is from a finite population of
known size N and n/N > 5%.

• The value 1.08, which measures the variation among the six means, is
an empirical estimate of the standard error of the mean for a sample
size of two.

• The SD of the means (SEM) is used to estimate the ballpark, or
interval, for the true population mean.
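The heart-rate example can be reproduced by enumerating all six samples. The sketch below uses only the four data values given in the slides; it computes the SD of the six sample means directly and compares it with the formula σ/√n multiplied by the fpc. The variable names are illustrative.

```python
from itertools import combinations
from math import sqrt
from statistics import fmean, pstdev

population = [67, 68, 69, 72]          # heart-rate data from the example
N, n = len(population), 2

# All possible samples of size 2 and their means.
samples = list(combinations(population, n))
sample_means = [fmean(s) for s in samples]

mean_of_means = fmean(sample_means)
sd_of_means = pstdev(sample_means)     # SD taken over all 6 possible means

# Standard error from the formula, with the finite population correction.
fpc = sqrt((N - n) / (N - 1))
se_formula = pstdev(population) / sqrt(n) * fpc

print(len(samples))                    # 6
print(mean_of_means)                   # 69.0
print(round(sd_of_means, 2))           # 1.08
print(round(se_formula, 2))            # 1.08
```

The empirical SD of the means and the fpc-corrected formula agree here because n/N = 50%, well above the 5% threshold at which the correction matters.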
Sampling distn of the Sample mean cont’d…

• Generally, we can summarize the sampling distribution of the sample
mean under the following three conditions:
A. Sampling from a normally distributed population with a known
population variance, σ².
– The sampling distribution will be normal with mean μ and variance
σ²/n.
B. Sampling from a normally distributed population with unknown
variance.
– The means will have an approximately normal distribution with mean μ
and variance S²/n when the sample size is large.
– The standardized means, (x̄ − μ)/(S/√n), will have a t-distribution
with df = n − 1 when the sample size is small.
Sampling distn of the Sample mean cont’d…

C. Sampling from a population with a non-normal or unknown
distribution.
– The means will have an approximately normal distribution when the
sample size is large, as stated by the central limit theorem.

• Note: The fpc factor can be used in computing the SE when sampling
is from a finite population of size N and n/N > 5%.
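Condition C can be illustrated with a simulation. The sketch below assumes an exponential population with mean 1 and SD 1 (a strongly skewed distribution chosen only for illustration, not taken from the slides) and checks that means of samples of size 30 behave roughly as a normal distribution with mean μ and SD σ/√n would. The helper name `sample_means` is hypothetical.

```python
import random
import statistics
from math import sqrt

random.seed(1)  # fixed seed so the sketch is reproducible

def sample_means(n, repeats=5000):
    """Means of `repeats` samples of size n from an exponential
    population with mean 1 and SD 1 (skewed, non-normal)."""
    return [statistics.fmean([random.expovariate(1.0) for _ in range(n)])
            for _ in range(repeats)]

n = 30
means = sample_means(n)
se = 1.0 / sqrt(n)                      # sigma / sqrt(n) with sigma = 1

# Share of sample means within 1.96 SE of the population mean;
# a normal sampling distribution would give about 0.95.
within = sum(abs(m - 1.0) <= 1.96 * se for m in means) / len(means)

print(round(statistics.fmean(means), 3))   # close to the population mean, 1
print(round(statistics.stdev(means), 3))   # close to se
print(round(se, 3))
print(round(within, 3))
```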
Sampling distn of the Sample mean cont’d…

• So, when the distribution of means is normal, it can be
transformed to the standard normal (z) distribution and applied in
computing probabilities. This application is the basis for
estimation and hypothesis testing.

Z = (x̄ − μ)/σx̄ ~ N(0, 1)
where σx̄ = σ/√n
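A minimal sketch of the z transformation in use, with made-up numbers: assume a population with μ = 70 and σ = 10, and ask how likely it is that the mean of a sample of n = 25 exceeds 73. None of these values come from the slides; `statistics.NormalDist` from the Python standard library supplies the standard normal CDF.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical values chosen for illustration only.
mu, sigma, n = 70, 10, 25
x_bar = 73

se = sigma / sqrt(n)            # standard error of the mean = 2.0
z = (x_bar - mu) / se           # standardized sample mean = 1.5

# P(sample mean > 73) = P(Z > 1.5) under the standard normal distribution.
p_greater = 1 - NormalDist().cdf(z)

print(z)                        # 1.5
print(round(p_greater, 4))      # 0.0668
```

Under these assumptions, a sample mean above 73 would occur in roughly 6.7% of samples.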
• The distribution has similar empirical construction to that of the sample
mean. The properties of the sampling distribution of sample proportions are;
