Sampling fundamentals

1
Introduction
The need for adequate and reliable data is ever increasing
for taking wise decisions in different fields of human activity
and business. There are two ways in which the required
information may be obtained:
1. Complete enumeration survey or Census method.
2. Sampling method.
In the first case, data are collected for each and every unit
of the universe or population (the complete set of items).

2
What is Population?
In any field of inquiry, all the items under consideration
constitute ‘population’ or ‘universe’.

A complete enumeration of all the items of the ‘population’ is
known as a census inquiry. In such an inquiry it is assumed
that the highest accuracy is obtained.
But this type of inquiry involves a great deal of time, money
and energy. Not only this, census inquiry is not possible in
practice under many circumstances.

3
Sample

Hence, quite often we select only a few items from the
universe for our study purposes. The items so selected are
technically called a sample.

4
What is Sampling Process?
Sampling may be defined as the selection of some part of an
aggregate or totality on the basis of which a judgment or
inference about the population (aggregate or totality) is
made.
In other words, it is the process of obtaining information
about an entire population by examining only a part of it.

5
Need for sampling

1. Sampling can save time and money.
2. Sampling may produce more accurate information if it is
conducted by trained and experienced investigators.
3. Sampling becomes the only option when the population
size is infinite.

6
Sample Design
• The researcher must prepare a sample design for the study, i.e., plan
how the sample should be selected and what its size should be.
• Large and small samples: Let the population size be N, and let a part of
size n (less than N) be selected from it according to some rule in order to
study some characteristic of the population. The group consisting of
these n units is known as the ‘sample’, so n denotes the sample size. If
n > 30 the sample is considered large; otherwise it is a small sample.

• The selection process, i.e. the way the researcher decides to select a
sample from the population, is known as the ‘sample design’.
In other words, it is a definite plan (determined by the researcher
before any data are collected) for obtaining a sample from a given
population.
E.g.: research on the pharmaceutical industry.

7
Sampling Method/Sampling Technique
1. Probability Sampling

2. Non-probability sampling

8
Probability Sampling

• Probability sampling is also known as ‘random sampling’
or ‘chance sampling’.
Samples selected according to some chance mechanism are known as
random or probability samples, i.e. every item in the
population has a known chance of being included in the
sample.

9
Non-probability sampling

On the other hand, non-random or non-probability samples
are those where the selection of sample units is based on
the judgment of the researcher rather than on randomness.

10
Important Sampling Designs
Probability Sampling:
i. Simple Random Sampling

ii. Systematic Sampling

iii. Stratified Sampling

iv. Cluster Sampling

11
Major non-probability sampling designs are:
i. Deliberate Sampling/ Purposive sampling/ Judgment
sampling

ii. Quota Sampling

12
Simple Random Sampling Method
Under this sampling design, every item of the universe has
an equal chance of inclusion in the sample.

For example, if we have to select a sample of 300 items
from a universe of 15,000 items, then we can put the names
or numbers of all the 15,000 items on slips of paper and
conduct a lottery.
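
The lottery can be imitated in software. The following is a minimal sketch, assuming the 15,000 items are simply numbered 1 to 15,000; the function name `draw_lottery` and the seed value are illustrative choices, not part of the original material.

```python
import random


def draw_lottery(population_size, sample_size, seed=None):
    """Draw a simple random sample of item numbers without replacement."""
    rng = random.Random(seed)  # a seed is used only to make the draw repeatable
    # random.sample never returns the same item twice, like slips drawn from a bowl
    return rng.sample(range(1, population_size + 1), sample_size)


# Assumed numbering: items are labelled 1..15000
sample = draw_lottery(15_000, 300, seed=42)
print(len(sample), "items drawn, first five:", sample[:5])
```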

13
Under this method, sampling is done without replacement, so
that no unit can appear more than once in the sample.
Thus, if from a population consisting of 4 members A, B, C
and D, a simple random sample of n = 2 is to be drawn, there
would be 6 possible samples without replacement. They are
AB, AC, AD, BC, BD, CD.
With this in view, we can say that a simple random
sample of size n from a population of size N results in
$\binom{N}{n}$ possible outcomes, each of which has the same
probability of being selected.
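
The count of possible samples can be checked by enumeration. The sketch below lists every sample of size 2 from the four members A, B, C, D using Python's standard library; the variable names are illustrative.

```python
from itertools import combinations
from math import comb

population = ["A", "B", "C", "D"]
n = 2

# Every unordered selection of n distinct members is one possible sample
samples = ["".join(members) for members in combinations(population, n)]

# Under simple random sampling each possible sample is equally likely
prob_each = 1 / comb(len(population), n)

print(samples)
print("number of samples:", len(samples), "probability of each:", prob_each)
```

The same two calls, with a six-element population and n = 3, can be used to check an answer to the exercise that follows.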

14
Exercise
Take a finite population of six elements (say a, b, c,
d, e, f). Suppose that we want to take a sample of size n = 3
from it. How many possible samples are there?
List them. Choose one of these samples. What is the
probability of its elements getting into the sample?

15
Systematic Sampling

In some instances, the most practical way of sampling is to select
every i-th item from the universe, where ‘i’ refers to the sampling
interval.

The sampling interval can be determined by dividing the size of
the population by the size of the sample to be chosen.
For example, if we wish to draw 32 names out of a list of 320
names, the sampling interval will be 320/32 = 10. It means every 10th
name will be selected. In this process a random start is always
preferable, i.e. the starting point is determined by chance. If the
starting number is, say, 6, then the sample is composed of the
numbers 6, 16, 26, 36, ...
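
A minimal sketch of this procedure, assuming the 320 names are represented by their positions 1 to 320 in the list; the function name `systematic_sample` and its parameters are illustrative.

```python
import random


def systematic_sample(items, sample_size, start=None, seed=None):
    """Pick every i-th item, where i = population size // sample size."""
    interval = len(items) // sample_size
    if start is None:
        # Random start: a position chosen by chance within the first interval
        start = random.Random(seed).randint(1, interval)
    # Positions are 1-based, so position p corresponds to items[p - 1]
    positions = range(start, len(items) + 1, interval)
    return [items[p - 1] for p in positions][:sample_size]


names = list(range(1, 321))  # stand-in for the list of 320 names
picked = systematic_sample(names, 32, start=6)
print(picked[:4], "...", picked[-1])
```

Passing `start=None` lets the starting position be chosen at random, as the text recommends.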

16
Merits and demerits
Merits:
a. It is a simple method.
b. It can be taken as an improvement over a simple random
sample, as the sample is spread more evenly over the population.
Demerits:
a. It is not truly random in the strict sense. This is because all
items selected for the sample (except the first) are
predetermined by the constant interval.
b. There are certain dangers too in using this type of sampling. If
there is a hidden periodicity in the population, systematic sampling
will prove to be an inefficient method of sampling. Example: quality
checking with a 4% sample, where every 25th item is inspected.

17
Stratified Sampling

If the population from which a sample is to be drawn does not
constitute a homogeneous group (i.e. it is highly heterogeneous), the
stratified sampling technique is generally applied in order to
obtain a representative sample. Under stratified sampling the
population is divided into several sub-populations that are
individually more homogeneous than the total population. The
different sub-populations are called ‘strata’. Then we select
items from each stratum to constitute a sample. Since each
stratum is more homogeneous than the population, we get a
better estimate of the whole.

18
The following three questions are highly
relevant in the context of stratified sampling:
a) How to form strata?

b) How should items be selected from each stratum?

c) How many items should be selected from each stratum, i.e. how
should the sample size be allocated to each stratum?

19
Regarding the first question, we can say that the items
which are homogeneous (i.e. have common characteristics)
should be put in the same group or stratum. In other words,
strata should be formed in such a way that elements are most
homogeneous within each stratum and most heterogeneous
between the different strata.

20
With respect to the 2nd question, we can say that to choose the
items from each stratum we normally adopt simple random
sampling.

21
To answer the 3rd question we have to
understand the following concepts:
Stratified sampling can be of two types: proportionate and
disproportionate.

In proportionate stratified sampling the numbers of sample
units in the various strata are in the same proportion as found
in the population. Thus, the larger a particular stratum, the
more weight it receives in the analysis.
E.g.: a bank that wants to conduct a survey to understand the
problems that its customers are facing.
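
Proportionate allocation amounts to giving each stratum the share n × (stratum size / population size) of the sample. The sketch below illustrates this for the bank survey; the three customer strata and their sizes are hypothetical figures invented for the illustration, as is the function name.

```python
def proportionate_allocation(strata_sizes, sample_size):
    """Allocate the sample to strata in proportion to their population sizes."""
    total = sum(strata_sizes.values())
    # Rounding can make the parts differ slightly from sample_size in general
    return {
        name: round(sample_size * size / total)
        for name, size in strata_sizes.items()
    }


# Hypothetical strata: these customer counts are assumed, not taken from the slides
bank_customers = {"savings": 6000, "current": 3000, "loan": 1000}
allocation = proportionate_allocation(bank_customers, sample_size=500)
print(allocation)
```

In a disproportionate design the shares would be set by some other rule, so this function would not apply.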

22
Disproportionate Stratified sampling

Here the strata are represented in the total sample in a
proportion other than the one with which they are found in
the population.

23
Cluster Sampling

In cluster sampling we first divide the population into
groups called ‘clusters’ and then select some units from these
groups or clusters for the sample. Cluster sampling is the
opposite of stratified sampling in the sense that:
a. The units within each cluster should be as heterogeneous
as possible.

b. There should be only small differences between the clusters.

24
Ex. Suppose a departmental store in a town wants to study
only those customers who purchase goods from the store using
an HDFC Bank credit card (i.e. the frequency or amount of
transactions made with the card). A survey conducted on behalf
of the store reveals that the total number of customers holding
such cards is 15,000. The sample size is 450. For cluster sampling,
the list of 15,000 customers could form 100 clusters of 150 members
each. From these 100 clusters we have to choose 450 units at random,
for example by randomly selecting 3 whole clusters.
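
A minimal sketch of this example, under the assumption that the 450 units are obtained by drawing 3 whole clusters of 150 at random (one possible reading of the example, not confirmed by the slides). Customers are represented by the numbers 1 to 15,000, and the function name is illustrative.

```python
import random


def cluster_sample(units, cluster_size, clusters_to_pick, seed=None):
    """Split the list into consecutive clusters and keep a few whole clusters."""
    clusters = [
        units[i:i + cluster_size] for i in range(0, len(units), cluster_size)
    ]
    rng = random.Random(seed)
    chosen = rng.sample(clusters, clusters_to_pick)  # clusters drawn by chance
    # Every unit of a chosen cluster enters the sample
    return clusters, [unit for cluster in chosen for unit in cluster]


customers = list(range(1, 15_001))  # stand-in for the list of card holders
clusters, sample = cluster_sample(customers, 150, 3, seed=1)
print(len(clusters), "clusters formed;", len(sample), "customers sampled")
```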

25
Non-probability sampling
Judgment sampling: selection is made by choice, not by
chance.
Example: A company wanting to launch a new product may use
judgment sampling to select experts who have prior
knowledge or experience of similar products.
The most common application is in B2B (business-to-business) marketing.
Convenience sampling:
The only criterion for selecting sampling units is the
convenience of the researcher.
• People interviewed in a shopping mall
• Interviews conducted by a TV channel with people coming out of
a cinema hall, to seek their opinion
26
Snowball Sampling: Snowball sampling is generally used when
it is difficult to identify the members of the desired population.

Quota Sampling: This technique is often used to choose sample units
from each stratum of a stratified population. Here the
interviewer is given a quota (a minimum number) to be filled from
each specified subgroup of the population, and the actual selection of
units depends entirely on the interviewer's judgment.

27
Sampling Vs Non-Sampling Error
There are two types of error that may occur while we are
trying to estimate the population parameters from
the sample.

1. Sampling error

2. Non-Sampling error

28
Statistic and Parameter

• A statistic is a characteristic of a sample, whereas a
parameter is a characteristic of a population.

• Thus, when we work out measures such as the mean,
median, mode or standard deviation from a sample, they are
called statistics, as they describe the characteristics of
the sample.
• E.g. of statistics: sample mean = $\bar{x}$, sample s.d. = s
• E.g. of parameters: population mean = μ, population s.d. = σ

29
Formula
• Sample mean: $\bar{x} = \frac{\sum x}{n}$

• Sample variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$, where n = sample size

• Population mean: $\mu = \frac{\sum x}{N}$

• Population variance: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$, where N = population size
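
These formulas correspond to functions in Python's standard `statistics` module. The sketch below applies them to the ten family medical expense figures (INR thousands) from the HR exercise later in this deck. Treating the same ten values as a whole population in the last line is purely an illustration of the N divisor.

```python
import statistics

# Family medical expenses of 10 sampled employees, in INR thousands
expenses = [11, 37, 25, 62, 51, 21, 18, 43, 32, 20]

sample_mean = statistics.mean(expenses)          # sum(x) / n
sample_var = statistics.variance(expenses)       # divides by n - 1
sample_sd = statistics.stdev(expenses)           # square root of sample variance

# Illustration only: if these 10 values were the entire population
population_var = statistics.pvariance(expenses)  # divides by N

print(sample_mean, sample_var, round(sample_sd, 2), population_var)
```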

30
The central limit theorem

Take a large (30 or more) random sample of size n from
any population with mean μ and standard deviation σ. The
distribution of the sample mean $\bar{X}$ approaches the normal
distribution with mean μ and standard deviation $\sigma / \sqrt{n}$:

$\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$
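
The theorem can be illustrated by simulation. The sketch below assumes an exponential population with mean 1 and standard deviation 1 (chosen only because it is clearly non-normal), draws 2,000 samples of size 50, and compares the spread of the sample means with σ/√n. The function name, seed and population are all illustrative assumptions.

```python
import random
import statistics


def sample_means(draw, n, repeats, seed=0):
    """Return the means of `repeats` random samples, each of size n."""
    rng = random.Random(seed)
    return [
        statistics.fmean(draw(rng) for _ in range(n)) for _ in range(repeats)
    ]


# Assumed population: exponential with rate 1, so mu = 1 and sigma = 1
means = sample_means(lambda rng: rng.expovariate(1.0), n=50, repeats=2000)

mean_of_means = statistics.fmean(means)   # should be close to mu
sd_of_means = statistics.stdev(means)     # should be close to sigma / sqrt(n)
expected_se = 1.0 / 50 ** 0.5

print(round(mean_of_means, 3), round(sd_of_means, 3), round(expected_se, 3))
```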

31
Sample size for estimating population
mean

$\bar{X}$ = Sample mean
μ = Population mean
N = Population size
n = Sample size
σ = Population standard deviation

• The value of the standard error is $\sigma / \sqrt{n}$
(when samples are drawn from an infinite population).

32
Confidence Interval: The interval within which the population
parameter is expected to lie.
Confidence Interval = Point Estimate ± Margin of Error
Margin of Error: The value added or subtracted from a point
estimate in order to develop an interval estimate of a population
parameter.
Margin of Error = Zc × Standard Error of a Particular Statistic

Zc = Critical Value of Standard Normal Variable


34
Q1. The average monthly electricity consumption for a
sample of 100 families is 1250 units. Assuming the
standard deviation of electricity consumption of all
families is 150 units, construct a 99% confidence interval
estimate of the actual average monthly electricity
consumption.
Confidence Interval = 1250 ± 2.58 × 150/√100

= 1250 ± 38.70 units per month, i.e. from 1211.30 to 1288.70 units
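
The calculation in Q1 can be reproduced with a few lines of code. This is a minimal sketch that takes the critical value z = 2.58 used in the solution as given; the function name is illustrative.

```python
from math import sqrt


def mean_confidence_interval(sample_mean, sigma, n, z):
    """Interval = point estimate +/- z * sigma / sqrt(n), with sigma known."""
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin, margin


# Q1 figures: mean 1250 units, sigma 150 units, n = 100, z = 2.58 for 99%
lower, upper, margin = mean_confidence_interval(1250, 150, 100, z=2.58)
print(f"1250 +/- {margin:.2f}  ->  ({lower:.2f}, {upper:.2f})")
```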


Q2. The average monthly electricity consumption for a
sample of 100 families out of 1000 families is 1250 units.
Assuming the standard deviation of electricity consumption
of all families is 150 units, construct a 99% confidence
interval estimate of the actual average monthly electricity
consumption.
Q3. The HR department of an organization would like to estimate the
family medical expenses of its employees to determine the feasibility
of providing a medical insurance plan. A random sample of 10
employees reveals the family medical expenses (INR thousands) in
the last year as under:
11, 37, 25, 62, 51, 21, 18, 43, 32, 20
Set up a 99% confidence interval of the average family medical
expenses for the employees of this organization.

Sample Size Determination

An economist is interested in estimating the average
monthly household expenditure on food items by the
households of a town. Based on past data, it is estimated
that the population standard deviation of the
monthly expenditure on food items is 30 rupees. With the
allowable error set at 7 rupees, estimate the sample size
required at 90 per cent confidence.

39
$n = \frac{z^2 \sigma^2}{e^2} = \frac{(1.645)^2 (30)^2}{7^2} \approx 49.7$

n = 50 (approx.)

40
It is desired to estimate the mean life of a certain kind
of vacuum cleaner. Given that the population standard
deviation is 320 days, how large a sample is needed to be
able to assert with a confidence level of 96 per cent
that the mean of the sample will differ from the
population mean by less than 45 days?

41
• At 96 per cent confidence, Z = 2.055

$n = \frac{z^2 \sigma^2}{e^2} = \frac{(2.055)^2 (320)^2}{45^2} \approx 213.5$, so n = 214 (approx.)
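
Both sample-size answers above can be checked with a short script. The sketch below obtains the critical value from the standard normal distribution in Python's `statistics` module rather than from a table, so z differs slightly from the rounded table values (about 2.054 instead of 2.055 at 96 per cent). The function names are illustrative.

```python
from math import ceil
from statistics import NormalDist


def z_for_confidence(confidence):
    """Two-sided critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_for_mean(sigma, error, confidence):
    """n = z^2 * sigma^2 / e^2, rounded up to the next whole unit."""
    z = z_for_confidence(confidence)
    return ceil((z * sigma / error) ** 2)


n_food = sample_size_for_mean(sigma=30, error=7, confidence=0.90)
n_vacuum = sample_size_for_mean(sigma=320, error=45, confidence=0.96)
print(n_food, n_vacuum)
```

Rounding up rather than to the nearest integer keeps the margin of error within the allowable limit.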

42
Estimating the Proportion

A manager of a company wants to estimate the proportion
of the company’s workers whose primary reason for staying
on their job is interesting job responsibilities. He undertook a
survey of 894 respondents with salaries below 10 lakhs per year.
He asked them, “What is the primary reason for staying on your job?”
Of this sample of 894 respondents, 367 indicated that the primary
reason for staying on their job was interesting job
responsibilities.
Construct a 95 per cent confidence interval for the proportion
of the workers whose primary reason for staying on their job
was interesting job responsibilities.
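
One way to work this out is the usual large-sample interval for a proportion, p ± z·√(pq/n) with q = 1 − p. The slides do not state this formula, so its use here is an assumption, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence):
    """Large-sample interval: p +/- z * sqrt(p * q / n)."""
    p = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p * (1 - p) / n)
    return p, p - margin, p + margin


# 367 of the 894 respondents gave "interesting job responsibilities"
p_hat, lower, upper = proportion_confidence_interval(367, 894, 0.95)
print(f"p = {p_hat:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```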

43
Determination of sample size for
estimating the population proportion
Q. A consumer electronics company wants to determine the
job satisfaction levels of its employees. For this they ask a
simple question, ‘Are you satisfied with your job?’ It is
estimated that no more than 30 per cent of the employees
would answer yes. What should the sample size be for this
company to estimate the population proportion with
95 per cent confidence and to be within 0.04 of
the true population parameter?

44
$n = \frac{z^2 pq}{e^2} = \frac{(1.96)^2 (0.30)(0.70)}{(0.04)^2} \approx 504.2$

n = 505 (approx.)

45
Q. A researcher wants to have 90% confidence in his
estimate. He wants to estimate the proportion of office
workers who respond to office email within an hour. The
error should be within ±0.05. Since no one has previously
undertaken such a study, there is no information available
from past data. Determine the sample size needed.

46
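Ans: 271 office workers (taking p = 0.5, the most conservative
value, since no prior estimate of the proportion is available)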
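
The two proportion sample-size answers above (505 and 271) can be checked as follows. This sketch assumes p = 0.5 when no prior estimate exists, which maximises pq and therefore gives the largest (safest) sample size. The function name is illustrative.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(error, confidence, p=0.5):
    """n = z^2 * p * q / e^2, rounded up; p defaults to the conservative 0.5."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p * (1 - p) / error ** 2)


# Job satisfaction survey: p at most 0.30, e = 0.04, 95% confidence
n_satisfaction = sample_size_for_proportion(0.04, 0.95, p=0.30)

# Email response study: no prior information, e = 0.05, 90% confidence
n_email = sample_size_for_proportion(0.05, 0.90)

print(n_satisfaction, n_email)
```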

47
Points to be noted
The formulas are applicable for SRS (simple random sampling) only.
Determination of sample size: ‘smaller but properly
selected samples are superior to large but badly selected
samples’. Factors to consider:
1. Resources available
2. Nature of study
3. Method of sampling used
4. Nature of respondents (response rate)
5. Nature of population (existence of heterogeneity)
