Sampling Theory

4. SAMPLING
4.1. Introduction
Sampling is a process used in statistical analysis in which a predetermined number of observations is taken
from a larger population.
Sampling is an essential technique of behavioural research; most research work cannot be undertaken without it.
Studying the total population is usually impossible or impracticable, because practical limitations of cost,
time and other factors stand in the way. The concept of sampling has been introduced with a view to making
research findings economical and accurate.
For example, a fruit merchant does not examine each and every apple or mango. He inspects only a few of
them and decides whether or not to purchase. The most important aim of sampling is to obtain
maximum information about the population under study with the least use of money, labour and time.
According to Cochran, "In every branch of science we lack the resources to study more than a fragment of the
phenomena that might advance our knowledge." In this definition, a "fragment" is the sample and "phenomena"
is the population.
According to Davis S. Fox, in the social sciences it is not possible to collect data from every respondent
relevant to our study, but only from some fractional part of the respondents; the process of selecting this
fractional part is called sampling.
4.2. Basic Concepts of Sampling
The following basic terms are necessary to understand the concept of sampling design: target population,
statistical population, sample, census, sample frame and sample unit.
4.2.1. Target Population
The population or universe represents the entire group of units which is the focus of the study. Thus, the
population could consist of all the persons in the country, or those in a particular geographical location, or a
special indigenous or economic group, depending on the purpose and coverage of the study. A population could
also consist of non-human units such as farms, houses or business establishments. For example, if an
investigation is to be conducted on the marks obtained in statistics by the students of a class, then all the
students of that class taking that subject form the universe. If the class consists of 50 students, those 50
students form the universe.
An aggregate of objects (animate or inanimate) under study is called a population or universe. It is thus a
collection of individuals, or of their attributes (qualities), or of results of operations which can be
numerically specified.
1) Finite Universe: A universe having a finite number of entities or members is called a finite universe. For
example, the universe of the weights of students in a particular class or the universe of smokers in Rohtak
district.
2) Infinite Universe: A universe with an infinite number of members is known as an infinite universe. For
example, the universe of pressures at various points in the atmosphere.
4.2.2. Statistical Population
A statistical population is a set of entities concerning which statistical inferences are to be drawn, often based
on a random sample taken from the population. For example, if we were interested in generalisations about
crows, then we would describe the set of crows that is of interest.
Notice that if we choose a population like all crows, we will be limited to observing crows that exist now or will
exist in the future. Geography will probably also constitute a limitation, in that our resources for studying
crows are limited. A statistical population is an aggregate of measurable quantities or a set of numbers. When
every element of such a set is characterised by only one character, say, the income of individuals, we have a
univariate population. A statistical population can be finite or infinite according to whether it contains a
finite or an infinite number of elements. Not every arbitrary set is a statistical population. For example, the
set of cows on a farm at a particular time does not represent a statistical population.
4.2.3. Sample
A sample is a portion of the population which is examined with a view to estimating the characteristics of the
population. It is a subset containing the characteristics of a larger population. Samples are used in statistical
testing when population sizes are too large for the test to include all possible members or observations. A
sample should represent the whole population and not reflect bias toward a specific attribute. A sample is a
smaller, manageable version of a larger group.
For example,
1) To assess the quality of a bag of rice, we examine only a portion of it. The portion selected from the bag
is called a sample, while the whole quantity of rice in the bag is the population.
2) To estimate the proportion of defective articles in a large consignment, only a portion (i.e., a few of them)
is selected and examined. The selected portion is a sample.
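The consignment example can be sketched in Python as follows. This is only an illustration: the consignment
size, the share of defective articles and the sample size are made-up values, and the function name is
hypothetical, not taken from the text.

```python
import random

def estimate_defective_proportion(consignment, sample_size, seed=None):
    """Estimate the share of defective articles from a simple random sample."""
    rng = random.Random(seed)
    sample = rng.sample(consignment, sample_size)  # draw without replacement
    return sum(sample) / sample_size

# Hypothetical consignment: 10,000 articles, 1 = defective, 0 = good (8% defective).
consignment = [1] * 800 + [0] * 9200

estimate = estimate_defective_proportion(consignment, sample_size=200, seed=42)
print(f"Estimated proportion defective: {estimate:.3f}")
```

Only 200 of the 10,000 articles are examined, yet the estimate can be compared with the true share (0.08) to see
how close a sample typically comes.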
4.2.4. Census
If detailed information regarding every individual person or item of a given universe (or population) is
collected, then the enquiry is a complete enumeration. Another common name for complete enumeration is
census. For example, during the Census of Population (which is done every ten years in India), information in
respect of each individual person residing in India is collected. This method gives information about each and
every unit of the population with greater accuracy.
Difference between Census and Sample
The main differences between the census and sample methods are as follows:
Basis: Meaning
  Census: Census refers to periodic collection of information about the populace from the entire population.
  Sample: Sampling is a method of collecting information from a sample that is representative of the entire
  population.
Basis: Reliability of Data
  Census: Data from census is reliable and accurate.
  Sample: There is a margin of error in data obtained from sampling.
Basis: Time Taken
  Census: Census is very time consuming.
  Sample: Sampling is quick.
Basis: Sampling Variance
  Census: A virtually zero sampling variance, mainly because the data used is drawn from the whole population.
  Sample: There is a possibility of sampling variance, since the data used is drawn from only a small section
  of the population.
Basis: Scope
  Census: All items relating to a universe are investigated.
  Sample: Only a few items are required.
Basis: Field of Investigation
  Census: Used in investigations with limited field.
  Sample: Used for investigation with large field.
Basis: Homogeneity
  Census: Useful where units of the population are heterogeneous.
  Sample: Proves more useful where population units are homogeneous.
4.2.5. Sample Frame
A sampling frame is the actual set of units from which a sample will be drawn. It is a list that contains every
member of the population from which a sample will be selected. For example, if we wish to study the
underlying factors that cause patients to be admitted into hospital following an acute asthmatic attack in a given
area (our population), then we would need to know the names of all the people in that area who have been
admitted into hospital for this reason.
A good sampling frame should be:
1) Relevant: It should contain things directly linked to the research topic.
2) Complete: It should cover all relevant items.
3) Precise: It should exclude all the items that are not relevant.
4) Up-to-Date: It should incorporate recent additions and changes, and have redundant items removed from
the list.
4.2.6. Sample Unit
A sampling element is an object on which measurements will be made. A sampling element may or may not be
a sampling unit. When the sampling unit consists of several population units, it is called a cluster of units. If
each population unit in a cluster will be measured, then the sampling elements are the population units within
the sampled clusters. In this case, the sampling element is a subunit of the sampling unit.
The sampling unit must be clearly defined for constructing the sampling frame. By convention in statistics, a
capital N is used to refer to the number of sampling units making up the universe, and a lowercase n for the
number of sampling units in the sample itself.
For example, in a family budget enquiry, usually a family is considered as the sampling unit since it is found to
be convenient for sampling and for ascertaining the required information. In a crop survey, a farm or a group of
farms owned or operated by a household may be considered as the sampling unit.
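The distinction between a sampling unit and a sampling element can be illustrated with a small Python sketch.
It assumes, as in the family budget example, that households are the sampling units and that every member of a
selected household is measured. The household data and the function name are hypothetical.

```python
import random

# Hypothetical frame: each household (sampling unit) lists its members (elements).
households = {
    "H1": ["Asha", "Ravi"],
    "H2": ["Meena"],
    "H3": ["Sunil", "Kiran", "Dev"],
    "H4": ["Lata", "Om"],
}

def sample_elements_by_household(frame, n_units, seed=None):
    """Select n_units households at random and return every member of each."""
    rng = random.Random(seed)
    chosen_units = rng.sample(sorted(frame), n_units)
    elements = [person for unit in chosen_units for person in frame[unit]]
    return chosen_units, elements

units, people = sample_elements_by_household(households, n_units=2, seed=7)
print("N =", len(households), "sampling units in the universe; n =", len(units))
print("Sampled units:", units)
print("Elements measured:", people)
```

Here N and n count households, not persons, which matches the convention that N and n refer to sampling units.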
4.3. Characteristics of a Good Sample
The following criteria should be considered when determining an appropriate sampling design:
1) True Representative: A good sample is a true representative of the population with respect to its
properties. The population is known as an aggregate of certain properties, and the sample is called a
sub-aggregate of the universe.
2) Free from Bias: A good sample is free from bias. It does not permit prejudices, pre-conceptions and
imaginations to influence its choice.
3) Accurate: A good sample maintains accuracy. It yields accurate estimates or statistics and does not involve
errors.
4) Comprehensive: A good sample is comprehensive in nature. This is closely related to true
representativeness. A comprehensive sample is organised around the specific purpose of the investigation. A
sample may be comprehensive in traits but may not be a good representative of the population.
5) Approachable: The subjects of a good sample are easily approachable. The research tools can be easily
administered to them and data can be easily collected.
6) Good Size: The size of a good sample is such that it yields an accurate result. The probability of error can
be estimated.
7) Feasible: A good sample makes the research work more feasible.
8) Goal Orientation: This suggests that a sample design should be oriented to the research objectives,
tailored to the survey design, and fitted to the survey conditions. If this is done, it should influence the
choice of the population, the measurement and also the procedure of choosing a sample.
9) Practical: This implies that the sample design can be followed properly in the survey. It is necessary that
complete, correct, practical and clear instructions are given to the interviewer so that no mistakes are
made in the selection of sampling units and the final selection in the field is not different from the original
sample design. Practicality also refers to simplicity of the design, i.e., it should be capable of being
understood and followed in the actual operation of the field work.
10) Economical: Economical implies that the objectives of the survey should be achieved with minimum cost
and effort. Survey objectives are generally spelt out in terms of precision, i.e., the inverse of the variance of
survey estimates. For a given degree of precision, the sample design should give the minimum cost.
Alternatively, for a given per unit cost, the sample design should achieve maximum precision (minimum
variance).
4.4. Errors in Sampling
A sample survey studies only a limited number of units of the total population; hence there is scope for
inaccuracy or error in the process of collection, processing and analysis of the sample data. These errors can
be broadly classified into two types:
1) Sampling errors, and
2) Non-sampling errors.
4.4.1. Sampling Errors
Sampling errors, or variations among sample statistics, are due to differences between each sample and the
population and among several samples. Sampling errors originate at the time of collecting samples. The major
cause of error lies in the fact that a researcher depends on a small sample drawn from a large universe to draw
conclusions about its characteristics.
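The variation among sample statistics can be seen in a short simulation. The sketch below assumes a hypothetical
universe of 5,000 roughly normally distributed values (for instance, marks); the numbers and the function name
are illustrative only.

```python
import random
import statistics

rng = random.Random(0)
# Hypothetical universe of 5,000 values (e.g. marks), assumed roughly normal.
population = [rng.gauss(50, 10) for _ in range(5000)]
population_mean = statistics.mean(population)

def sample_means(population, sample_size, repeats, rng):
    """Draw repeated simple random samples and return each sample's mean."""
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(repeats)
    ]

means = sample_means(population, sample_size=100, repeats=5, rng=rng)
for m in means:
    # Sampling error of one sample = sample statistic minus population parameter.
    print(f"sample mean = {m:.2f}, sampling error = {m - population_mean:+.2f}")
```

Each sample gives a slightly different mean, and none is expected to equal the population mean exactly; these
differences are the sampling errors described above.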
Types of Sampling Errors
These are of two types:
1) Biased Sampling Error: A biased error arises on account of any bias in selection, estimation, etc.
2) Unbiased Sampling Error: An unbiased error arises from chance differences between the units included in
the sample and those left out. In some cases a few restrictions may have to be imposed while choosing a
random sample. In such cases one should ensure that such restrictions do not introduce bias in the results.
Causes for Sampling Errors
Sampling errors are primarily due to some of the following reasons:
1) Faulty Selection of the Sample: Some bias is introduced by the use of a defective sampling technique
for the selection of a sample, for instance when the investigator deliberately selects a sample intended to
obtain certain results.
2) Substitution: If a problem arises in enumerating a particular sampling unit included in the random sample,
the investigators usually substitute a convenient member of the population, leading to sampling error.
3) Faulty Demarcation of Sampling Units: Bias due to defective demarcation of sampling units is particularly
significant in area surveys such as agricultural experiments. Thus faulty demarcation could cause sampling
error.
Methods to Reduce the Sampling Errors
Sampling errors can be reduced by:

1) Increasing the Size of the Sample: The sampling error can be reduced by increasing the sample size. If the
sample size n is equal to the population size N, then the sampling error is zero. Because of the square root
formula, the standard error is reduced by half if the sample size is quadrupled (made four times as large).
For example, if samples of 100 produce a standard error of 5%, the sample size must be 400 for a standard
error of 2.5%.
If the sample is unbiased, the sampling error decreases as the sample size increases. In many
situations the sampling error is inversely proportional to the square root of the sample size.
2) Stratification: When the population contains homogeneous units, a simple random sample is likely to be
representative of the population. But if the population contains dissimilar units, a simple random sample
may fail to be representative of all kinds of units in the population. To improve the result of the sample,
the sample design is modified. The population is divided into different groups containing similar units.
These groups are called strata. From each group (stratum), a sub-sample is selected in a random manner.
Thus all the groups are represented in the sample and the sampling error is reduced. This is called stratified
random sampling. The size of the sub-sample from each stratum is frequently in proportion to the size of
the stratum. Suppose a population consists of 1000 students, of which 600 are intelligent and 400 are
non-intelligent. We are assuming here that we do have this much information about the population. A
stratified sample of size n = 100 is to be selected. The sizes of the strata are denoted by $N_1$ and $N_2$
respectively, and the sizes of the samples from each stratum are denoted by $n_1$ and $n_2$. They are written
as under.
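Stratum No. | Size of Stratum | Size of Sample from each Stratum
1 | $N_1 = 600$ | $n_1 = \frac{n N_1}{N} = \frac{100 \times 600}{1000} = 60$
2 | $N_2 = 400$ | $n_2 = \frac{n N_2}{N} = \frac{100 \times 400}{1000} = 40$
Total | $N_1 + N_2 = N = 1000$ | $n_1 + n_2 = n = 100$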
The size of the sample from each stratum has been calculated according to the size of the stratum. This is called
proportional allocation. In the above sample design, the sampling fraction in the population is
$\frac{n}{N} = \frac{100}{1000} = \frac{1}{10}$, and the sampling fraction in both the strata is also
$\frac{1}{10}$. Thus this design is also called fixed sampling fraction.
This modified sample design is frequently used in sample surveys. But this design requires some prior
information about the units of the population. On the basis of this information, the population is divided into
different strata. If the prior information is not available, then stratification is not applicable.
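Both methods of reducing sampling error can be checked numerically. The sketch below is a minimal illustration,
assuming the standard error of a mean is $\sigma/\sqrt{n}$ and that allocation is proportional; the function
names are hypothetical.

```python
import math

def proportional_allocation(stratum_sizes, n):
    """Return the sub-sample size n_h = n * N_h / N for each stratum."""
    N = sum(stratum_sizes)
    return [n * N_h / N for N_h in stratum_sizes]

def standard_error(sigma, n):
    """Standard error of a sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# Strata of 600 and 400 students, total sample of 100.
allocation = proportional_allocation([600, 400], n=100)
print(allocation)  # [60.0, 40.0]

# Quadrupling the sample size (100 -> 400) halves the standard error.
ratio = standard_error(1.0, 400) / standard_error(1.0, 100)
print(ratio)  # 0.5
```

In practice the allocated sizes may not be whole numbers and would need rounding so that they still add up to n.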
4.4.2. Non-Sampling Errors
Non-sampling errors occur at the time of observation, approximation and processing of data. This error is
common to both the sampling and census survey. Non-sampling errors can arise at any stage of the planning or
execution of complete enumeration or sample survey. The non-sampling error may be due to faulty sampling
plan, lack of trained and qualified investigators, inaccuracy in responses collected due to bias of the respondent
or the researcher, errors in design of the survey and finally the errors in compilation or publication.
Types of Non-Sampling Errors
The various non-sampling errors are as follows:
1) Frame Error: The sampling frame is the list of all units comprising the population from which a sample is
to be drawn. If the sampling frame is incomplete or inaccurate, its use will give rise to this type of error.
For example, if a survey is to be undertaken to collect information from different sections of the society,
then the use of voters list as a sampling frame, will be inappropriate. This is because young people below
18 years of age will be left out from the survey.
2) Non-Response Error: It is almost impossible to obtain data from each and every respondent covered in the
sample. There are always some respondents who refuse to give any information. Thus, non-response error
occurs when respondents refuse to cooperate with the interviewer by not answering his questions. This error
also occurs when respondents are away from home when the interviewer calls on them. In mail surveys
particularly, the extent of non-response is usually high.
3) Measurement Error: This is caused when the information gathered is different from the information
sought. For example, respondents are asked to indicate whether they own a colour television set. Some of
them may respond in the affirmative just to boost their image before an interviewer, even though they may
not own a colour television set. Such responses will result in measurement error.
4) Data Processing Error: After the data have been collected, they are to be processed. This involves coding
the responses, recording the codes, etc., so that the collected data can be transformed into suitable tables.
Mistakes can occur during the processing stage.
5) Data Analysis Error: As in the case of data processing, errors can occur on account of wrong analysis of
data. Apart from simple mistakes in summation, division, etc., more complex errors can occur. For
example, the application of a wrong statistical technique can cause such errors.
Causes for Non-Sampling Errors
Some of the more important non-sampling errors arise from the following factors:
1) Errors due to Faulty Planning and Definitions: These errors arise due to improper data specification,
errors in the location of units, faulty measurement of characteristics and lack of trained investigators.
2) Response Errors: These errors occur as a result of the responses furnished by the respondents.
3) Non-Response Bias: Non-response errors occur when:
i) The respondent is not found after repeated calls.
ii) The respondent is unable to furnish the information on all questions.
iii) The respondent refuses to answer certain questions.
4) Errors in Coverage: These errors occur in the coverage of sampling units.
5) Compiling Errors: The errors arise due to compilation such as editing and coding of responses.
Methods to Reduce the Non-sampling Errors
Non-sampling error can be reduced by:
1) Careful selection of the time the survey is conducted,
2) Using an up-to-date and accurate sampling frame,
3) Planning for follow up of non-respondents,
4) Careful questionnaire design,
5) Providing thorough training for interviewers and processing staff and
6) Being aware of all the factors affecting the topic under consideration.
4.5. Sample Size
The purpose of research is the main determinant of the level of accuracy required in the results, and this level of
accuracy or exactness is the main determinant of the sample size. In general, the larger the sample size, the more
accurate the estimates will be. In practice, the research budget also limits the sample size.
The sample size of a statistical sample is the number of observations that constitute it. It is typically denoted
n, a positive integer (natural number).
The sample size n is chosen so that the sampling error (the difference between the statistic and the parameter)
and the standard error of the statistic are fixed at some pre-assigned level.
4.5.1. Determinants of Sample Size
In addition to the purpose of the study, the following determinants usually need to be specified
to determine the appropriate sample size:
1) Size of the Universe: The larger the size of the universe, the bigger should be the sample size.
2) Resources Available: If the resources available are vast a larger sample size could be taken. However, in
most cases resources constitute a big constraint on sample size.
3) Degree of Accuracy or Precision Desired: The greater the degree of accuracy desired, the larger should be
the sample size. However, it does not necessarily mean that bigger samples always ensure greater accuracy.
If a sample is selected by experts by following scientific method, it may ensure better results even when it
is small compared to a situation in which a large sample size is selected by inexperienced people.
4) Homogeneity or Heterogeneity of the Universe: If the universe consists of homogeneous units, a small
sample may serve the purpose, but if the universe consists of heterogeneous units, a large sample may be
inevitable.
5) Nature of Study: For an intensive and continuous study a small sample may be suitable. But for studies
which are not likely to be repeated and are quite extensive in nature, it may be necessary to take a larger
sample size.
6) Method of Sampling Adopted: The size of sample is also influenced by the type of sampling plan adopted.
For example, if the sample is a simple random sample it may necessitate bigger sample size. However, in a
properly drawn stratified sampling plan, even a small sample may give better results.
7) Nature of Respondents: Where it is expected that a large number of respondents will not cooperate or will
not send back the questionnaire, a large sample should be selected.
8) The Level of Precision: The level of precision, sometimes called sampling error, is the range in which the
true value of the population is estimated to be. This range is often expressed in percentage points (e.g., ±5
percent). Thus, if a researcher finds that 60% of farmers in the sample have adopted a recommended
practice with a precision rate of ±5%, then he or she can conclude that between 55% and 65% of farmers in
the population have adopted the practice.
9) The Confidence Level: The confidence or risk level is based on ideas encompassed under the Central
Limit Theorem. The key idea encompassed in the Central Limit Theorem is that when a population is
repeatedly sampled, the average value of the attribute obtained by those samples is equal to the true
population value. Furthermore, the values obtained by these samples are distributed normally about the true
value, with some samples having a higher value and some obtaining a lower score than the true population
value. In a normal distribution, approximately 95% of the sample values are within two standard deviations
of the true population value (e.g., mean).
In other words, this means that if a 95% confidence level is selected, 95 out of 100 samples will have the
true population value within the range of precision specified earlier. There is always a chance that the
sample you obtain does not represent the true population value.
10) Degree of Variability: The degree of variability in the attributes being measured refers
to the distribution of attributes in the population. The more heterogeneous a population, the larger the
sample size required to obtain a given level of precision. The less variable (more homogeneous) a
population, the smaller the sample size. A proportion of 50% indicates a greater level of variability than
either 20% or 80%. This is because 20% and 80% indicate that a large majority do not or do, respectively,
have the attribute of interest. Because a proportion of .5 indicates the maximum variability in a population,
it is often used in determining a more conservative sample size, i.e., the sample size may be larger than if
the true variability of the population attribute were used.
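The claim that a proportion of 50% indicates maximum variability can be checked with a few lines of Python. The
sketch assumes the usual measure of variability for a yes/no attribute, p(1 - p); the function name is
hypothetical.

```python
def proportion_variability(p):
    """Variance of a yes/no attribute held by a share p of the population."""
    return p * (1 - p)

for p in (0.2, 0.5, 0.8):
    print(p, round(proportion_variability(p), 2))
```

The value is largest at p = 0.5 and equal for 0.2 and 0.8, which is why p = 0.5 gives the most conservative
(largest) sample size.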
4.5.2. Determination and Selection of Sample Member
The method of determining optimal sample size has been discussed in two estimation problems:
1) Determination of Sample Size n when Estimating the Population Mean.
2) Determination of Sample Size n when Estimating the Population Proportion.
Determination of Sample Size (n) when Estimating the Population Mean
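As we know, for large samples the sample mean $\bar{x}$ is an unbiased estimator of the population mean $\mu$,
the standard error of $\bar{x}$ being $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$. For estimating the sample
size we need the following pre-assigned values:
1) The desired confidence level.
2) The permissible sampling error (E).
3) The population standard deviation ($\sigma$).
So, $E = \bar{x} - \mu$ and the standard error of $\bar{x}$ is $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.
For large samples, $z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$
Or $(\bar{x} - \mu) = z\,\sigma_{\bar{x}} = z\,\frac{\sigma}{\sqrt{n}}$
Suppose the desired confidence level is 95%. Then the z values defining the 95% confidence level are ±1.96, thus
$\bar{x} - \mu = 1.96\,\frac{\sigma}{\sqrt{n}}$
or $E = 1.96\,\frac{\sigma}{\sqrt{n}}$
or $\sqrt{n} = \frac{1.96\,\sigma}{E}$
or $n = \left(\frac{1.96\,\sigma}{E}\right)^2$
In general, $n = \left(\frac{z\,\sigma}{E}\right)^2$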
To explain this procedure, let us assume the example of a population. Say a researcher wants to find out the
average income (in lakh) of the population with an accuracy of 0.5 lakh, i.e., the researcher can
tolerate an error of half a lakh on either side of the true average income at the 95% confidence level. In
other words, the researcher wants to be 95% confident about his findings.
The formula for confidence limits is: $\mu = \bar{x} \pm z\,\sigma_{\bar{x}}$
Where, $\mu$ = population mean
$\bar{x}$ = average income calculated from the sample
z = value of z at 95% confidence level
$\sigma_{\bar{x}}$ = standard error of $\bar{x}$, i.e., $\frac{\sigma}{\sqrt{n}}$
$\sigma$ = standard deviation
n = sample size
If the researcher decides to tolerate an error of 1/2 lakh, that is, $\bar{x} - \mu = 0.5$,
so $0.5 = z\,\sigma_{\bar{x}}$
The value of z at the 95% confidence level is 1.96,
so $0.5 = 1.96\,\frac{\sigma}{\sqrt{n}}$
or $\sqrt{n} = \frac{1.96\,\sigma}{0.5}$
The value of the standard deviation can be found by either:
1) Assuming or guessing,
2) Consulting an expert,
3) Conducting a pilot study to get the value, or
4) Obtaining it from other comparable studies.
Say $\sigma$ = 2 lakh.
Then, $\sqrt{n} = \frac{1.96 \times 2}{0.5} = 7.84$
$n = (7.84)^2 \approx 61.47$
Rounding up, n = 62.
Hence 62 respondents are required to constitute a sample at the 95% confidence level. Similarly, it can be
calculated for 99% or other confidence levels.
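The calculation above can be reproduced with a short Python function. This is a sketch of the formula
n = (z·σ/E)², rounded up to a whole respondent; the function name is hypothetical, and the 99% value z = 2.576
is the usual standard normal figure, assumed here because the text does not state it.

```python
import math

def sample_size_for_mean(sigma, error, z=1.96):
    """n = (z * sigma / E)^2, rounded up to the next whole respondent."""
    return math.ceil((z * sigma / error) ** 2)

# Worked example: sigma = 2 lakh, permissible error E = 0.5 lakh, 95% confidence.
print(sample_size_for_mean(sigma=2, error=0.5))           # 62

# Same inputs at 99% confidence (z = 2.576 is an assumed standard value).
print(sample_size_for_mean(sigma=2, error=0.5, z=2.576))  # 107
```

A higher confidence level or a smaller permissible error both increase the required sample size.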
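Determining Sample Size (n) When Estimating the Population Proportion
For a sample of size n, the sample proportion $\hat{p}$ is used for estimating the population proportion p.
The standard error of $\hat{p}$ is $\text{S.E.}(\hat{p}) = \sqrt{\frac{pq}{n}}$ (p being known), where
$q = 1 - p$.
For large samples,
$z = \frac{\hat{p} - p}{\text{S.E.}(\hat{p})} = \frac{E}{\sqrt{pq/n}}$
Where $E = \hat{p} - p$ (sampling error).
So $\sqrt{n} = \frac{z\sqrt{pq}}{E}$
Or $n = \frac{z^2\,pq}{E^2}$, which is equivalent to $E = z\sqrt{\frac{pq}{n}}$.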
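The proportion formula can be sketched in the same way. The function name and the input values (p, E) below are
illustrative assumptions, not figures from the text; q is taken as 1 - p.

```python
import math

def sample_size_for_proportion(p, error, z=1.96):
    """n = z^2 * p * q / E^2 with q = 1 - p, rounded up to a whole unit."""
    q = 1 - p
    return math.ceil(z ** 2 * p * q / error ** 2)

# Most conservative case, p = 0.5, with a permissible error of 5 percentage points.
print(sample_size_for_proportion(p=0.5, error=0.05))  # 385

# A less variable attribute (p = 0.2) needs a smaller sample for the same error.
print(sample_size_for_proportion(p=0.2, error=0.05))  # 246
```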
4.6. Steps in Designing the Sample
A sampling plan is a detailed outline of which measurements will be taken at what times, on which material, in
what manner, and by whom. Sampling plans should be designed in such a way that the resulting data will
contain a representative sample of the parameters of interest and allow for all questions, as stated in the goals,
to be answered. The steps involved in developing a sampling plan are:
1) Define the Universe: The universe can be confined to a particular type of product, some geographical
limits or some other constraints.
The first problem in any sampling procedure is to define the universe. The target population or universe is
the collection of elements or objects that possess the information sought by the researcher and about which
inferences are to be made. The target population must be defined precisely. Imprecise definition of the
target population will result in research that is ineffective at best and misleading at worst. Defining the
target population involves translating the problem definition into a precise statement of who should and
should not be included in the sample. The target population should be defined in terms of elements,
sampling units, extent and time. An element is the object about which or from which the information is
desired. In survey research, the element is usually the respondent.
Figure 3.1: Steps in Designing the Sample (Define the Universe, Sample Frame, Specifying Sampling Units,
Selection of Sample Design, Determination of Sample Size, Select the Sample)
For example, consider a marketing research project assessing consumer response to a new brand of men's
cologne. Who should be included in the target population? All men? Men who have used cologne during the
last month? Men 17 or older? Should females be included, because some women buy colognes for their
husbands? These and similar questions must be resolved before the target population can be appropriately
defined.
2) Sample Frame: After the population to be studied has been specified, the next step is to develop a frame
of this population. A list containing all sampling units of a population is known as a sampling frame. The
frame is constructed either by the researcher for the purpose of his study or may consist of some existing
list of the population. A frame does not always have to be a list of names; it can also involve a definite
location, a boundary, an address, or a set of rules by which a sampling unit can be delineated.
A frame in some sense is a set of boundaries circumscribing the universe. It may be in the form of lists,
indices, maps, directories, population records, electoral rolls, city tax rolls, lists of students enrolled in a
university, etc. A list in which every element of the population appears once and only once would constitute
a sample frame. A good sampling frame should be accurate, free from duplication and conveniently
available. A sample frame is essential for marketing research and for better performance of the sampling
procedure.
A sampling frame is a representation of the elements of the target population. It consists of a list or set of
directions for identifying the target population. Examples include the telephone book, an association
directory listing the firms in an industry, a mailing list purchased from a commercial organisation, a city
directory, or a map.
3) Specifying the Sampling Units: The decision on sampling unit often depends on the sampling frame. The
sampling unit is the basic unit containing the elements of the population to be sampled, e.g. city blocks,
households, a business organisation etc. The selection of the sampling unit partially depends on the overall
design of the project also. The units which serve as the basis of initial sampling are known as primary
sampling units. It can be composed of one or more units of the population depending on the objectives of
the inquiry.
For example, suppose that Revlon wanted to assess consumer response to a new line of lipsticks and
wanted to sample females over 18 years of age. It may be possible to sample females over 18 directly, in
which case a sampling unit would be the same as an element. Alternatively, the sampling unit might be
households. In the latter case, households would be sampled and all females over 18 in each selected
household would be interviewed. Here, the sampling unit and the population element are different. Extent
refers to the geographical boundaries and the time factor is the time period under consideration.
4) Selection of Sample Design: It is the procedure of selecting units in the sample. There are two basic
methods of sampling, namely probability and non-probability methods, which can be further divided into
some specific methods of selection. A probability sample is one in which the selected units have some
specific chance of being included in the sample. In a non-probability sample, some arbitrary method of
selection not depending on chance is adopted. This method
mainly depends on the purpose of the inquiry, as well as on the attitude or convenience of the investigators.
The selection of the sample design really involves two decisions:
i) To use probability or non-probability method of selection, and
ii) Specific sample design to use in collecting the data.
The researcher's choice will be affected by the following considerations:
i) If sampling error is to be evaluated, then probability sampling must be used.
ii) To ensure randomness in the selection of units, probability sample should be used.
iii) In the absence of proper sample frame, non-probability sampling should be used.
iv) If time and money considerations are vital, then non-probability sampling should be used.
Once the decision about probability and non-probability method of selection has been made, one should
select the sample design that will best accomplish the objectives of the investigation. Regardless of the
design finally chosen, the researcher may have to defend this design, when the study results are ultimately
presented.
5) Determination of Sample Size: The size of the sample has a direct relationship with the degree of accuracy
desired in the investigation. It also depends upon the nature of the population as well as the method of
selection. In marketing research investigations, the ideal sample size depends upon the type of the series and
the size of the population. It is common practice that the larger the population, the more units are
drawn in the sample, and the greater the degree of heterogeneity, the larger the sample size should be for it
to be representative.
6) Select the Sample: Selecting the sample means executing the actual sampling process, that is, the actual
selection of the sample elements. This requires a substantial amount of office and field work, particularly when personal
interviews are involved. Execution of the sampling process requires a detailed specification of how the
sampling design decisions with respect to the population, sampling frame, sampling unit, sampling
technique, and sample size are to be implemented. If households are the sampling unit, an operational
definition of a household is needed. Procedures should be specified for vacant housing units and for call
backs in case no one is at home. Detailed information must be provided for all sampling design decisions.
4.7.
Types of Sampling Methods: Sample Design
A sample design is a definite plan for obtaining a sample from the sampling frame. Sampling design, in
general, refers to the method or technique the researcher adopts in selecting the sampling units from the frame
or population.
A sample design is the framework, or road map, that serves as the basis for the selection of a survey sample and
affects many other important aspects of a survey as well. In a broad context, survey researchers are interested in
obtaining some type of information through a survey for some population, or universe, of interest. One must
define a sampling frame that represents the population of interest, from which a sample is to be drawn. The
sampling frame may be identical to the population, or it may be only part of it and is therefore subject to some
undercoverage, or it may have an indirect relationship to the population (e. g. the population is preschool
children and the frame is a listing of preschools). The sample design provides the basic plan and methodology
for selecting the sample. A sample design can be simple or complex.
There are different types of sample designs based on two factors, viz., the representation basis and the element
selection technique. On the representation basis, the sample may be a probability sample or a non-probability
sample. Probability sampling is based on the concept of random selection, whereas non-probability sampling
is non-random sampling.
On element selection basis, the sample may be either unrestricted or restricted. When each sample element is
drawn individually from the population at large, then the sample so drawn is known as unrestricted sample,
whereas all other forms of sampling are covered under the term restricted sampling.
Thus, sample designs are basically of two types viz., probability sampling and non-probability sampling:
4.7.1.
Random/Probability Sampling Techniques
Probability sampling is also known as Random sampling or chance sampling. Under this sampling design,
every item of the universe has a known chance of inclusion in the sample (an equal chance in the simplest designs). It is, so to say, a lottery method in
which individual units are picked up from the whole group not deliberately but by some mechanical process.
Here it is blind chance alone that determines whether one item or the other is selected. The results obtained
from probability or random sampling can be assured in terms of probability i.e., we can measure the errors of
estimation or the significance of results obtained from a random sample, and this fact brings out the superiority
of random sampling design over the deliberate sampling design.
The various probability sampling methods are as follows:
Random Sampling Techniques: Simple Random Sampling, Systematic Sampling, Stratified Random Sampling,
Cluster Sampling, Multi-Stage Sampling, Area Sampling
1) Simple Random Sampling: This is the simplest and most popular technique of sampling. In it, each unit of
the population has an equal chance of being included in the sample. This method implies that if N is the size of
the population and n units are to be drawn in the sample, then the sample should be taken in such a way that
each of the ${}^{N}C_{n}$ possible samples has an equal chance of being selected.
Simple random sampling gives:
i) Each element in the population an equal chance of being included in the sample, and all choices are
independent of each other.
ii) Each possible sample combination an equal chance of being chosen.
The method of simple random sampling eliminates the chance of bias or personal prejudices in the
selection of units.
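The idea can be illustrated with a minimal sketch, assuming a hypothetical frame of 20 numbered units and a sample of 5. The function name `simple_random_sample` and the seed value are illustrative assumptions, not part of the text; Python's standard `random.sample` draws without replacement so that every subset of the requested size is equally likely.

```python
import math
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units; every subset of size n is equally likely."""
    rng = random.Random(seed)          # seeded only to make the sketch repeatable
    return rng.sample(list(population), n)

# Hypothetical sampling frame: units numbered 1..20 (an assumption for illustration).
population = list(range(1, 21))
sample = simple_random_sample(population, n=5, seed=42)

# Number of distinct samples of size n = 5 that can be drawn from N = 20 units.
possible_samples = math.comb(len(population), 5)
print(sample, possible_samples)
```

Here `possible_samples` is the count of equally likely sample combinations referred to in the definition above.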
2) Systematic Sampling: In this sampling, one unit is selected at random from the universe and the other
units are at a specified interval from the selected unit. This method can be used when the population is finite
and the units of the Universe can be arranged on the basis of any system like alphabetical arrangement,
numerical arrangement or geographical arrangement etc.
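A minimal sketch of systematic selection follows. It assumes one common variant in which the random start is chosen within the first interval and every k-th unit is taken thereafter; the function name `systematic_sample` and the 100-unit frame are hypothetical.

```python
import random

def systematic_sample(units, n, seed=None):
    """Pick a random start in the first interval, then take every k-th unit."""
    k = len(units) // n                # sampling interval
    if k < 1:
        raise ValueError("sample size exceeds population size")
    rng = random.Random(seed)
    start = rng.randrange(k)           # random start among the first k units
    return [units[start + i * k] for i in range(n)]

# Hypothetical ordered frame of 100 units (e.g. an alphabetical list).
frame = [f"unit-{i:03d}" for i in range(1, 101)]
sample = systematic_sample(frame, n=10, seed=1)
print(sample)
```

Only the first unit is chosen by chance; the remaining units follow from the interval, which is why the ordering of the frame matters.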
3) Stratified Random Sampling: Stratified random sample is one in which random selection is done not from
the heterogeneous universe as a whole but from different homogeneous parts or strata of a universe. This
sampling procedure may be summarised as follows:
i) The universe to be sampled is divided (or stratified) into groups that are mutually exclusive and include
all items in the universe.
ii) A simple random sample is then chosen independently from each group or stratum.
The process of stratified random sampling differs from simple random sampling in that, with the latter,
sample items are selected at random from the entire universe. In stratified random sampling, the sample is
designed so that a separate random sample is selected from each stratum. In simple random sampling the
distribution of the sample among strata is left entirely to chance.
Formally, divide the population into non-overlapping groups (i.e., strata)
$N_1, N_2, \ldots, N_i$
such that
$N_1 + N_2 + \cdots + N_i = N$.
Then do a simple random sample of $f = n/N$ in each stratum, where $f$ is the sampling fraction.
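The procedure above can be sketched as follows, assuming proportional allocation with the same sampling fraction f in every stratum. The stratum names, sizes, and the function name `stratified_sample` are hypothetical, and rounding the stratum sample sizes is a simplification that may not always add up exactly to n.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Draw an independent simple random sample of the given fraction from each stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        n_h = round(fraction * len(units))   # stratum sample size n_h = f * N_h
        sample[name] = rng.sample(units, n_h)
    return sample

# Hypothetical strata with N_1 = 60, N_2 = 30, N_3 = 10, so N = 100.
strata = {
    "urban": [f"U{i}" for i in range(60)],
    "suburban": [f"S{i}" for i in range(30)],
    "rural": [f"R{i}" for i in range(10)],
}
sample = stratified_sample(strata, fraction=0.1, seed=7)   # f = n/N = 10/100
sizes = {name: len(units) for name, units in sample.items()}
print(sizes)
```

Because each stratum is sampled separately, the composition of the sample across strata is fixed by design rather than left to chance.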
4) Cluster Sampling: In this method, the universe is divided into some recognisable sub-groups which are
called clusters. After this a simple random sample of these clusters is drawn and then all the units belonging
to the selected clusters constitute the sample.
For example, if we have to conduct an opinion poll in the city of Delhi, then the city may be divided into,
say, 50 blocks and out of these 50 blocks 5 blocks can be picked up by random sampling and the inhabitants
in these five blocks can be interviewed to give their opinion on a particular issue.
While using this method, it should be ensured that the clusters are as small in size as possible and that the
number of sample units in each cluster is more or less the same. This method is commonly used in collecting
data about some common characteristics of the population.
Cluster sampling, no doubt, reduces cost by concentrating surveys in selected clusters. But certainly it is
less precise than random sampling. There is also not as much information in n observations within a
cluster as there happens to be in n randomly drawn observations. Cluster sampling is used only because of
the economic advantage it possesses; estimates based on cluster samples are usually more reliable per unit
cost.
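The opinion-poll illustration can be sketched in code, assuming 50 hypothetical blocks of 20 residents each, of which 5 blocks are drawn at random and every resident of a drawn block enters the sample. The names `cluster_sample` and `blocks` are illustrative assumptions.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)    # random sample of cluster labels
    units = [unit for label in chosen for unit in clusters[label]]
    return chosen, units

# Hypothetical city divided into 50 blocks of 20 residents each.
blocks = {b: [f"block{b}-resident{r}" for r in range(20)] for b in range(1, 51)}
chosen_blocks, respondents = cluster_sample(blocks, n_clusters=5, seed=3)
print(chosen_blocks, len(respondents))
```

Randomness enters only at the level of the blocks; within a selected block there is no further selection, which is the source of both the cost saving and the loss of precision.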
5) Multi-Stage Sampling: This is a modified form of cluster sampling. While in cluster sampling all the units
in a selected cluster constitute the sample, in multistage sampling the sample units are selected in two or
three or four stages. In this system the universe is first divided into first-stage sample units, from which the
sample is selected.
The selected first-stage samples are then sub-divided into second stage units from which another sample is
selected. Third-stage and fourth-stage sampling is done in the same manner if necessary. Thus, for an urban
survey, a sample of towns may be taken first, then for each of the selected towns a sub-sample of
households may be taken, and then, if need be, from each of the selected households a third-stage sample of
individuals may be obtained.
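The urban-survey illustration can be sketched as a three-stage selection of towns, then households, then individuals. This is a minimal sketch under assumed data: the nested dictionary `towns`, the stage sizes, and the function name `multistage_sample` are all hypothetical.

```python
import random

def multistage_sample(towns, n_towns, n_households, n_persons, seed=None):
    """Stage 1: towns; stage 2: households in each town; stage 3: persons in each household."""
    rng = random.Random(seed)
    selected = []
    for town in rng.sample(sorted(towns), n_towns):
        households = towns[town]
        stage2 = rng.sample(sorted(households), min(n_households, len(households)))
        for household in stage2:
            members = households[household]
            for person in rng.sample(members, min(n_persons, len(members))):
                selected.append((town, household, person))
    return selected

# Hypothetical universe: 6 towns, 10 households per town, 4 persons per household.
towns = {
    f"town{t}": {f"hh{h}": [f"person{p}" for p in range(4)] for h in range(10)}
    for t in range(6)
}
sample = multistage_sample(towns, n_towns=2, n_households=3, n_persons=1, seed=11)
print(sample)
```

Unlike cluster sampling, not every unit of a selected first-stage unit is taken; a fresh random selection is made at each stage.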
6) Area Sampling: Area sampling is a form of multi-stage sampling in which maps, rather than lists or
registers, are used as the sampling frame. It is more frequently used in those countries which do not have a
satisfactory sampling frame such as population lists.
If clusters happen to be some geographic subdivisions, in that case cluster sampling is better known as area
sampling. In other words, cluster designs, where the primary sampling unit represents a cluster of units
based on geographic area, are distinguished as area sampling. The plus and minus points of cluster sampling
are also applicable to area sampling. The overall area for sampling is divided into several smaller areas
within which a random sample is selected. For example, a city map may be used for area sampling. The various
blocks provide the frame; each of them is numbered and used for sampling. For sampling blocks,
stratification based on geographical considerations is employed. Thus, blocks need to be
identified, and then a stratified sample of dwellings can be selected. Finally, blocks are subdivided into
segments of more or less equal size, and a sample of these segments may be taken.
Nature of Random Sampling
The nature of probability sampling can be described as follows:
1) Accurate Estimates of Population: For some research problems, highly accurate estimates of population
characteristics are required. In these situations, the elimination of selection bias and the ability to calculate
sampling error make probability sampling desirable. However, probability sampling will not always result
in more accurate results. If non-sampling errors are likely to be an important factor, then non-probability
sampling may be preferable, as the use of judgement may allow greater control over the sampling process.
2) Heterogeneous Population: Another consideration is the homogeneity of the population with respect to the
variables of interest. A more heterogeneous population would favour probability sampling, because it would
be more important to secure a representative sample. Probability sampling is preferable from a statistical
viewpoint, because it is the basis of the most common statistical techniques.
3) Sophisticated: However, probability sampling is sophisticated and requires statistically trained researchers.
It generally costs more and takes longer than non-probability sampling. In many marketing research
projects, it is difficult to justify the additional time and expense and thus operational considerations favour
the use of non-probability sampling. In practice, the objectives of the study often exert a dominant influence
on which sampling method will be used.
4) Permits Generalisation: The major advantage of probability sampling is that it permits generalisation, the
process of applying the findings from the sample to the population from which the sample was drawn. As
for the broader population beyond the sampling frame, the researcher can only hypothesise about the
applicability of the sample findings. This is one reason why replication in research is so important, to test
the limits of findings as they apply to additional settings and variations in the population.
Advantages of Random Sampling
Advantages of probability sampling are as follows:
1) Unbiased Estimates: Random (Probability) sampling is the only sampling method that provides
essentially unbiased estimates having measurable precision. If the investigator requires this level of
objectivity, then some variant of probability sampling is essential.
2) Relative Efficiency: Random sampling permits the researcher to evaluate, in quantitative terms, the
relative efficiency of alternate sampling techniques in a given situation. Usually this is not possible in
non-probability sampling.
3) Less Universe Knowledge Required: Random sampling requires relatively little universe knowledge.
Essentially, only two things need to be known:
i) A way of identifying each universe element uniquely, and
ii) The total number of universe elements.
4) Fair: Every item in the population has an equal chance of being selected and measured.
5) Easy: It allows easy data analysis and error calculation.
Disadvantages of Random Sampling

Following are the disadvantages of probability sampling:
1) Less Efficient: It is less statistically efficient than other sampling methods.
2) Non-Utilisation of Additional Knowledge: It does not make use of additional knowledge of how the
population is structured.
3) Complex and Time Consuming: The method of selection in many cases can be complex and time
consuming. Especially in the cases of marketing research, the constraints of budget and time may give
preference to non-probability methods of sampling.
4) High Level Skills: Probability sampling requires a very high level of skill and experience for its use.
5) More Time Required: It requires a lot of time to plan and execute a probability sample.
6) High Costs: The costs involved in probability sampling are generally large as compared to non-probability
sampling.
4.7.2.
Non-Random/Non-Probability Sampling Techniques
Non-probability sampling is that sampling procedure which does not afford any basis for estimating the
probability that each item in the population has been included in the sample. Non-probability sampling is also
known by different names such as deliberate sampling, purposive sampling and judgment sampling.
In this type of sampling, items for the sample are selected deliberately by the researcher; his choice concerning
the items remains supreme. In other words, under non-probability sampling the organisers of the inquiry
purposively choose the particular units of the universe for constituting a sample on the basis that the small mass
that they select out of a huge one will be typical or representative of the whole.
The various non-probability sampling methods are:
Non-Random Sampling Techniques: Quota Sampling, Convenience Sampling, Judgment Sampling, Panel Sampling,
Snowball Sampling
1) Quota Sampling: One of the most commonly used non-probability sample designs is quota sampling,
which enjoys its most widespread use in consumer surveys. This sampling method also uses the principle of
stratification. As in stratified random sampling, the researcher begins by constructing strata. Bases for
stratification in consumer surveys are commonly demographic, e.g., age, sex, income and so on. Often
compound stratification is used, for example, age groups within sex.
This is one kind of purposive or judgement sampling. A quota sample is one in which the investigator is
directed to collect information from an assigned number, or quota of individuals. The quota sampling
technique is very popular in opinion surveys and market studies.
Next, sample sizes (called quotas) are established for each stratum. As with stratified random sampling, the
sampling within strata may be proportional or disproportional. Field-workers are then instructed to conduct
interviews with the designated quotas, with the identification of individual respondents being left to the
field-workers.
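A minimal sketch of quota filling, assuming compound strata of age group within sex and a quota of one respondent per stratum. The respondent records, the quota sizes, and the function name `quota_sample` are hypothetical; respondents are taken in the order the field-worker happens to meet them, so no random selection is involved.

```python
def quota_sample(respondents, quotas, key):
    """Accept respondents in order of encounter until every stratum quota is filled."""
    remaining = dict(quotas)
    sample = []
    for person in respondents:               # order of encounter, not a random draw
        stratum = key(person)
        if remaining.get(stratum, 0) > 0:
            sample.append(person)
            remaining[stratum] -= 1
        if not any(remaining.values()):      # all quotas filled
            break
    return sample

# Hypothetical passers-by met by a field-worker, in order of encounter.
passers_by = [
    {"id": 1, "sex": "F", "age": "18-34"},
    {"id": 2, "sex": "F", "age": "18-34"},
    {"id": 3, "sex": "M", "age": "35+"},
    {"id": 4, "sex": "M", "age": "18-34"},
    {"id": 5, "sex": "M", "age": "35+"},
    {"id": 6, "sex": "F", "age": "35+"},
    {"id": 7, "sex": "M", "age": "18-34"},
]
quotas = {("F", "18-34"): 1, ("F", "35+"): 1, ("M", "18-34"): 1, ("M", "35+"): 1}
sample = quota_sample(passers_by, quotas, key=lambda p: (p["sex"], p["age"]))
print([p["id"] for p in sample])  # prints [1, 3, 4, 6]
```

The strata are controlled, but who is interviewed within each stratum depends on who is encountered first, which is where selection bias can enter.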
2) Convenience Sampling: In convenience sampling, the researcher chooses the sampling units on
the basis of convenience or accessibility. Such samples are also called accidental samples because the
sample units enter by accident.
This is also known as a sample of the man in the street, i.e., selection of units where they are. Sample units
are selected because they are accessible. For example, in testing a potential new product, the sample work
is done by adding the new product to the appropriate shops in the locality. Purchasing and selling of the
new product is observed there.
3) Judgment Sampling: Another method of non-probability sampling that is sometimes advocated is the
selection of universe items by means of expert judgment. Using this approach, specialists in the subject
matter of the survey choose what they believe to be the best sample for that particular study.
This type of sample requires judgment or an educated guess as to who should represent the population. It
is expected that these samples would be better as the experts are supposed to know the population. For
example, a group of sales managers might select a sample of grocery stores in a city that they regarded as
representative. This approach has been found empirically to produce unsatisfactory results. And, of
course, there is no objective way of evaluating the precision of sample results. Despite these limitations,
this method may be useful when the total sample size is extremely small.
4) Panel Sampling: Here, the initial samples are drawn on a random basis and information is collected from
them on a regular basis. It is a semi-permanent sample in which members may be included repeatedly for
successive studies. It makes it possible to select and quickly contact such well-balanced samples and to
obtain a relatively high response rate, even by mail.
5) Snowball Sampling: It is also known as Referred sampling or Multiplicity sampling. It is a procedure in
which initial respondents are selected randomly but where additional respondents are then obtained from
referrals. It is a form of networking. As the name implies, the sample grows just as a snowball grows.
It is a special non-probability method used when the desired sample characteristic is rare. It may be
extremely difficult or cost prohibitive to locate respondents in these situations. Snowball sampling relies on
referrals from initial subjects to generate additional subjects. While this technique can dramatically lower
search costs, it comes at the expense of introducing bias because the technique itself reduces the likelihood
that the sample will represent a good cross section of the population.
For example, suppose that a researcher wants to conduct a survey of NRIs who have been living in the USA for
the last 5 years. Initial respondents may be selected from the list supplied by the USA Embassy in India. The
referral procedure then yields a second group of qualified respondents, and so on.
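The referral process can be sketched as a breadth-first walk over a referral network, assuming each respondent names further qualified respondents. The respondent labels, the `referrals` mapping, and the function name `snowball_sample` are hypothetical.

```python
from collections import deque

def snowball_sample(seeds, referrals, max_size):
    """Start from the initial respondents and follow referrals until max_size is reached."""
    sample = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for referred in referrals.get(person, []):
            if referred not in seen:         # never interview the same person twice
                seen.add(referred)
                queue.append(referred)
    return sample

# Hypothetical referral network: each respondent names others with the rare characteristic.
referrals = {
    "A": ["C", "D"],
    "B": ["D", "E"],
    "C": ["F"],
    "E": ["A", "G"],
}
sample = snowball_sample(["A", "B"], referrals, max_size=6)
print(sample)  # prints ['A', 'B', 'C', 'D', 'E', 'F']
```

Everyone reached after the initial respondents is connected to them through referrals, which illustrates why the resulting sample may not be a good cross section of the population.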
Nature of Non-Random Sampling
The nature of non-probability sampling can be described as follows:
1) Unknown Probability of Selection: In the case of non-probability sampling, the probability of selection
of each sampling unit is not known. It implies that non-probability samples cannot depend upon the
rationale of the probability theory and hence we cannot estimate population parameters from sample
statistics. Further, in the case of non-probability samples, we do not have a rational way to prove/know
whether the selected sample is representative of the population.
2) Applied in Social Research: In general, researchers prefer probabilistic sampling methods over
non-probabilistic ones, but in applied social research, due to constraints such as time and cost and the objectives of
the research study there are circumstances when it is not feasible to adopt a random process of selection and
in those circumstances usually non-probabilistic sampling is adopted.
3) Subjective Judgement: A core characteristic of non-probability sampling techniques is that samples are
selected based on the subjective judgement of the researcher, rather than random selection (i.e.,
probabilistic methods), which is the cornerstone of probability sampling techniques. Whilst some
researchers may view non-probability sampling techniques as inferior to probability sampling techniques,
there are strong theoretical and practical reasons for their use.
4) Easier, Quicker and Cheaper Method: Non-probability sampling is often used because the procedures
used to select units for inclusion in a sample are much easier, quicker and cheaper when compared with
probability sampling. This is especially the case for convenience sampling. For students doing dissertations
at the undergraduate and masters level, such practicalities often lead to the use of non-probability sampling
techniques.
Advantages of Non-Random Sampling
1) True Universe Picture: Relevant sections of the universe may be selected in the proportions they appear in
the universe.
2) Economical: Geographical concentration can be achieved thus reducing costs.
3) Quick: Useful and quick method in certain circumstances.
4) Specific Case Types: It might be the only method available, for example when sampling illegal drug users.
5) Specific Members of Population: It is useful if researchers are truly interested in particular members of a
population, not the entire population.
6) Pilot Study: It suits exploratory research that attempts to determine whether a problem exists or not, such
as a pilot study.
Disadvantages of Non-Random Sampling
1) Details Needed: Detailed initial information of the universe is needed.
2) Errors: Errors in sample selection can easily occur.
3) Subjective Nature: The subjectivity of non-probability sampling prevents making inferences to the entire
population.
4) Selection Bias: Validity and credibility are questionable due to selection bias.
5) Reliability: The reliability of the resulting estimates cannot be evaluated, which results in the user not
knowing how much confidence can be placed in any interpretations of the survey findings.
4.7.3.
Random Versus Non-Random Sampling Techniques
The two approaches are compared below on each basis.
1) Control
Random Sampling: Sampling error can be controlled.
Non-Random Sampling: Sampling error cannot be controlled.
2) Chances of Selection Bias
Random Sampling: The selection process depends on the specific technique and is, therefore, not influenced by
the expertise of the researcher.
Non-Random Sampling: Selection bias can be very high.
3) Economy
Random Sampling: Time and costs involved may be high.
Non-Random Sampling: Usually a low-cost, quicker alternative.
4) Reliability
Random Sampling: It is possible to test the hypotheses through formal, rigorous tests of significance and, thus,
obtain more reliable results.
Non-Random Sampling: Parametric tests of significance not applicable; the reliability of results is, therefore,
not very high.
5) Suitability
Random Sampling: More reliable and representative if the population is heterogeneous.
Non-Random Sampling: May be more useful in a homogeneous population.
6) Usefulness
Random Sampling: Preferable if complex, detailed estimates of parameters are required.
Non-Random Sampling: Reasonably useful if parameters to be estimated are at broad, aggregated levels,
such as market shares or total sales.
7) Degree of Accuracy
Random Sampling: Accuracy may be poor if the population is high.
Non-Random Sampling: Accuracy in such situations is quite scattered.
8) Sampling Frame
Random Sampling: Formal sampling frames required.
Non-Random Sampling: Can be effective even in the absence of an elaborate sampling frame.
9) Convenience
Random Sampling: May be very inconvenient if the geographical spread of the population is high.
Non-Random Sampling: More convenient, less time-consuming, cheaper and likely to have lower
non-sampling errors.
4.8.
Need and Importance of Sampling
The importance of sampling is as follows:

1) Saves Time, Money and Effort: The researcher can save time, money and effort because the subjects
involved are small in number, so less time is needed to calculate, tabulate, present, analyse, and interpret
the data.
2) More Effective: As the size of sample is less than that of population, fatigue in collecting the information is
reduced and therefore more effective work is done by the investigators.
3) Faster and Cheaper: Since the sample is small, the collection, tabulation, presentation, analysis, and
interpretation of data are rapid and less expense is involved.
4) More Accurate: Fewer errors are made because fewer data are involved in collection, tabulation,
presentation, analysis and interpretation.
5) Gives More Comprehensive Information: A small sample results in a more thorough investigation of the
study, thus, giving more comprehensive information because all the members of the population have been
given an equal chance of being included in the sample.
5. EXERCISE
To be added later.
