Mca4020 SLM Unit 11

Probability and Statistics Unit 11
Unit 11 Sampling Theory

Structure:
11.1 Introduction
Objectives
11.2 Population and Sample
Universe or Population
Types of Population
Sample
11.3 Advantages of Sampling
11.4 Sampling Theory
Law of Statistical Regularity
Principle of Inertia of Large Numbers
Principle of Persistence of Small Numbers
Principle of Validity
Principle of Optimisation
11.5 Terms Used in Sampling Theory
11.6 Errors in Statistics
Measures of Statistical Errors
11.7 Types of Sampling
Probability Sampling
Non-Probability Sampling
11.8 Determination of Sample Size
11.9 Central Limit Theorem
11.10 Summary
11.11 Terminal Questions
11.12 Answers
11.1 Introduction
Although sampling is gaining acceptance in a variety of fields, some
people who are accustomed to complete enumeration remain suspicious of
sampling results. Their suspicion may have arisen because, in the past,
statistical methods based on sampling were used carelessly or misused.
Sampling methods are now far more advanced than they were in the past.
Nowadays, except in the census, where complete enumeration is made,
in almost all other situations sampling methods are very
Sikkim Manipal University Page No.: 337
frequently used. In fact, sampling is a valuable tool for obtaining data
quickly, accurately and, above all, cheaply.
In different fields of human activity, the decision-making process is based on
observations of a few units which form a portion of the total population.
The process of studying only a portion of the population and making
decisions involves risk, the risk of making wrong decisions. This unit deals
with the various techniques of drawing samples from the population.
When the sampling design is not done properly, the estimates or inferences
drawn from the sample can go wrong, and managerial decisions based
on the wrong conclusions may lead to loss of time, money and human
resources. This may badly affect the reputation of the organisation. Hence,
the risks involved in using an incorrect sampling design are of primary
concern to investigators.
Objectives:
At the end of this unit, the student should be able to:
Describe the basic concept of sampling theory and types of sampling
Explain statistical errors and their measurement
Determine the sample size
11.2 Population and Sample

11.2.1 Universe or Population
Statistical surveys or enquiries deal with studying various characteristics of
units belonging to a group. The group consisting of all the units is called
the Universe or Population.
Example: In the statistical survey aimed at determining average per capita
income of the people in the city, all earning individuals in the city form the
population.
11.2.2 Types of Population
Figure 11.1 displays the types of population along with their explanations.

Finite Population: A population with a finite number of units
Infinite Population: A population with an infinite number of units
Existent Population: A population of concrete objects, like books in the library
Hypothetical Population: Throwing a coin an infinite number of times
Fig. 11.1: Types of population
Note: Although many populations appear to be exceedingly large, no truly
infinite population of physical objects actually exists. Given limited resources
and time, it is practically not possible to count, for example, the number of
grains of sand on a beach. Such populations are treated as infinite populations
for our study.
11.2.3 Sample
Sample is a finite subset of a population. A sample is drawn from a
population to estimate the characteristics of the population. Sampling is a
tool which enables us to draw conclusions about the characteristics of the
population. The figure 11.2 illustrates the population and sample.
Fig. 11.2: Illustration of population and sample (figure labels: Population, Sample)

11.3 Advantages of Sampling

The main advantages of the sampling as compared to complete
enumeration are:
1. Reduced cost of the survey.
2. Greater Accuracy or precision of results
3. Saving of time.
4. Adaptability.
The cost of a survey is much lower if sampling methods are adopted rather
than 100% enumeration, because the cost per unit remains about the same
while the number of units in a sample is much smaller than the number of
units in the population.
Sampling results can also be more accurate because sources of error such as
the training of field workers, clarity of instructions, mistakes in locating
units, measurement and recording, and personal biases can be
controlled more effectively by persons of higher calibre.
Sampling results are obtained more quickly since only a part of the
population is studied.
The scope is enlarged, since the personnel are highly trained and
specialised equipment is provided for sample surveys.
In certain cases where complete enumeration is not possible, sampling
methods are the only scientific way of fact finding. For example, if we wish to
estimate the proportion of heads in a coin-tossing experiment, it is not
possible to have a complete count of this hypothetical population; but we can
toss a coin a finite number of times and estimate the proportion of heads.
Similarly, there are situations where destructive tests are involved and we
have no way other than sampling to collect information regarding the
population.
For example, if a chemical salt is to be analysed to study its radicals,
only a part of it is sufficient to tell us about its radicals. However, if one does
not trust the results given by a part of the salt, one may carry on
applying tests until the whole of it is used up; by the time one is certain about
the constituent parts of the salt, the salt itself is finished. Similarly, a drop of
blood from the human body will tell the same story as the whole of the
blood. Here again, if one is not satisfied by the results from a single drop,
one may experiment with more and more blood, causing the patient to die.
Likewise, to test the life of electric bulbs, a manufacturer will surely not burn out
all the bulbs.
In a properly designed sample survey it is also possible to make a valid
estimate of error and so determine beforehand the accuracy or precision and
reliability of the sampling results. But in a complete census it is not possible
to know the margin of uncertainty about the conclusions drawn.
Disadvantage
In spite of the above advantages, a sample survey is not always preferred to
complete enumeration. For example, if information is required about each unit
of the population, no sampling method can give the desired
information. Only a complete census can do this. In fact, the details of the
universe are sacrificed through sampling.
11.4 Sampling Theory

The sampling theory is based on the following five important laws. The
figure 11.3 shows the five important laws of sampling theory
Law of statistical regularity
Principle of inertia of large numbers
Principle of persistence of small numbers
Principle of validity
Principle of optimisation
Fig. 11.3: Laws of sampling

11.4.1 Law of Statistical Regularity

The law of statistical regularity states that a group of units chosen at random
from a large group tends to possess the characteristics of that large group.
Suppose a particular characteristic of the population has a particular shape;
then the same characteristic will tend to follow the same shape in the sample.
11.4.2 Principle of Inertia of Large Numbers
This principle states that, other things being equal, as the sample size
increases, the results tend to be more reliable and accurate. Suppose that
the population mean is 25 units. If a sample of size 50 results in an average of
24.5 units, then a larger sample of size 100 may result in, say, 24.8 units. In other
words, the larger the sample size, the more accurate the result tends to be.
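The principle can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population is assumed to be normal with mean 25 and standard deviation 5 (the standard deviation, the sample sizes and the seed are hypothetical choices, not taken from the text).

```python
import random
import statistics


def sample_mean_errors(pop_mean=25.0, pop_sd=5.0,
                       sizes=(50, 100, 1000, 100000), seed=1):
    """Return {n: |sample mean - population mean|} for simulated samples.

    Assumption: the population is normal(pop_mean, pop_sd); this is a
    hypothetical choice made only for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    errors = {}
    for n in sizes:
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        errors[n] = abs(statistics.fmean(sample) - pop_mean)
    return errors


errors = sample_mean_errors()
for n, err in errors.items():
    print(f"n = {n:>6}: |sample mean - 25| = {err:.4f}")
```

In any single run the error need not shrink at every step, but it tends to become small as n grows, which is what the principle asserts.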
11.4.3 Principle of Persistence of Small Numbers
If some of the units in a population possess markedly distinct
characteristics, then this will be reflected in the sample values also. For
example, if there are 300 blind persons in a population of 10,000 persons,
then a sample of one hundred will have more or less the same proportion of blind
persons in it, that is, about 3.
11.4.4 Principle of Validity
A sampling design is said to be valid if it enables us to obtain valid tests and
estimates of population parameters.
11.4.5 Principle of Optimisation
This principle aims at obtaining a desired level of efficiency at minimum cost
or obtaining maximum possible efficiency with given level of cost.
11.5 Terms Used in Sampling Theory

Parameter: Any statistical measure, such as the mean or median, calculated from
population values is known as a parameter of the population and is denoted by a
Greek letter ($\mu$, $\sigma$ and so on).
Statistic: Any statistical measure calculated from the sample is known as a
statistic and is denoted by an English letter ($\bar{x}$, $s$ and so on). A
statistic is the sample counterpart of a population parameter.

Sampling distribution: The sampling distribution consists of all the possible
values of a statistic and their respective probabilities for a given sample
size.
Key Statistic
The standard deviation of the sampling distribution of any statistic is called
the standard error of that statistic. It is denoted by S and is given by:
$$S^2 = \frac{\sum fX^2}{\sum f} - \left(\frac{\sum fX}{\sum f}\right)^2$$
where f is the frequency and X is the value of the statistic (here, the sample mean).

Example: Consider the selection of two numbers from the given five
numbers (1, 2, 3, 4, 5). Find the possible combinations and their mean.
Solution: The possible combinations and their averages are shown in the
table below.
Possible combinations of the given 5 numbers and their averages
Combination   Numbers Selected   Average
1             1, 2               1.5
2             1, 3               2.0
3             1, 4               2.5
4             1, 5               3.0
5             2, 3               2.5
6             2, 4               3.0
7             2, 5               3.5
8             3, 4               3.5
9             3, 5               4.0
10            4, 5               4.5
This gives the means of samples of size 2. We form a distribution of sample
means, which is represented in the table below.

Frequency table
X (Mean)   f (Frequency)   fX     fX^2
1.5        1               1.5    2.25
2.0        1               2.0    4.00
2.5        2               5.0    12.50
3.0        2               6.0    18.00
3.5        2               7.0    24.50
4.0        1               4.0    16.00
4.5        1               4.5    20.25
Total      N = 10          30.0   97.50
Mean of the sampling distribution = $\sum fX / N$ = 30/10 = 3

Mean of the population is (1 + 2 + 3 + 4 + 5) / 5 = 3
The above table represents the sampling distribution of means. We
observe that the mean of the sample means is equal to the population mean.
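Note: The standard error, that is, the standard deviation of the sample means
in the above example, is given by:
$$S^2 = \frac{\sum fX^2}{\sum f} - \left(\frac{\sum fX}{\sum f}\right)^2 = \frac{97.50}{10} - (3)^2 = 0.75$$
$$S = \sqrt{0.75} = 0.866$$
Hence, the standard error of the mean S is 0.866.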
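The worked example can be checked with a few lines of code. This is a minimal sketch that enumerates every sample of size 2 from (1, 2, 3, 4, 5) and computes the mean and standard deviation of the resulting sample means; the variable names are illustrative.

```python
from itertools import combinations
from math import sqrt

population = [1, 2, 3, 4, 5]

# All 10 samples of size 2 (order ignored, no repetition) and their means.
sample_means = [sum(pair) / 2 for pair in combinations(population, 2)]

# Mean of the sampling distribution.
mean_of_means = sum(sample_means) / len(sample_means)

# S^2 = (mean of squared values) - (mean)^2, as in the formula above.
mean_of_squares = sum(m * m for m in sample_means) / len(sample_means)
variance = mean_of_squares - mean_of_means ** 2
standard_error = sqrt(variance)

print(len(sample_means), mean_of_means, round(variance, 4), round(standard_error, 3))
# prints: 10 3.0 0.75 0.866
```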

Uses of standard error
Standard error helps us in:
i) Testing of hypothesis
ii) Constructing confidence intervals from the statistic
iii) Providing a measure of reliability of the statistic through its reciprocal

11.6 Errors in Statistics

The term error denotes the difference between the true value of a population
parameter and its estimate provided by a sampling technique. Therefore, the
term error in statistics is not used in its ordinary sense. There are four
types of errors, as shown in figure 11.4.
Fig. 11.4. Errors in Statistics

Let us understand about each of the error types and the factors causing
those errors.
Sampling errors
The sample results are bound to differ from the population results, since a sample
is only a small portion of the population. This error is also known as inherent error
and cannot be avoided. It is not worthwhile to try to eliminate it completely.
Sampling errors may be due to the following factors:
Faulty selection of sample
Substitution of units to be studied
Faulty demarcation of sampling units
Error due to bias in estimation
However, sampling errors follow random or chance variations and tend
to cancel each other out on averaging.
Non-sampling errors
Non-sampling errors are attributed to factors that can be controlled and
eliminated by suitable actions. It is worthwhile to eliminate these errors. They are
due to the following factors:
Faulty planning, faulty definitions
Defective methods of interviewing
Personal bias of investigator
Lack of trained and qualified investigators
Respondents' failure to answer
Improper coverage
Compiling errors
Publication errors
Biased errors
These errors arise in both census and sampling methods. They occur due to
personal bias of the investigator and the instruments used for measuring.
They are also due to faulty collection of data, respondents' bias and bias
due to non-response. Biased errors have a tendency to grow with sample
size. Therefore, they are also known as cumulative errors. The magnitude of
biased errors is directly proportional to the sample size.
Unbiased errors
Errors in which over-estimation and under-estimation are roughly equal, so that
they cancel each other out, are known as unbiased errors. They are also known as
compensatory errors. They do not increase with sample size.
11.6.1 Measures of Statistical Errors

Absolute error is the difference between the true value t and the observed
value a. Symbolically, absolute error AE is represented as:
$$AE = t - a$$
It is independent of the magnitude of the actual value.

Relative error is the ratio of the absolute error to the observed value a. It is
symbolically represented as:
$$RE = \frac{AE}{a} = \frac{t - a}{a}$$
It provides a degree of error for comparison purposes between different sets
of data.

Self Assessment Questions

1. State whether the following statements are true T or false F.
i) Population is aggregate of objects under study.
ii) Sampling method consume time and resources.
iii) Any summarised figure from population is known as statistics.
iv) We adopt sampling technique in our activities.
v) Population is a subset of sample.
vi) An unbiased sample gives an accurate prediction of characteristics
of an entire population.
vii) The standard deviation of sampling distribution of a statistic is
known as standard error of that statistic.
viii) Standard error is used as a reliability measure.
ix) Faulty selection of sample contributes to sampling error.
x) Personal bias increases the non-sampling errors.
xi) Unbiased errors are cumulative in nature.
xii) Biased errors are also known as compensatory errors.
11.7 Types of Sampling

By choosing a sampling technique carefully, errors can be minimised. Let us
take a look at the different techniques available. The sampling techniques
may be broadly classified into:
i) Probability Sampling
ii) Non-Probability Sampling
11.7.1 Probability Sampling
Probability sampling provides a scientific technique of drawing samples from
the population. Samples are drawn according to a rule under
which each unit has a predetermined probability of being included in the
sample. The different ways of assigning probability are:
i) each unit is assigned the same chance of being selected
ii) sampling units are assigned varying probabilities depending on
priorities
iii) units are assigned probabilities proportional to their size
We will discuss here some of the important probability sampling designs.

Simple Random Sampling

Under this technique, sample units are drawn in such a way that each and
every unit in the population has an equal and independent chance of being
included in the sample. If a sample unit is replaced before drawing the next
unit, then it is known as Simple Random Sampling With Replacement
[SRSWR]. If the sample unit is not replaced before drawing the next unit,
then it is called Simple Random Sampling Without Replacement [SRSWOR].
In the first case, the probability of drawing a given unit at any draw is 1/N,
where N is the population size. In the second case, the probability of drawing
any particular sample of n units is $1/\binom{N}{n}$.
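The difference between the two schemes can be sketched with Python's standard library. The population of 10 numbered units, the sample size of 4 and the seed below are hypothetical values chosen only for illustration.

```python
import random
from math import comb

rng = random.Random(42)            # fixed seed so the run is repeatable
population = list(range(1, 11))    # hypothetical population, N = 10
n = 4                              # hypothetical sample size

# SRSWR: each draw is made from the full population, so a unit may repeat.
with_replacement = rng.choices(population, k=n)

# SRSWOR: a drawn unit is not returned, so all sampled units are distinct.
without_replacement = rng.sample(population, n)

# Probability of a given unit at a single draw, and of one particular
# unordered sample of n units under SRSWOR.
prob_unit_per_draw = 1 / len(population)
prob_particular_sample = 1 / comb(len(population), n)

print(with_replacement, without_replacement)
print(prob_unit_per_draw, prob_particular_sample)
```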
A simple random sample can be selected by:
Lottery method: In the lottery method, we identify each and every unit with
a distinct number by allotting it an identical card. The cards are put in a
drum and thoroughly shuffled before each unit is drawn.
The use of a table of random numbers: There are several random number
tables. They are Tippett's random number table, Fisher and Yates'
tables, Kendall and Babington Smith's random tables, Rand Corporation
random numbers and so on. The table below shows a specimen
of Tippett's random numbers.
Tippetts random number table
2952 6641 3992 9792 7979 5911 3170 5624
4167 9524 1545 1396 7203 5356 1300 2693
2370 7483 3408 2762 3563 1089 6913 7691
0560 5246 1112 6107 6008 8126 4233 8776
2754 9143 1405 9025 7002 6111 8816 6446
Suppose we want to select 10 units from a population of size 100. We
number the population units from 00 to 99. Then we read off the digits two at a time.
Suppose we start with 41 (second row); then the other numbers selected will
be 67, 95, 24, 15, 45, 13, 96, 72, 03.
Example: Suppose we want to select 10 students out of 30 students in a class. We
number the students from 00 to 29. Then, from the random number table,
we choose two-digit numbers. In the table above we start from the third row. The
first number selected is 23, which lies between 00 and 29. So the student
numbered 23 is selected as the first unit of the sample. The second number is 70,
but as it is greater than 29 we cannot choose that number. Continuing in this way,
the numbers selected are 23, 08, 27, 10, 13, 05, 11, 12, 07, 08. Since 08 occurs
twice, under sampling without replacement the repeat is discarded and the next
admissible number, 26, is taken instead. The corresponding students constitute
the required sample.
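The table-reading procedure can be sketched in code. The sketch below copies the five rows of the specimen table, reads the digits two at a time from a chosen row onward, and keeps the numbers that fall inside the population range. The function names and the `distinct` switch are illustrative assumptions, not part of the original method description.

```python
TIPPETT_ROWS = [
    "2952 6641 3992 9792 7979 5911 3170 5624",
    "4167 9524 1545 1396 7203 5356 1300 2693",
    "2370 7483 3408 2762 3563 1089 6913 7691",
    "0560 5246 1112 6107 6008 8126 4233 8776",
    "2754 9143 1405 9025 7002 6111 8816 6446",
]


def two_digit_numbers(rows, start_row):
    """Read the table from row index start_row onward, two digits at a time."""
    digits = "".join(row.replace(" ", "") for row in rows[start_row:])
    return [int(digits[i:i + 2]) for i in range(0, len(digits) - 1, 2)]


def select_units(numbers, population_size, k, distinct=True):
    """Keep the first k numbers below population_size.

    distinct=True skips repeats (sampling without replacement).
    """
    chosen = []
    for number in numbers:
        if number >= population_size:
            continue                      # outside 00 .. population_size - 1
        if distinct and number in chosen:
            continue                      # already selected
        chosen.append(number)
        if len(chosen) == k:
            break
    return chosen


# 10 units out of 100, starting at the second row (index 1).
units = select_units(two_digit_numbers(TIPPETT_ROWS, 1), 100, 10)
# 10 students out of 30, starting at the third row (index 2).
students = select_units(two_digit_numbers(TIPPETT_ROWS, 2), 30, 10)
print(units)
print(students)
```

With `distinct=False` the second call keeps the repeated 08 instead of moving on to 26.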
Stratified Random Sampling: This sampling design is most appropriate if
the population is heterogeneous with respect to the characteristic under study or
the population distribution is highly skewed.
We subdivide the population into several groups or strata such that:
i) Units within each stratum are more homogeneous
ii) Units between strata are heterogeneous
iii) Strata do not overlap; in other words, every unit of the population belongs
to one and only one stratum
The criteria used for stratification are geographical, sociological, age, sex,
income and so on. The population of size N is divided into k relatively
homogeneous strata of sizes $N_1, N_2, \ldots, N_k$ such that
$N_1 + N_2 + \cdots + N_k = N$. Then, we draw a simple random sample from each
stratum, either proportional to the size of the stratum or with equal units from each
stratum.
Merits and demerits of stratified random sampling
Merits
1. Sample is more representative
2. Provides more efficient estimates
3. Administratively more convenient
4. Can be applied in situations where different degrees of accuracy are
desired for different segments of the population
Demerits
1. Many times the stratification is not effective
2. Appropriate sample sizes are not drawn from each stratum
Example 2
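The numbers of items produced by factories located at three cities X, Y and Z are
200, 300 and 500 respectively. We wish to draw a sample of 20 items under
proportional stratified sampling. We number the units from 000 to 999, so that,
consistent with the units selected below, units 000 to 199 belong to X, 200 to 499
to Y and 500 to 999 to Z. Then we refer to the random number table and select
three-digit numbers as represented in the table below.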
Stratified random sampling
27717 43584 85192 88977 29490 69714 94015 62874

32444 48277 13025 14338 54066 15423 47724 66733
74108 82228 888570 74015 80217 36292 98525 24335
24432 24896 62880
The numbers of sample units to be selected are:
For Factory X: $\frac{200}{1000} \times 20 = 4$
For Factory Y: $\frac{300}{1000} \times 20 = 6$
For Factory Z: $\frac{500}{1000} \times 20 = 10$
Total = 20
For first factory sample units selected are 174, 192, 069, 156.
For second factory sample units selected are 287, 432, 444, 482, 302, 254.
For third factory sample units selected are 854, 772, 733, 741, 822, 853,
570, 802, 629, 525.
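Proportional allocation is easy to express in code. The following is a minimal sketch: it allocates the sample in proportion to stratum size and then draws a simple random sample without replacement inside each stratum. The largest-remainder rule for non-integer shares, the seed, and the use of a pseudo-random generator in place of the printed random number table are assumptions made for illustration.

```python
import random


def proportional_allocation(stratum_sizes, n):
    """Allocate n sample units to strata in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    exact = {name: n * size / total for name, size in stratum_sizes.items()}
    allocation = {name: int(share) for name, share in exact.items()}
    # Assumed rule: leftover units go to the largest fractional parts.
    leftover = n - sum(allocation.values())
    by_fraction = sorted(exact, key=lambda name: exact[name] - allocation[name],
                         reverse=True)
    for name in by_fraction[:leftover]:
        allocation[name] += 1
    return allocation


def stratified_sample(stratum_sizes, n, seed=0):
    """Draw an SRSWOR from each stratum; units are numbered consecutively."""
    rng = random.Random(seed)
    allocation = proportional_allocation(stratum_sizes, n)
    sample, start = {}, 0
    for name, size in stratum_sizes.items():
        units = range(start, start + size)   # e.g. X: 0-199, Y: 200-499
        sample[name] = sorted(rng.sample(units, allocation[name]))
        start += size
    return sample


factories = {"X": 200, "Y": 300, "Z": 500}
allocation = proportional_allocation(factories, 20)
sample = stratified_sample(factories, 20)
print(allocation)   # {'X': 4, 'Y': 6, 'Z': 10}
print(sample)
```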
Systematic Sampling
This design is recommended if we have a complete list of sampling units
arranged in some systematic order such as geographical, chronological or
alphabetical order.
Suppose the population size is N. The population units are serially
numbered 1 to N in some systematic order and we wish to draw a sample
of n units. Then we divide the units from 1 to N into n groups such that each
group has K units.
This implies nK = N or K = N/n. From the first group, we select a unit at
random. Suppose the unit selected is the 6th unit; thereafter we select every
Kth unit, that is, units 6 + K, 6 + 2K and so on. If K is 20, n is 5 and N is 100,
then the units selected are 6, 26, 46, 66, 86.

Merits and demerits of systematic sampling
Merits
1. Very easy to operate and easy to check.
2. It saves time and labour.
3. More efficient than simple random sampling if we have an up-to-date
frame.
Demerits
1. In many cases we do not get an up-to-date list.
2. It gives biased results if a periodic feature exists in the data.
Example: Suppose there are 100 units in a population, serially numbered from 1 to
100, and we want to draw a sample of 5 units. Then we have
k = N/n = 100/5 = 20. Let the first number selected randomly be 8. Then the
selected units, by serial number, are 8, 28, 48, 68 and 88.
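The selection rule can be sketched as a short function. This is an illustrative sketch that assumes N is an exact multiple of n; the function name and the optional random start are hypothetical conveniences.

```python
import random


def systematic_sample(N, n, start=None, seed=None):
    """Return the serial numbers of a systematic sample of n units from 1..N.

    Assumes N is an exact multiple of n, so the interval k = N / n is whole.
    """
    if N % n != 0:
        raise ValueError("this sketch assumes N is a multiple of n")
    k = N // n                                  # sampling interval
    if start is None:
        start = random.Random(seed).randint(1, k)   # random start in first group
    if not 1 <= start <= k:
        raise ValueError("start must lie in the first group, 1..k")
    return [start + i * k for i in range(n)]


print(systematic_sample(100, 5, start=8))   # [8, 28, 48, 68, 88]
print(systematic_sample(100, 5, start=6))   # [6, 26, 46, 66, 86]
```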
Cluster Sampling
The total population is divided into recognisable sub-divisions, known as
clusters, such that within each cluster units are more heterogeneous and
between clusters they are homogeneous. A sample of clusters is then selected
by a suitable sampling technique, and the units in the selected clusters are
studied. Figure 11.5 represents
cluster sampling, where each packet of candy forms a cluster.
Fig. 11.5: Cluster sampling

Multi-stage Sampling
The sampling process is carried out in several stages, with the population
divided into successively smaller units at each stage. It is represented in figure 11.6.
Fig. 11.6: Multistage sampling

Example: We want to select 1000 colleges from the southern states. In the first
stage we may select any three states. In the second stage we may select
some districts in each of those states. In the third stage, we may select the colleges in
each selected district. We may adopt any sampling technique at each stage.
Merits and demerits of multi-stage sampling
Merits
1. Greater flexibility in sampling method
2. Existing divisions can be used
Demerits
1. Estimates are less accurate
2. Investigator should have knowledge of the entire population that will be
sampled
11.7.2 Non-probability Sampling

Depending upon the object of enquiry and other considerations a
predetermined number of sample units is selected purposely so that they
represent the true characteristics of the population.
A serious drawback of this sampling design is that it is highly subjective in
nature. The selection of sample units depends entirely upon the personal
convenience, biases, prejudices and beliefs of the investigator. This method
will be more successful if the investigator is thoroughly skilled and
experienced.
Judgment Sampling
The choice of sample items depends exclusively on the judgment of the
investigator. The investigator's experience and knowledge about the
population will help to select the sample units. It is the most suitable method
if the population size is small. The table below displays the merits and
demerits of judgment sampling.
Merits and demerits of judgment sampling
Merits
1. Most useful for small populations.
2. Most useful to study some unknown traits of a population, some of whose
characteristics are known.
3. Helpful in solving day-to-day problems.
Demerits
1. It is not a scientific method.
2. It has a risk of the investigator's bias being introduced.

Convenience Sampling
The sample units are selected according to the convenience of the investigator.
Such a sample is also called a chunk, which refers to the fraction of the population
being investigated that is selected neither by probability nor by judgment.
Moreover, a list or framework should be available for the selection of the
sample. It is used to make pilot studies. However, there is a high chance of
bias being introduced.
Quota Sampling
It is a type of judgment sampling. Under this design, quotas are set up
according to some specified characteristic such as age groups or income
groups. From each group a specified number of units are sampled
according to the quota allotted to the group. Within the group the selection
of sample units depends on personal judgment. It has a risk of personal
prejudice and bias entering the process. This method is often used in public
opinion studies.
11.8 Determination of Sample Size

Sample size depends upon the size of the population, the resources
available, the degree of accuracy desired, the homogeneity of the population,
the nature of the study, the method of sampling used and the nature of the
respondents. The following formulae are available to determine the sample size
when the study is concerned with a population proportion or a population mean.
Note:
1. The formula used for calculating the sample size when the research is
concerned with a population proportion and a finite population is given by:
$$Z = \frac{P - P_s}{\sqrt{\dfrac{N - n}{N - 1} \cdot \dfrac{PQ}{n}}} \quad \text{(for finite population)}$$
$$n = \frac{Z^2 PQN}{e^2 (N - 1) + Z^2 PQ} \quad \text{(in case of finite population)}$$
where,
N = Population size
Z = Value corresponding to the degree of confidence desired
P = Population proportion
Ps = Sample proportion, so that e = |P - Ps| is the error we admit in the
result
Q = 1 - P
n = Sample size.
2. The formula used for calculating the sample size when the research is
concerned with a population proportion and an infinite population is given
by:
$$Z = \frac{P - P_s}{\sqrt{PQ / n}} \quad \text{(for infinite population)}$$
$$n = \frac{Z^2 PQ}{e^2}$$
where,
Z = Value corresponding to the degree of confidence desired
P = Population proportion
Ps = Sample proportion, so that e = |P - Ps| is the error we admit in the
result
Q = 1 - P
n = Sample size.
3. The formula used for calculating the sample size for an infinite population,
when the population mean and sample mean are given, is:
$$Z = \frac{\mu - \mu_s}{\sigma / \sqrt{n}} \quad \text{(for infinite population)}$$
$$n = \frac{Z^2 \sigma^2}{e^2}$$
where,
$\mu$ = Population mean
$\mu_s$ = Sample mean
$e = |\mu - \mu_s|$ is the error we admit between the true value of the
parameter and the statistic (estimated value)
$\sigma$ = Standard deviation of population
n = Sample size
Example: The mean expenditure per customer at a tire store is Rs
85.00, with a standard deviation of Rs 9.00. If the mean expenditure of the
sample is Rs 87, what is the required sample size? (The Z-value is 1.41.)
Solution: Given $\mu = 85$, $\mu_s = 87$, $\sigma = 9$ and $Z = 1.41$, so that
$e = |\mu - \mu_s| = |85 - 87| = 2$.
Then we have,
$$n = \frac{Z^2 \sigma^2}{e^2} = \frac{(1.41)^2 \times 9^2}{2^2} = \frac{1.9881 \times 81}{4} = 40.26 \approx 40$$
Hence the required sample size is 40.
Note: The formula used for calculating the sample size for a finite population,
when the population mean and sample mean are given, is:
$$Z = \frac{\mu - \mu_s}{\dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}} \quad \text{(for finite population)}$$
$$n = \frac{Z^2 \sigma^2 N}{(N - 1) e^2 + Z^2 \sigma^2}$$
where,
$\mu$ = Population mean
$\mu_s$ = Sample mean
$e = |\mu - \mu_s|$ is the error we admit between the true value of the
parameter and the statistic (estimated value)
$\sigma$ = Standard deviation of population
n = Sample size
N = Size of population
Example: A production company's 350 hourly employees average 37.6
years of age, with a standard deviation of 8.3. If the sample average is 40
years of age and the Z-value is 2.07, calculate the required sample size.
Solution: Given $N = 350$, $\mu = 37.6$, $\mu_s = 40$, $\sigma = 8.3$ and $Z = 2.07$,
so that $e = |\mu - \mu_s| = |37.6 - 40| = 2.4$.
Then the sample size is given by,
$$n = \frac{Z^2 \sigma^2 N}{(N - 1) e^2 + Z^2 \sigma^2} = \frac{(2.07)^2 \times (8.3)^2 \times 350}{(350 - 1) \times (2.4)^2 + (2.07)^2 \times (8.3)^2} = \frac{103315.4}{2305.4} = 44.8 \approx 45$$
Hence the required sample size is 45.
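Note: The formula used for calculating the sample size, when the standard
error of the mean is given, is:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}, \quad \text{so that} \quad n = \left(\frac{\sigma}{\sigma_{\bar{x}}}\right)^2$$
where,
$\sigma_{\bar{x}}$ = Standard error of the mean (standard deviation of the sample means)
$\sigma$ = Population standard deviation
n = Sample size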
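The sample size formulae of this section can be collected into a few small functions. This is a sketch for checking the arithmetic, not a prescribed procedure: the function names are hypothetical, and each function returns the unrounded value so that the rounding convention stays with the user (the worked examples above round to the nearest whole number).

```python
def n_proportion_infinite(z, p, e):
    """n = Z^2 P Q / e^2 for an infinite population."""
    return z ** 2 * p * (1 - p) / e ** 2


def n_proportion_finite(z, p, e, N):
    """n = Z^2 P Q N / (e^2 (N - 1) + Z^2 P Q) for a finite population."""
    pq = p * (1 - p)
    return z ** 2 * pq * N / (e ** 2 * (N - 1) + z ** 2 * pq)


def n_mean_infinite(z, sigma, e):
    """n = Z^2 sigma^2 / e^2 for an infinite population."""
    return z ** 2 * sigma ** 2 / e ** 2


def n_mean_finite(z, sigma, e, N):
    """n = Z^2 sigma^2 N / ((N - 1) e^2 + Z^2 sigma^2) for a finite population."""
    return z ** 2 * sigma ** 2 * N / ((N - 1) * e ** 2 + z ** 2 * sigma ** 2)


def n_from_standard_error(sigma, standard_error):
    """n = (sigma / standard error)^2, from standard error = sigma / sqrt(n)."""
    return (sigma / standard_error) ** 2


# The two worked examples of this section.
tire_store = n_mean_infinite(1.41, 9, 2)           # about 40.26
employees = n_mean_finite(2.07, 8.3, 2.4, 350)     # about 44.8
print(round(tire_store), round(employees))
```

As N grows, the finite-population results approach the infinite-population ones, since the correction factor (N - n)/(N - 1) tends to 1.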
11.9 Central Limit Theorem

If $X_1, X_2, \ldots, X_n$ is a random sample of size n from any population with
mean $\mu$ and finite variance $\sigma^2$, then the sample mean ($\bar{X}$) is
approximately normally distributed with mean $\mu$ and variance $\sigma^2 / n$,
provided n is sufficiently large.
From the central limit theorem, we infer the following.
i) The mean of the sampling distribution will be equal to the population
mean
ii) The sampling distribution of the mean approaches the normal distribution
as the sample size increases
iii) It permits us to use sample statistics to make inferences about
population parameters irrespective of the shape of the frequency
distribution of the population.
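These inferences can be illustrated by simulation. The sketch below assumes a deliberately skewed population, the exponential distribution with mean 1 and variance 1; the sample size of 40, the 5000 repetitions and the seed are hypothetical choices made only for illustration.

```python
import random
import statistics


def simulate_sample_means(n=40, repetitions=5000, seed=7):
    """Return the means of `repetitions` samples of size n.

    Assumption: the population is exponential with rate 1
    (mean 1, variance 1), which is far from normal in shape.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(repetitions):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


means = simulate_sample_means()
mean_of_means = statistics.fmean(means)          # should be close to mu = 1
variance_of_means = statistics.pvariance(means)  # should be close to 1/40

print(f"mean of sample means     = {mean_of_means:.4f}")
print(f"variance of sample means = {variance_of_means:.4f} (sigma^2/n = {1 / 40:.4f})")
```

A histogram of `means` would look roughly bell-shaped even though the population itself is strongly skewed.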
2. State whether the following statements are true T or false F.
i) Sample in which units are selected by judgment is known as
probability sample.
ii) Judgment sampling does not give representativeness of a sample.
iii) Large sample size always results in minimising the standard error.
iv) A sampling plan that divides the population into well-defined
groups from which random samples are drawn is known as cluster
sampling.
v) The principles of simple random sampling are the theoretical basis
for statistical inference.
vi) If the mean of a certain population is 20, it is likely that most of the
sample means will be 20.
vii) Any sampling distribution can be totally described by its mean and
standard deviation.
viii) Sampling from an infinite population and from a finite population with
replacement results in:
$\sigma_{\bar{x}} = \sigma / \sqrt{n}$
ix) The central limit theorem assures that the sampling distribution of
mean is always normal.
x) Stratified sampling is used when each group considered is more
homogeneous within itself and the groups are heterogeneous between themselves.
11.10 Summary
There are two methods of studying the characteristics of a population: census
and sampling. The various advantages of sampling and the various errors
that could crop up in using these methods were explained.

Mainly, there are two methods of sampling, namely probability sampling and
non-probability sampling. The merits and demerits of each sampling method
were explained. We discussed the procedure for determining sample size.
We concluded the unit with the importance of the central limit theorem.
11.11 Terminal Questions

1. Discuss the errors that arise in statistical survey.
2. Describe simple random sampling.
3. Describe systematic sampling.
4. What is quota sampling and when do we use it?
5. What are the basic principles on which sampling theory is based?
6. Explain the sampling distribution of a statistic and its standard
error.
7. Discuss the uses of standard error.
8. The distribution of employees in three plants of a manufacturing unit is
as shown in table below. Using random numbers discussed under topic
Simple random sampling, draw a random sample of size 15.
Distribution of employees in three manufacturing plants
Plant A B C
Number of employees 100 200 200
9. Population proportion of tea drinkers is 0.6. Determine the sample size

such that the error between actual and observed proportion will be less
than or equal to 0.05 with 95% confidence, (Z = 1.96).
10. The standard error of mean of bursting strength of card boards
produced by a company is 1.5 units. If the population standard deviation
is 50 , find the sample size.
11.12 Answers
1. i- T, ii- F, iii- T, iv- T, v- F, vi- T, vii- T, viii - T, ix- T, x- T, xi- F, xii- F
2. i- F, ii- T, iii- T, iv- F, v- T, vi- F, vii- F, viii - T, ix- T, x- T
T denotes True
F denotes False
Terminal Questions
1. Refer section 11.6
2. Refer section 11.7.1
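9. The sample size is approximately 369, since $n = Z^2 PQ / e^2 = (1.96)^2 (0.6)(0.4) / (0.05)^2 \approx 368.8$.
10. The sample size is approximately 1111, since $n = (\sigma / \sigma_{\bar{x}})^2 = (50 / 1.5)^2 \approx 1111$.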
