Sampling

Submitted by:
HIMANI KALRA
MBA (GENERAL)
35
SUBMITTED TO:
DR. SIMMI
UNIVERSITY SCHOOL OF MGT. ,
K.U.K
Sampling
Concept of sampling
Aims of Sampling
Merits and demerits of sampling
Types of sampling methods
Sampling errors
Sampling Distributions
Probability Distributions
The Central Limit Theorem
Concept of sampling
The method of selecting a sample out of a given population is called sampling.
The sampling method has three main stages:
1. To select a sample.
2. To collect information from it.
3. To make inferences regarding the characteristics of the population.
Aims of Sampling
Reduces the researcher's time and cost (e.g., political polls).
Allows generalization about a larger population (e.g., the benefits of sampling a city neighborhood).
In some cases (e.g., industrial production) analysis may be destructive, so sampling is needed.
Merits and demerits of sampling
Merits:
Saving of time & money
Intensive study
Organizational convenience
More reliable results
More scientific methods
Demerits:
Less accurate
Wrong conclusions
Less reliable
Need of specialized knowledge
Types of sampling methods
(A) Probability sampling methods
(1) Simple random sampling
(2) Stratified random sampling
(3) Systematic random sampling
(4) Cluster sampling
(B) Non-probability sampling methods
(1) Convenience sampling
(2) Quota sampling
(3) Judgment sampling
(4) Snowball sampling
Simple random sampling
Every subset of a specified size n from the population has an equal chance of being selected.
Math
Alliance
Project
1. Get a list or sampling frame.
a. This is the hard part. It must not systematically exclude anyone.
2. Generate random numbers.
3. Select one person per random number.
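The steps above can be sketched in a few lines of Python. This is only an illustration: the sampling frame of 100 labelled people and the function name `simple_random_sample` are hypothetical, and the standard library's `random.sample` stands in for "generate random numbers and pick one person per number".

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units so that every subset of size n is equally likely."""
    rng = random.Random(seed)          # seed only makes the sketch repeatable
    return rng.sample(list(frame), n)  # sampling without replacement

# Hypothetical sampling frame: a complete list of 100 people.
frame = [f"person_{i}" for i in range(1, 101)]
sample = simple_random_sample(frame, 10, seed=42)
print(sample)
```

The quality of the result depends on the frame: if the list leaves some group out, no amount of random selection will bring that group back in.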
Stratified random sampling
The population is divided into two or more groups called strata, according to some criterion, such as geographic location, grade level, age, or income, and subsamples are randomly selected from each stratum.
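A minimal sketch of stratified selection, assuming a hypothetical population of 200 students stratified by grade level and an equal subsample size in every stratum (proportional allocation is another common choice, not shown here). The names `stratified_sample` and `stratum_of` are illustrative only.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Group units into strata, then draw a simple random sample in each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = {}
    for name, members in strata.items():
        k = min(n_per_stratum, len(members))  # never ask for more than exist
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population: 200 students in grades 9 to 12.
people = [{"id": i, "grade": 9 + i % 4} for i in range(200)]
by_grade = stratified_sample(people, lambda p: p["grade"], 5, seed=1)
for grade in sorted(by_grade):
    print(grade, [p["id"] for p in by_grade[grade]])
```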
Systematic random sampling
a. Select a random number, which will be known as k.
b. Get a list of people, or observe a flow of people (e.g., pedestrians on a corner).
c. Select every kth person.
Be careful that there is no systematic rhythm to the flow or list of people. If every 4th person on the list is, say, rich or sincere, or shares some other consistent trait, avoid this method.
Every kth member (for example, every 10th person) is selected from a list of all population members.
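As a rough illustration of taking every kth member, the sketch below assumes a common variant in which the interval k is fixed in advance and only the starting position is chosen at random. The list of 100 numbered members and the function name `systematic_sample` are hypothetical.

```python
import random

def systematic_sample(frame, k, seed=None):
    """Pick a random start among the first k units, then take every kth unit."""
    rng = random.Random(seed)
    start = rng.randrange(k)   # random position 0 .. k-1
    return frame[start::k]     # every kth member from that position onward

# Hypothetical list of all population members, numbered 1 to 100.
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, seed=3)
print(sample)
```

If the list has a cycle whose length matches k, every selected member falls at the same point of the cycle, which is the "systematic rhythm" problem.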
Cluster sampling
The population is divided into subgroups (clusters), like families. A simple random sample is taken of the subgroups, and then all members of the selected clusters are surveyed.
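The two stages described here (randomly pick clusters, then survey everyone inside them) might look like the following sketch. The 20 families and the function name `cluster_sample` are made up for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical clusters: 20 families with 1 to 4 members each.
families = {
    f"family_{i}": [f"f{i}_member_{j}" for j in range(1 + i % 4)]
    for i in range(20)
}
surveyed = cluster_sample(families, 3, seed=7)
for name, members in surveyed.items():
    print(name, members)
```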
Convenience sampling
Selection of whichever individuals are easiest to reach.
It is done at the convenience of the researcher.
Judgment sampling
The selection criteria are based on the personal judgement that the element is representative of the population under study.
This is used primarily when there is a limited number of people who have expertise in the area being researched.
Snowball sampling
The selection of additional respondents is based on referrals from the initial respondents.
Quota sampling
1. Determine what the population looks like in terms of specific qualities.
2. Create quotas based on those qualities.
3. Select people for each quota.
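A small sketch of filling quotas, assuming respondents arrive in some order and are accepted until each group's quota is full. The quota sizes, the `gender` attribute, and the function name `quota_sample` are hypothetical. Note that nothing here is random: whoever arrives first is taken, which is why this counts as a non-probability method.

```python
def quota_sample(stream, group_of, quotas):
    """Accept respondents in arrival order until every quota is filled."""
    remaining = dict(quotas)       # how many more are needed per group
    selected = []
    for person in stream:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            selected.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break                  # all quotas are full
    return selected

# Hypothetical respondents and quotas (3 women, 2 men).
people = [{"id": i, "gender": "F" if i % 2 else "M"} for i in range(50)]
picked = quota_sample(people, lambda p: p["gender"], {"F": 3, "M": 2})
print(picked)
```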
Sampling errors
Sampling errors are those which arise due to the method of sampling. They occur due to:
1. Faulty selection of sampling methods.
2. Substituting one sample for another due to difficulties in collecting the samples.
3. Faulty demarcation of sampling units.
4. Variability of the population, which has different characteristics.
Non-sampling errors are those which creep in due to human factors, which always vary from one investigator to another. These errors arise due to:
1. Faulty planning.
2. Faulty selection of the sample units.
3. Lack of trained staff.
4. Negligence on the part of respondents.
5. Errors in compilation.
6. Errors due to wrong statistical measures.
7. Framing of wrong questionnaires.
8. Incomplete investigation of the sample surveys.
Population - Entire group of items/individuals we want information about.
Sample - The part of the population we actually examine in order to gather information.
A parameter is a number that describes the population. It is fixed, but we don't know its value.
A statistic is a number that describes a sample. Its value is known, but it varies from sample to sample.
We often use statistics to estimate the unknown parameter.
Statistical inference draws conclusions about a population on the basis of data from a sample.
It also provides us with a statement of how much confidence we can place in our conclusions.
We are in many cases interested in the mean value a variable takes in the population.
Individual scores are random draws from a population.
The sample mean is a guess about the true population mean.
But how accurate (or efficient) is the sample mean?
Or, I could say, what is the standard deviation of the sample mean?
I want to estimate the SD of the mean of n observations, i.e., how much the mean is expected to vary from sample to sample.
But I only get to observe one sample.
Imagine that you could draw a sample and calculate a mean or median or SD or whatever statistic, again and again, from a population.
What would the distribution of this statistic look like?
You're conceptualizing a sampling distribution.
What is its expected value and standard deviation?
If you know this, you can answer how likely it is that a sample with a given mean (or median or SD) was drawn from a population with a known mean (or median or SD).
A sampling distribution:
is a distribution of sample statistics (means, medians, etc.)
is a theoretical distribution that describes all possible means, medians, etc., and the probability of obtaining each value.
can be visualized using simulations, but must be imagined when collecting real data.
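One way to visualize a sampling distribution by simulation is sketched below. It assumes a hypothetical normal population with mean 100 and SD 15, draws many samples of size n = 25, and records the mean of each one. The function name and the number of draws are arbitrary choices for the illustration.

```python
import random
import statistics

def simulate_sampling_distribution(n, draws=5000, mu=100.0, sigma=15.0, seed=0):
    """Return the means of `draws` independent samples of size n."""
    rng = random.Random(seed)
    means = []
    for _ in range(draws):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

means = simulate_sampling_distribution(n=25)
# The centre should sit near the population mean, and the spread
# should be much narrower than the spread of individual scores.
print("mean of sample means:", round(statistics.fmean(means), 2))
print("SD of sample means:  ", round(statistics.stdev(means), 2))
```

With real data only one of these 5000 samples is ever observed, which is why the distribution has to be imagined or derived rather than seen.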
1. They are approximately normal.
When data in the population are normally distributed, and even if they are not, assuming large n.
2. They are centered at the mean $\mu$ of the population they are drawn from.
The sample mean is unbiased.
3. Their standard deviation equals the standard deviation of the individual scores divided by the square root of the sample size (the standard error of the mean, SEM):
$$\mathrm{SEM} = \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
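Assume IQ: $\mu = 100$; $\sigma = 15$
Sampling Distribution of Sample Means if n = 25: Normal, $E(\bar{X}) = 100$; $\sigma_{\bar{X}} = \frac{15}{\sqrt{25}} = 3$
Sampling Distribution of Sample Means if n = 100: Normal, $E(\bar{X}) = 100$; $\sigma_{\bar{X}} = \frac{15}{\sqrt{100}} = 1.5$
Sampling Distribution of Sample Means if n = 400: Normal, $E(\bar{X}) = 100$; $\sigma_{\bar{X}} = \frac{15}{\sqrt{400}} = 0.75$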
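z-score of a sample mean of 103 (population mean 100):
n = 25: $z_{103} = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{103 - 100}{3} = 1.00$, p = .1587
n = 100: $z_{103} = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{103 - 100}{1.5} = 2.00$, p = .0228
n = 400: $z_{103} = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{103 - 100}{0.75} = 4.00$, p < .0001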
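The arithmetic of this example can be reproduced with a short script. It assumes the population values from the IQ example (mean 100, SD 15) and uses the one-sided upper-tail probability of the standard normal distribution, which matches the p-values quoted above. The function name `upper_tail_p` is illustrative.

```python
import math
from statistics import NormalDist

MU, SIGMA = 100, 15     # population mean and SD assumed in the IQ example
OBSERVED_MEAN = 103     # the sample mean being evaluated

def upper_tail_p(n):
    """Return (SEM, z, one-sided p) for a sample mean of 103 with sample size n."""
    sem = SIGMA / math.sqrt(n)
    z = (OBSERVED_MEAN - MU) / sem
    p = 1 - NormalDist().cdf(z)   # area to the right of z
    return sem, z, p

for n in (25, 100, 400):
    sem, z, p = upper_tail_p(n)
    print(f"n={n}: SEM={sem:.2f}, z={z:.2f}, p={p:.4f}")
```

The same 3-point difference from the population mean becomes less and less likely as n grows, because the standard error shrinks with the square root of n.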
[Figure: sampling distributions of the sample mean for n = 25, n = 100, and n = 400, with the P-value = 0.05 level marked. x-axis: Sample Means (90 to 110); y-axis: Probability (0 to 0.6).]
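What Z score in a normal distribution separates the most extreme 5% of the scores from the middle-most 95% of the scores?
$z = \pm 1.96$
n = 25; standard error of the mean = 3.00
$$\frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - 100}{3} = \pm 1.96$$
$\bar{X} = 100 - 1.96 \times 3.00 = 94.12$
$\bar{X} = 100 + 1.96 \times 3.00 = 105.88$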
[Figure: sampling distribution of the sample mean for n = 25. x-axis: Sample Means (90 to 110); y-axis: Probability (0 to 0.6).]

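n = 100; standard error of the mean = 1.50; $z = \pm 1.96$
$$\frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - 100}{1.50} = \pm 1.96$$
$\bar{X} = 100 - 1.96 \times 1.50 = 97.06$
$\bar{X} = 100 + 1.96 \times 1.50 = 102.94$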
[Figure: sampling distribution of the sample mean for n = 100. x-axis: Sample Means (90 to 110); y-axis: Probability (0 to 0.6).]

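n = 400; standard error of the mean = 0.75; $z = \pm 1.96$
$$\frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - 100}{0.75} = \pm 1.96$$
$\bar{X} = 100 - 1.96 \times 0.75 = 98.53$
$\bar{X} = 100 + 1.96 \times 0.75 = 101.47$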
[Figure: sampling distribution of the sample mean for n = 400. x-axis: Sample Means (90 to 110); y-axis: Probability (0 to 0.6).]
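The cut-off values for the middle 95% of sample means can be recomputed as follows. This sketch assumes the IQ population (mean 100, SD 15) and obtains the critical z from the standard normal quantile function rather than hard-coding 1.96. The function name `middle_95_bounds` is illustrative.

```python
import math
from statistics import NormalDist

def middle_95_bounds(mu, sigma, n):
    """Sample means that cut off the most extreme 5% (2.5% in each tail)."""
    z_crit = NormalDist().inv_cdf(0.975)   # approximately 1.96
    sem = sigma / math.sqrt(n)
    return mu - z_crit * sem, mu + z_crit * sem

for n in (25, 100, 400):
    low, high = middle_95_bounds(100, 15, n)
    print(f"n={n}: {low:.2f} to {high:.2f}")
```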
Deriving the standard error of the mean
This section is for your own edification regarding why SEM = SD/sqrt(n).
You will not be tested on it.
If X is a random variable, var(X) is its variance.
Sum of two variables X1 and X2 = X1 + X2
Variance sum law (for independent X1 and X2):
Var(X1 + X2) = var(X1) + var(X2)
Var(X1 - X2) = var(X1) + var(X2)
Constant multiplication rule:
If I multiply a random variable X by 2, I get 2X
Var(2X) = 2^2 * var(X)
Var(aX) = a^2 * var(X)
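Both rules can be checked numerically. The sketch below assumes two independent, hypothetical normal variables with standard deviations 1 and 2, so the sum and the difference should each have a variance close to 1 + 4 = 5, while doubling X1 should multiply its variance by 4.

```python
import random
import statistics

rng = random.Random(0)
n = 20000
x1 = [rng.gauss(0, 1) for _ in range(n)]   # var(X1) is about 1
x2 = [rng.gauss(0, 2) for _ in range(n)]   # var(X2) is about 4, independent of X1

v1 = statistics.pvariance(x1)
v2 = statistics.pvariance(x2)
var_sum = statistics.pvariance([a + b for a, b in zip(x1, x2)])
var_diff = statistics.pvariance([a - b for a, b in zip(x1, x2)])
var_scaled = statistics.pvariance([2 * a for a in x1])

print("var(X1) + var(X2):", round(v1 + v2, 3))
print("var(X1 + X2):     ", round(var_sum, 3))
print("var(X1 - X2):     ", round(var_diff, 3))
print("var(2*X1) / var(X1):", round(var_scaled / v1, 3))
```

The sum and difference only match approximately because the simulated variables are never perfectly uncorrelated in a finite sample; the scaling rule holds exactly.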
Imagine I measure two subjects x1 and x2.
They are drawn from random variables X1 and X2, respectively.
I assume they come from identical distributions.
Their mean, or the sample mean, is (X1 + X2) / 2.
What is the variance of that sample mean?
This tells me how accurate the sample mean is.
Why? Sqrt(var) = st. deviation = how far off the true mean I typically am.
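Variance of sampling distribution (for mean)
Find: $\mathrm{var}\left(\frac{X_1 + X_2}{2}\right)$
Assume X1 and X2 are independent:
$$\mathrm{var}\left(\frac{X_1 + X_2}{2}\right) = \left(\frac{1}{2}\right)^2 \left(\mathrm{var}(X_1) + \mathrm{var}(X_2)\right)$$
Assume X1 and X2 have identical distributions, with the same variance: $\mathrm{var}(X_1) = \mathrm{var}(X_2) = \mathrm{var}(X)$
$$\mathrm{var}\left(\frac{X_1 + X_2}{2}\right) = \left(\frac{1}{2}\right)^2 \cdot 2 \cdot \mathrm{var}(X) = \frac{2\,\mathrm{var}(X)}{4}$$
Define: $\mathrm{var}(X_1) = \mathrm{var}(X_2) = \sigma_X^2$
$$\mathrm{var}\left(\frac{X_1 + X_2}{2}\right) = \frac{\sigma_X^2}{2}$$
This is the variance of the sampling distribution for the mean of 2 subjects.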
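Standard deviation of sampling distribution (for mean)
$$\mathrm{var}\left(\frac{X_1 + X_2}{2}\right) = \frac{\sigma_X^2}{2}$$
Std. of mean of 2 variables:
$$\mathrm{SD}\left(\frac{X_1 + X_2}{2}\right) = \sqrt{\frac{\sigma_X^2}{2}} = \frac{\sigma_X}{\sqrt{2}}$$
For n variables:
$$\mathrm{var}\left(\frac{X_1 + X_2 + \dots + X_n}{n}\right) = \frac{1}{n^2}\,\mathrm{var}(X_1 + X_2 + \dots + X_n) = \frac{1}{n^2}\left(\mathrm{var}(X_1) + \mathrm{var}(X_2) + \dots + \mathrm{var}(X_n)\right) = \frac{n\,\sigma_X^2}{n^2} = \frac{\sigma_X^2}{n}$$
Std. of mean of n variables:
$$\mathrm{SD}\left(\frac{X_1 + X_2 + \dots + X_n}{n}\right) = \frac{\sigma_X}{\sqrt{n}}$$
Each subject is a random variable --> n subjects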
This means:
If $\hat{\sigma}$, or s, is our estimate of the standard deviation (the typical deviation of an individual from the sample mean), then
$$\frac{s}{\sqrt{n}}$$
is our estimate of how far off the sample mean is, on average, from the true population mean.
This is the standard error of the mean: our estimate of the standard deviation of the sampling distribution of means.
The standard deviation of the sampling distribution is called the standard error.
The Central Limit Theorem
No matter what we are measuring, the distribution of a sample mean or percentage across all possible samples we could take approximates a normal distribution, as long as the number of cases in each sample is about 30 or larger.
If we repeatedly drew samples from a population and calculated the mean of a variable or a percentage, those sample means or percentages would be approximately normally distributed.
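A quick simulation can illustrate the theorem with a population that is clearly not normal. The sketch assumes an exponential population with mean 1 and SD 1 (a strongly skewed shape chosen only for illustration) and samples of size 30. If the sample means are roughly normal, about 95% of them should land within 1.96 standard errors of the population mean.

```python
import math
import random
import statistics

def fraction_within(n=30, trials=4000, z=1.96, seed=0):
    """Share of sample means lying within z standard errors of the true mean."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0             # mean and SD of an Exponential(1) population
    se = sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        if abs(statistics.fmean(sample) - mu) <= z * se:
            hits += 1
    return hits / trials

share = fraction_within()
print("share of sample means within 1.96 SE:", share)
```

The rule of thumb of about 30 cases is only a guideline: the more skewed the population, the larger the sample needed before the normal approximation is good.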

Standard error can be estimated from a single sample:
$$SE = \frac{s}{\sqrt{n}}$$
where s is the sample standard deviation (i.e., the sample-based estimate of the standard deviation of the population), and n is the size (number of observations) of the sample.
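As a small worked example, the estimate can be computed from one sample with the standard library. The eight scores below are invented for illustration; `statistics.stdev` uses the n - 1 denominator, which is the usual sample-based estimate of the population standard deviation.

```python
import math
import statistics

def standard_error(sample):
    """Estimate the standard error of the mean from a single sample: s / sqrt(n)."""
    s = statistics.stdev(sample)       # sample SD with the n - 1 denominator
    return s / math.sqrt(len(sample))

scores = [2, 4, 4, 4, 5, 5, 7, 9]      # hypothetical observations
se = standard_error(scores)
print("standard error:", round(se, 4))
```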
Confidence intervals
Because we know that the sampling distribution is approximately normal, we know that 95.45% of sample means will fall within two standard errors of the population mean.
95% of sample means fall within 1.96 standard errors.
99% of sample means fall within 2.58 standard errors.
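The percentages above follow from the standard normal distribution, and the same multipliers give an interval around a sample mean. The sketch below checks the coverage for each multiplier and then builds a 95% interval from a made-up sample. It assumes the normal approximation is adequate; for a sample this small a t-based multiplier would normally be preferred, so treat the interval as illustrative.

```python
import math
import statistics
from statistics import NormalDist

def coverage(z):
    """Share of a normal distribution lying within z standard deviations of its mean."""
    return 2 * NormalDist().cdf(z) - 1

def confidence_interval(sample, z=1.96):
    """Normal-approximation interval: sample mean plus or minus z standard errors."""
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - z * se, mean + z * se

for z in (2.0, 1.96, 2.58):
    print(f"within {z} standard errors: {coverage(z):.4f}")

scores = [2, 4, 4, 4, 5, 5, 7, 9]      # hypothetical observations
low, high = confidence_interval(scores)
print(f"95% interval: {low:.2f} to {high:.2f}")
```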
