
SAMPLING DISTRIBUTIONS

8/16/2018 1
Sampling Distributions
 Concept Of Sampling
 Special sampling methods
 Sampling Distributions
 Sampling Distributions of Mean from Normal Population
 Sampling Distributions of Mean from Non Normal Population: Central Limit Theorem
 Sampling distribution of the sample proportion
LEARNING OBJECTIVES
After studying this module you should be able to:
 Take random samples from populations
 Differentiate between a population parameter and a sample statistic
 Explain common sample biases
 Identify special sampling methods
 Explain the concept of the sampling distribution
 Derive sampling distributions of the sample mean
 Apply the central limit theorem
 Derive sampling distributions of the sample proportion
Sampling
• Statistical Inference:
 Predict and forecast values of population parameters...
 Test hypotheses about values of population parameters...
 Make decisions...
On the basis of sample statistics derived from limited and incomplete sample information.
 Make generalizations about the characteristics of a population...
On the basis of observations of a sample, a part of a population.
Sampling
 Population: consists of all items of interest in a statistical problem.
 A population parameter is unknown.
 Sample: a subset of the population.
 A sample statistic is calculated from the sample and used to make inferences about the population.
 Bias: the tendency of a sample statistic to systematically over- or underestimate a population parameter.
Sampling & Why Sample?
 Sampling is a method of selecting units of analysis, such as households, people, consumers, companies etc., from a population of interest.
 By analyzing data collected from a sample, inferences about the population parameter can be drawn.
 Selecting a sample is less time-consuming than selecting every item in the population (census).
Size of a Sample
• The sample size is the number of units to be selected in a sample.
• It depends on several factors, such as:
Population size
Heterogeneity of the characteristic of interest in the population
Accuracy and reliability
Allocation of resources
A Sampling Process Begins With A Sampling Frame
 Sampling frame is a listing of items that make
up the population
 List of all units with their identifications
 Frames are data sources such as population
lists (like trade association lists), directories
etc.
 Inaccurate or biased results can occur if a
frame excludes certain portions of the
population
 Using different frames to generate data can
lead to dissimilar conclusions
TYPES OF SAMPLING
Sampling
 Probability Sampling (Random Sampling): Simple Random Sampling, Systematic Sampling, Stratified Random Sampling, Cluster Sampling
 Non-probability Sampling (Non Random Sampling): Convenience Sampling, Expert Sampling, Quota Sampling

Random sampling - Every unit of the population has the same probability of being included in the sample.
Nonrandom sampling - Not every unit of the population has the same probability of being included in the sample.
Types of Samples: Nonprobability Sample
 In a nonprobability sample, items included are chosen without regard to their probability of occurrence.
 In convenience sampling, items are selected based only on the fact that they are easy, inexpensive, or convenient to sample.
 In expert sampling, the opinions of pre-selected experts in the subject matter are considered.
Types of Samples: Probability Sample
 In probability sampling, items in the sample are chosen on the basis of known probabilities.
 Probability sampling methods: Simple Random, Systematic, Stratified, Cluster
Probability Sample: Simple Random Sampling
 Every individual or item from the frame has an equal chance of being selected.
 Selection may be with replacement (the selected individual is returned to the frame for possible reselection) or without replacement (the selected individual isn’t returned to the frame).
 Samples are obtained from a table of random numbers or computer random number generators.
 Most statistical methods presume simple random samples.
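The two selection modes above can be sketched with Python's standard `random` module. This is only an illustration: the function name `simple_random_sample`, the frame of item numbers 1 to 850, and the seed are assumptions, not part of the slides.

```python
import random

def simple_random_sample(frame, n, replace=False, seed=None):
    """Draw n items from frame, each item having an equal chance of selection."""
    rng = random.Random(seed)  # the seed is only there to make the run repeatable
    if replace:
        # With replacement: a selected item is returned to the frame,
        # so it may be selected again.
        return rng.choices(frame, k=n)
    # Without replacement: each item can be selected at most once.
    return rng.sample(frame, n)

frame = list(range(1, 851))  # hypothetical frame of item numbers 1..850
without_replacement = simple_random_sample(frame, 5, seed=1)
with_replacement = simple_random_sample(frame, 5, replace=True, seed=1)
print(without_replacement)
print(with_replacement)
```

With replacement, the same item number may appear more than once in the result; without replacement, it cannot.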
Selecting a Simple Random Sample Using A Random Number Table

Sampling Frame For Population With 850 Items
Item Name    Item #
Bev R.       001
Ulan X.      002
.            .
.            .
Joann P.     849
Paul F.      850

Portion Of A Random Number Table
49280 88924 35779 00283 81163 07275
11100 02340 12860 74697 96644 89439
09893 23997 20048 49420 88872 08401

The First 5 Items in a simple random sample (reading the table three digits at a time from the top left)
Item # 492
Item # 808
Item # 892 -- does not exist so ignore
Item # 435
Item # 779
Item # 002
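The table lookup can be reproduced in a few lines of Python. This is a sketch under two assumptions: the table is read row by row in groups of three digits, and repeated item numbers are skipped (sampling without replacement). The function name `sample_from_digit_table` is hypothetical.

```python
def sample_from_digit_table(rows, frame_size, n, width=3):
    """Read a random number table `width` digits at a time and keep valid item numbers."""
    # Join all rows into one stream of digits, dropping the spaces between blocks.
    digits = "".join("".join(row.split()) for row in rows)
    selected = []
    for i in range(0, len(digits) - width + 1, width):
        item = int(digits[i:i + width])
        # Ignore numbers outside the frame (e.g. 892 when there are only 850 items)
        # and, by assumption, numbers that were already selected.
        if 1 <= item <= frame_size and item not in selected:
            selected.append(item)
        if len(selected) == n:
            break
    return selected

table = [
    "49280 88924 35779 00283 81163 07275",
    "11100 02340 12860 74697 96644 89439",
    "09893 23997 20048 49420 88872 08401",
]
first_five = sample_from_digit_table(table, frame_size=850, n=5)
print(first_five)  # [492, 808, 435, 779, 2]
```

The value 892 is read from the table but discarded, which matches the note that item 892 does not exist.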
Simple Random Sample: Using Tables of Random Numbers
A population consists of 845 employees of Nitra Industries. A sample of 52 employees is to be selected from that population.
A convenient method of selecting a random sample is to use the identification number of each employee and a table of random numbers.
Sampling
 Example: In 1961, students invested 24 hours per
week in their academic pursuits, whereas today’s
students study an average of 14 hours per week.
 A dean at a large university in West Bengal wonders if
this trend is reflective of the students at her university.
The university has 20,000 students and the dean would
like a sample of 100. Use Excel to draw a simple
random sample of 100 students.
 In Excel, choose
Formulas > Insert function >
RANDBETWEEN and input
the values shown here.
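For readers without Excel, a rough Python counterpart is sketched below. It assumes the students are numbered 1 to 20,000 and that the spreadsheet function is called with those two bounds, which this text does not confirm. `random.randint` is inclusive at both ends, as RANDBETWEEN is, and can return the same number twice, so the hypothetical helper `draw_distinct_ids` keeps drawing until it has 100 different students.

```python
import random

def draw_distinct_ids(population_size, n, seed=None):
    """Draw n distinct ID numbers between 1 and population_size."""
    rng = random.Random(seed)  # the seed is only there to make the run repeatable
    chosen = set()
    while len(chosen) < n:
        # Inclusive on both ends; repeats are absorbed by the set.
        chosen.add(rng.randint(1, population_size))
    return sorted(chosen)

student_ids = draw_distinct_ids(20_000, 100, seed=2018)
print(student_ids[:10])
```

Calling `random.sample(range(1, 20001), 100)` would give a sample without repeats in a single step; the loop is shown only because it mirrors repeated RANDBETWEEN draws.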
Probability Sample: Systematic Sampling
 Units are drawn from the population at clearly defined regular intervals.
 Start at a random point in the sampling frame, and from this point select every kth value in the frame to form the sample.
 One of the easiest procedures to follow.
 The steps involved in constructing a systematic sample:
 Compute K = N/n and take the integer value; K is called the sampling interval.
 Select a random number between 1 and K.
 Starting with this number, select every Kth number until all the n units are selected.
SYSTEMATIC SAMPLING-EXAMPLE
 In a market survey, 5 households need to be
selected out of 50 households in a Block.
 The table containing all the households is serially
numbered from 1 to 50.
 Number of units in the Population N = 50
 Number of units in the Sample n = 5
 Sampling interval K = (N/n) = 50/5 = 10
 Select a random number between 1 and 10 (using
simple random sampling).
 Suppose the selected random number is 5.
 Starting with 5, select every 10th unit. The selected units are 5, 15, 25, 35 and 45.
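The systematic sampling steps and the household example can be sketched as follows. The function name `systematic_sample` and its `start` parameter are illustrative assumptions; when no start is supplied, one is drawn at random between 1 and K.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Select n units from a frame numbered 1..N at a fixed interval."""
    k = N // n  # sampling interval: integer part of N/n
    if start is None:
        # Random starting point between 1 and k (inclusive)
        start = random.Random(seed).randint(1, k)
    # Take the starting unit and then every k-th unit after it
    return [start + i * k for i in range(n)]

households = systematic_sample(N=50, n=5, start=5)
print(households)  # [5, 15, 25, 35, 45]
```

Because the start is at most K and the last unit is start + (n - 1)K, the selection never runs past unit N.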
SYSTEMATIC SAMPLING
 Units 1 to 50, with the selected units in brackets:
1, 2, 3, 4, [5], 6, 7, 8, 9, 10,
11, 12, 13, 14, [15], 16, 17, 18, 19, 20,
21, 22, 23, 24, [25], 26, 27, 28, 29, 30,
31, 32, 33, 34, [35], 36, 37, 38, 39, 40,
41, 42, 43, 44, [45], 46, 47, 48, 49, 50

 In the systematic sampling procedure, it is necessary that the units in the population are randomly arranged with respect to the measured characteristic.
Problem:
 If the data are subject to any periodicity and the sampling interval is in step with it, the resulting sample is not random.
Systematic Sampling
 Decide on sample size: n
 Divide frame of N individuals into groups of k
individuals: k=N/n
 Randomly select one individual from the 1st group
 Select every kth individual thereafter

[Figure: frame of N = 40 individuals divided into n = 4 groups of k = 10, with the first group marked]
Probability Sample: Stratified Sampling
 The population is partitioned into two or more subpopulations, called strata, according to some relevant characteristic so that each stratum is more or less homogeneous.
 Randomly select observations from each stratum, in numbers proportional to the stratum’s size.
 Used when the population is heterogeneous.
 If proper stratification can be made, so that the strata differ from each other as much as possible but are homogeneous within, stratified sampling gives better estimates of the population characteristics than a simple random sample of the same size.
Advantages:
 Guarantees that each population subdivision is represented in the sample.
 Parameter estimates have greater precision than those estimated from simple random sampling.
Probability Sample: Stratified Sampling
 Divide the population into mutually exclusive and collectively
exhaustive groups, called strata according to some common
characteristic
 A simple random sample is selected from each subgroup,
with sample sizes proportional to strata sizes
 Samples from subgroups are combined into one
 This is a common technique when sampling populations of voters, stratifying across racial or socio-economic lines.

[Figure: population divided into 4 strata]
Stratified Random Sampling
Suppose we need to study the advertising expenditures of the 352 largest companies to determine whether firms with high returns on equity (a measure of profitability) spent more of each sales rupee on advertising than firms with a low return or deficit.
To make sure that the sample is a fair representation of the 352 companies, the companies are grouped by percent return on equity, and a sample proportional to the relative size of each group is randomly selected.
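Proportional allocation for an example like this one can be sketched as below. The slides do not give the group sizes or the sample size, so the five stratum sizes (which add up to 352), the labels, and the total sample of 50 are purely hypothetical. The helper names are assumptions as well.

```python
import random

def proportional_allocation(stratum_sizes, n):
    """Sample size per stratum, proportional to the stratum's share of the population."""
    total = sum(stratum_sizes.values())
    # Rounding can make the sizes add up to slightly more or less than n in general.
    return {name: round(n * size / total) for name, size in stratum_sizes.items()}

def stratified_sample(strata, n, seed=None):
    """Draw a simple random sample from every stratum using proportional allocation."""
    rng = random.Random(seed)
    sizes = {name: len(units) for name, units in strata.items()}
    allocation = proportional_allocation(sizes, n)
    return {name: rng.sample(strata[name], allocation[name]) for name in strata}

# Hypothetical stratum sizes for the 352 companies (not taken from the slides)
stratum_sizes = {"group 1": 8, "group 2": 35, "group 3": 189, "group 4": 115, "group 5": 5}

# Number the companies 1..352 and assign consecutive blocks of IDs to the strata
strata, next_id = {}, 1
for name, size in stratum_sizes.items():
    strata[name] = list(range(next_id, next_id + size))
    next_id += size

allocation = proportional_allocation(stratum_sizes, 50)
sample = stratified_sample(strata, 50, seed=7)
print(allocation)  # {'group 1': 1, 'group 2': 5, 'group 3': 27, 'group 4': 16, 'group 5': 1}
```

Every group contributes at least one company here, which is the guarantee of representation that stratification is meant to provide.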
Probability Sample: Cluster Sampling
 The population is divided into a number of clusters.
 Clusters are naturally occurring designations, such as city blocks, election districts, households, sales territories etc.
 A random sample of clusters is selected from this population of clusters.
 Either all units within the randomly chosen clusters are measured, or further random sampling can be done in each cluster.
 When all units are measured in selected clusters ► cluster sampling.
 When further sampling is done within each cluster by adopting simple random sampling or stratified random sampling ► multistage sampling.
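The difference between cluster sampling and multistage sampling can be sketched in a few lines. The 16 city blocks of 10 households each, the helper name `cluster_sample`, and the parameter `units_per_cluster` are hypothetical choices made for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, units_per_cluster=None, seed=None):
    """Select clusters at random, then keep all units or subsample within each cluster."""
    rng = random.Random(seed)
    # Stage 1: simple random sample of clusters
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = {}
    for name in chosen:
        units = clusters[name]
        if units_per_cluster is None:
            # Cluster sampling: every unit in a selected cluster is measured
            sample[name] = list(units)
        else:
            # Multistage sampling: simple random sample inside each selected cluster
            sample[name] = rng.sample(units, min(units_per_cluster, len(units)))
    return sample

# Hypothetical population: 16 city blocks with 10 households each
clusters = {f"block {b}": [f"b{b}-h{h}" for h in range(1, 11)] for b in range(1, 17)}

one_stage = cluster_sample(clusters, n_clusters=4, seed=3)
two_stage = cluster_sample(clusters, n_clusters=4, units_per_cluster=3, seed=3)
print({name: len(units) for name, units in one_stage.items()})
print({name: len(units) for name, units in two_stage.items()})
```

Only the selected blocks have to be visited, which is where the cost saving of cluster sampling comes from.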
Probability Sample: Cluster Sampling
Advantages and disadvantages:
 Less expensive than other sampling methods.
 Less precision than simple random sampling or stratified sampling.
 Useful when clusters occur naturally in the population.

Probability Sample: Cluster Sampling
 Divide population into mutually exclusive and collectively
exhaustive groups, called clusters, each representative of
the population
 A simple random sample of clusters is selected
 All items in the selected clusters can be used, or items can
be chosen from a cluster using another probability sampling
technique
 A common application of cluster sampling involves election exit polls, where certain election districts are selected and sampled.

[Figure: population divided into 16 clusters, with randomly selected clusters forming the sample]
Sampling
 Stratified versus Cluster Sampling
 Stratified Sampling
 Sample consists of elements from each group.
 Preferred when the objective is to increase precision.
 Cluster Sampling
 Sample consists of elements from the selected groups.
 Preferred when the objective is to reduce costs.
Comparing Sampling Methods
 Simple random sample and Systematic sample
 Simple to use
 May not be a good representation of the population’s
underlying characteristics
 Stratified sample
 Ensures representation of individuals across the entire
population
 Stratification is done to make strata homogeneous within &
different from other strata.
 Cluster sample
 Each cluster should be heterogeneous within, and different clusters should be similar to each other.
 More cost effective
 Less efficient (need larger sample to acquire the same level
of precision)
Evaluating Survey Worthiness
 What is the purpose of the survey?
 Is the survey based on a probability sample?
 Coverage error – appropriate frame?
 Non response error – follow up
 Measurement error – good questions elicit good responses
 Sampling error – always exists
Sampling Error
Sampling error is the difference between a sample statistic and its corresponding population parameter.
Examples:
$\bar{X} - \mu$
$s - \sigma$
$s^2 - \sigma^2$
$p - \pi$
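A small numeric illustration of sampling error is sketched below. The five-value population and the three-value sample are made up for the purpose; in practice the population parameter is unknown, so the error cannot be computed this directly.

```python
from statistics import mean, pstdev, stdev

population = [2, 4, 6, 8, 10]  # hypothetical population, fully known only for illustration
sample = [2, 4, 6]             # one possible sample of three units

# Sampling error of the mean: sample mean minus population mean
error_mean = mean(sample) - mean(population)

# Sampling error of the standard deviation: sample s minus population sigma
error_sd = stdev(sample) - pstdev(population)

print(error_mean)          # -2
print(round(error_sd, 3))  # -0.828
```

A different sample from the same population would give different errors, which is why sampling error is always present.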
Types of Survey Errors
 Coverage error (Selection error)
 Exists if some groups are excluded from the frame
and have no chance of being selected
 Non response error
 People who do not respond may be different from
those who do respond
 Sampling error
 Variation from sample to sample will always exist
 Measurement error
 Due to weaknesses in question design, respondent
error, and interviewer’s effects on the respondent
(“Hawthorne effect”)
Types of Survey Errors
 Coverage error: excluded from frame
 Non response error: follow up on nonresponses
 Sampling error: random differences from sample to sample
 Measurement error: bad or leading question
Types of Survey Errors
 Classic Case of a “Bad” Sample: The Literary Digest Debacle of 1936
 During the 1936 presidential election, the Literary Digest predicted a landslide victory for Alf Landon over Franklin D. Roosevelt (FDR) with only a 1% margin of error, based on randomly sampling from their own subscriber/membership lists, etc., with a 24% response rate.
 They were wrong! FDR won in a landslide election.
Types of Survey Errors
 Selection bias: a systematic exclusion of certain groups from consideration for the sample.
 The Literary Digest committed selection bias by excluding a large portion of the population (e.g., lower income voters).
 Nonresponse bias: a systematic difference in preferences between respondents and non-respondents to a survey or a poll.
 The Literary Digest had only a 24% response rate. This suggests that only those who cared a great deal about the election took the time to respond to the survey. These respondents may be atypical of the population as a whole.
