Sampling is the act, process, or technique of selecting a suitable sample, or a representative part of a
population, for the purpose of determining parameters or characteristics of the whole population. The goal
is for the smaller study group to resemble the larger group as closely as possible. Sampling permits
the researcher to work with a more manageable group size. The study’s findings can then be generalized back
to the total population using inferential statistics.
Sampling is used for gathering data from a population. It is a statistical practice by which observations
are made upon certain individuals of a population so as to derive certain conclusions about the population.
A sample is a finite part of a statistical population whose properties are studied to gain information about
the whole (Webster, 1985). When dealing with people, it can be defined as a set of respondents (people)
selected from a larger population for the purpose of a survey.
2. Since in some cases, for example the population of a country, there are obvious practical and
economic reasons which will hinder the study of the whole population, a sample is used so that
the data gathered from the sample may be inferred to the population.
1. Reduced cost. If data are secured from only a small fraction of the aggregate, expenditures may
be expected to be smaller than if a complete census is attempted.
2. Greater speed. For the same reason, the data can be collected and summarized more quickly
with a sample than with a complete count. This may be a vital consideration when the
information is urgently needed.
3. Greater scope. In certain types of inquiry, highly trained personnel or specialized equipment,
limited in availability, must be used to obtain the data. A complete census may then be
impracticable: the choice lies between obtaining the information by sampling or not at all. Thus
surveys which rely on sampling have more scope and flexibility as to the types of information
that can be obtained. On the other hand, if information is wanted for many subdivisions or
segments of the population, it may be found that a complete enumeration offers the best solution.
4. Greater accuracy. Because personnel of higher quality can be employed and can be given
intensive training, a sample may actually produce more accurate results than the kind of complete
enumeration that it is feasible to take.
2. Sampling does not provide information for action with respect to individual account.
3. Sampling produces results containing errors of sampling. This is a disadvantage if the error
of sampling is too big for some purpose one has in mind.
For example, the sampling frame in a household survey may be the people listed in the telephone
directory.
In most cases, due to time and size constraints, a representative subset of the population is taken for
observation. A sample is selected for data collection purposes from the sampling frame (hence, from the
population).
For example,
1. The target population for a household survey may be the Mauritius adult population
2. A manufacturer needs to decide whether a batch of material from production is of high enough
quality to be released to the customer, or should be sentenced for scrap or rework due to poor
quality. In this case, the batch is the population.
2. Correlational Studies: The recommended number for relationship studies is also 30. Smaller
numbers make it difficult to obtain statistical significance.
3. Descriptive Studies: The number of participants in a descriptive study can vary significantly.
Usually the size of the population to be studied has more of an effect on the sample size than any
general sampling rule. Small populations require a larger percentage of the population to be
included in the study.
General Rules:
1. A smaller percentage is required for a larger population.
2. Studies using a population less than 100 should use the entire population.
5. A population above 100,000 requires a sample of only 384.
Population  Sample    Population  Sample
10          10        150         108
20          19        200         132
30          28        300         169
40          36        500         217
50          44        1000        278
60          52        2000        322
70          59        5000        357
80          66        10000       370
90          73        50000       381
Calculation of an appropriate sample size depends upon a number of factors unique to each survey and it
is down to us to make the decision regarding these factors. The three most important are:
The temptation is to say all should be as high as possible. The problem is that an increase in either
accuracy or confidence (or both) will always require a larger sample and higher budget. Therefore a
compromise must be reached and the degree of inaccuracy and confidence one is prepared to accept must
be worked out.
For example in a Market research project, values such as mean income and mean height etc are estimated.
For a mean
The required formula is: s = (z / e)^2
Where:
s = the sample size
z = a number relating to the degree of confidence one wishes to have in the result. 95% confidence* is
most frequently used and accepted. The value of ‘z’ should be 2.58 for 99% confidence, 1.96 for 95%
confidence, 1.64 for 90% confidence and 1.28 for 80% confidence.
e = the error that can be accepted, measured as a proportion of the standard deviation (accuracy)
Suppose mean income is being estimated and one wishes to know what sample size to aim for in order to
be 95% confident in the result. Assuming that an error of 10% of the population standard deviation
can be accepted, the following calculation can be used:
s = (1.96 / 0.1)^2
Therefore s = 384.16
In other words, 385 people would need to be sampled to meet our criterion.
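The calculation above can be sketched in a few lines of Python. This is only an illustration of the formula s = (z / e)^2 as stated in the text; the function name `sample_size_for_mean` and the `Z_VALUES` lookup are hypothetical names chosen for this example.

```python
import math

# z-values quoted in the text for common confidence levels
Z_VALUES = {0.99: 2.58, 0.95: 1.96, 0.90: 1.64, 0.80: 1.28}


def sample_size_for_mean(z, e):
    """Return s = (z / e)^2, rounded up to a whole respondent.

    z: value tied to the chosen confidence level
    e: acceptable error as a proportion of the standard deviation
    """
    if e <= 0:
        raise ValueError("e must be positive")
    return math.ceil((z / e) ** 2)


# 95% confidence, error of 10% of the standard deviation
print(sample_size_for_mean(Z_VALUES[0.95], 0.1))  # 385
```

Rounding up rather than to the nearest integer keeps the sample at or above the size the formula asks for.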
If the whole population had been interviewed, then the confidence level would have been 100%. But
since only a sample has been interviewed, one is less confident. As the sample size calculation has been
based on the 95% confidence level, there is a 95% chance that the sample mean lies within the
acceptable error limit of the true population mean. However, there is of
course a 5% chance that it falls outside this limit. If one wanted to be more confident, the sample
size calculation should be based on the 99% confidence level, and if a lower level of confidence can
be accepted, then the calculation can be based on the 90% confidence level.
1. The sources of sampling bias must be known, and ways of avoiding them must be identified.
2. It must be decided whether the bias is so severe that the results of the study will be
seriously affected
3. In the final report, awareness of bias, rationale for proceeding, and potential effects must
be documented.
5. Cost/operational concerns.
A probability sampling scheme is one in which every unit in the population has a chance (greater than
zero) of being selected in the sample, and this probability can be accurately determined. The combination
of these traits makes it possible to produce unbiased estimates of population totals, by weighting sampled
units according to their probability of selection.
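As a rough illustration of weighting by probability of selection, the sketch below estimates a population total by dividing each sampled value by its selection probability (an approach usually called the Horvitz-Thompson estimator). The function name and the sample figures are hypothetical and are not taken from the text.

```python
def estimate_population_total(values, selection_probabilities):
    """Estimate a population total by weighting each sampled value by 1 / p."""
    if len(values) != len(selection_probabilities):
        raise ValueError("one selection probability is needed per sampled value")
    total = 0.0
    for y, p in zip(values, selection_probabilities):
        if not 0 < p <= 1:
            raise ValueError("selection probabilities must lie in (0, 1]")
        # a unit selected with probability p stands in for 1 / p population units
        total += y / p
    return total


# Hypothetical sample: two units drawn with probability 0.25, one with 0.5
sampled_values = [200.0, 300.0, 500.0]
probabilities = [0.25, 0.25, 0.5]
print(estimate_population_total(sampled_values, probabilities))  # 3000.0
```

Units that were unlikely to be selected receive a large weight, which is why the selection probabilities must be known accurately.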
Example: Placing names in a hat and drawing the sample is a method of using simple random
sampling.
4.1.1 Steps
The following steps are used to randomly select a sample.
1. Assume that there is a population of 185 students and each student has been assigned a number
from 1 to 185. Suppose we wish to sample 5 students.
2. Since the population consists of 185 students and 185 is a three digit number, the first three digits
of the numbers listed on the chart must be used.
3. We close our eyes and randomly point to a spot on the chart. For this example, 20631 in the first
column was selected.
4. That number is interpreted as 206 (first three digits). Since there is no member of the population
with that number, we go to the next number 899 (89990). Once again there is no one with that
number, so we continue at the top of the next column. As we work down the column, the first
number to match the population is 100 (actually 10005 on the chart). Student number 100 would
be in the sample. Continuing down the chart, the other four subjects in the sample would be
students 049, 082, 153, and 005.
=RAND()
Type that into a cell and it will produce a random number in that cell. Copy the formula throughout a
selection of cells and it will produce random numbers between 0 and 1.
Whatever range that is required can be obtained if the formula is modified. For example, if random
numbers from 1 to 250 are needed, the following formula could be entered:
=INT(250*RAND())+1
The INT eliminates the digits after the decimal, the 250* creates the range to be covered, and the +1
sets the lowest number in the range.
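For readers working outside a spreadsheet, the same idea can be sketched in Python. The helper name `random_integers` is an assumption made for this example; it mirrors the structure of =INT(250*RAND())+1 rather than any particular library routine.

```python
import random


def random_integers(count, low, high, seed=None):
    """Return `count` random integers between low and high, inclusive."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    span = high - low + 1      # plays the role of the 250 in the formula
    # rng.random() is in [0, 1), like RAND(); int() drops the decimals, like INT()
    return [int(span * rng.random()) + low for _ in range(count)]


numbers = random_integers(5, 1, 250, seed=1)
print(numbers)
```

In practice `random.randint(1, 250)` gives the same range directly; the longer form is shown only because it follows the spreadsheet formula step by step.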
4.1.2 Example:
The University of Mauritius has decided to offer 10 books to the MBA group B class. So that the
books are given to 10 random students, Simple Random Sampling is used to select
10 students from the class. The names or roll numbers, available in an Excel sheet, are listed and a random
number is allocated to each student. The list is then sorted in ascending order of the random number, and
the first 10 names on the sorted list are provided with the books.
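The procedure in this example (attach a random number to each name, sort, take the first 10) can be sketched as follows. The class list of 40 students is hypothetical, since the text does not give the class size.

```python
import random


def simple_random_sample(names, sample_size, seed=None):
    """Pick sample_size names by sorting on a random number given to each name."""
    if sample_size > len(names):
        raise ValueError("sample size cannot exceed the population size")
    rng = random.Random(seed)
    keyed = [(rng.random(), name) for name in names]  # one random number per student
    keyed.sort()                                      # ascending order of the random number
    return [name for _, name in keyed[:sample_size]]


# Hypothetical class list; the real names or roll numbers would come from the sheet
class_list = [f"Student {i}" for i in range(1, 41)]
winners = simple_random_sample(class_list, 10, seed=1)
print(winners)
```

Python's built-in `random.sample(class_list, 10)` achieves the same result in one call; the sort-based version is shown because it matches the spreadsheet steps.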
4.1.3 Advantages:
1. It is easy to conduct
2. The strategy requires minimal knowledge of the population to be sampled.
3. It minimises bias and simplifies analysis of results. In particular, the variance between individual
results within the sample is a good indicator of variance in the overall population, which makes it
relatively easy to estimate the accuracy of results.
4.1.4 Disadvantages
1. SRS can be vulnerable to sampling error because the randomness of the selection may result in a
sample that doesn't reflect the makeup of the population.
For instance, a simple random sample of ten people from a given country will on average
produce five men and five women, but any given trial is likely to over represent one sex and
under represent the other. Systematic and stratified techniques, discussed below, attempt to
overcome this problem by using information about the population to choose a more representative
sample.
2. SRS may also be cumbersome and tedious when sampling from an unusually large target
population. In some cases, investigators are interested in research questions specific to subgroups
of the population.
For example,
SYSTEMATIC SAMPLING
4.2.1 Steps:
The following steps are used to systematically select a sample using every Kth name.
5. The researcher next divides the total population by the sample size to obtain the interval K.
6. The researcher then selects a random starting point on the population list within the first K
names. For example, if the population is 1,000 and we need a sample of 50, K is 20.
The researcher randomly selects a starting point (some number between 1 and 20).
7. The next member of the sample is chosen by adding 20 to the random starting point, the one after
that by adding 20 again, and so on.
4.2.2 Example:
Let's assume that we have a population that has N=100 people in it and that we want a sample of n=20,
so the interval is k = N/n = 5.
Now, to select the sample, we start with a randomly chosen unit among the first five, say the 4th unit in
the list, and take every k-th unit (every 5th, because
k=5). We would be sampling units 4, 9, 14, 19, and so on up to 99, and we would end up with 20 units in
our sample.
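A minimal sketch of this example in Python, assuming the population is simply numbered 1 to N. The function name `systematic_sample` is hypothetical.

```python
def systematic_sample(population_size, sample_size, start):
    """Return the unit numbers chosen by taking every k-th unit from `start`."""
    k = population_size // sample_size  # sampling interval
    if not 1 <= start <= k:
        raise ValueError("start must fall within the first k units")
    return list(range(start, population_size + 1, k))[:sample_size]


units = systematic_sample(100, 20, start=4)
print(units[:4], units[-1], len(units))  # [4, 9, 14, 19] 99 20
```

With a start of 4 and an interval of 5, the last selected unit is 99 and the sample holds 20 units. In a real study the start would itself be drawn at random from 1 to k.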
4.2.3 Advantages
2. It ensures that the individuals chosen are spread across the population.
4.2.4 Disadvantages
1. All members of the population do not have an equal chance of being selected.
2. The Kth person may be related to a periodical order in the population list, producing
unrepresentativeness in the sample.
3. It may prove to be costly and time consuming if samples are not conveniently located.
4. Bias can occur where there are recurring sets in the population.
A second method of modifying the random sampling process is called stratified sampling. In this case the
population is divided into subgroups chosen by the researcher. Stratified sampling can be proportional or
non-proportional. In proportional sampling the participants are chosen in proportion to the number in each
subgroup. Non proportional sampling occurs when the response weight of the subgroup is not a factor.
4.3.1 Steps:
1. The population is divided into non-overlapping groups N1, N2, N3, ... Ni, such that N1 + N2 + N3
+ ... + Ni = N where N is population size.
4.3.2 Example
An example might be taken from the University of Mauritius. The opinions of students in the Faculty are to
be taken in connection with their grievances. Suppose that of 500 students, 300 are male and 200 are
female, and that these include 60 male part-timers and 50 female part-timers.
1. We are required to take a sample of 100 students, stratified according to the above categories.
2. The first step is to find the total number of students (500) and calculate the percentage in each
group.
4. Therefore the above numbers of people are randomly chosen within their strata.
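The proportional allocation for this example can be sketched as below. It assumes the four strata are male full-time, male part-time, female full-time and female part-time, with the full-time counts obtained by subtracting the part-timers from the totals given in the text; that breakdown is an interpretation, and the function name is hypothetical.

```python
def proportional_allocation(stratum_sizes, sample_size):
    """Share the sample among strata in proportion to each stratum's size."""
    population = sum(stratum_sizes.values())
    return {
        name: round(sample_size * size / population)
        for name, size in stratum_sizes.items()
    }


# Assumed strata: full-timers are the totals minus the part-timers
strata = {
    "male full-time": 300 - 60,
    "male part-time": 60,
    "female full-time": 200 - 50,
    "female part-time": 50,
}
allocation = proportional_allocation(strata, 100)
print(allocation)
# {'male full-time': 48, 'male part-time': 12, 'female full-time': 30, 'female part-time': 10}
```

Each stratum's quota would then be filled by simple random sampling within that stratum. With other figures the rounded quotas may not add up exactly to the sample size, so a small manual adjustment can be needed.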
The state superintendent of schools wants to determine whether geographic location has a significant effect on
teacher support for a merit pay plan.
Steps:
3. Proportional sampling is used to select study participants: 20% of population in each area.
6. It is efficient
7. Sampling equal numbers from strata varying widely in size may be used to equate the statistical
power of tests of differences between strata.
4.3.4 Disadvantages
6. It can be expensive
The sample is generally drawn by first sampling at the higher level(s), e.g. randomly sampling countries,
then sampling from subsequent levels in turn, e.g. within the selected countries sample counties, then
within these postcodes, then within these households, until the final stage is reached, at which point the
sampling is done in a simple random manner, e.g. sampling people within the selected households. The
‘levels’ in question are defined by the subgroups into which it is appropriate to subdivide the population.
4.4.1 Steps:
1. The population is identified and defined.
4. All clusters that make up the population are listed (or a list of them is obtained).
6. The number of clusters needed is determined by dividing the sample size by the estimated size of
a cluster.
7. The needed number of clusters is randomly selected by using a table of random numbers.
8. All population members in each selected cluster are included in the study.
Some of the problems that random sampling would create here are:
1. It is difficult to administer since each class would have only a few students in the sample.
2. It is difficult to set up a control and experimental group study since some students would be in the
same class.
3. Increased cost and time to train the participants in all 100 classrooms.
Steps:
1. The cluster to be used must be determined. The logical cluster to use in this study would be each
of the 100 individual classrooms.
2. The 100 classrooms are identified and the number of classrooms needed is determined. In this case
30 classrooms are to be chosen.
3. The 30 classrooms are chosen by random selection, and random selection is again used to assign 15 of
them to each of the experimental and control groups.
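The classroom steps above can be sketched as follows, assuming the 100 classrooms are simply numbered 1 to 100. The function name `assign_clusters` is hypothetical.

```python
import random


def assign_clusters(cluster_ids, clusters_needed, seed=None):
    """Randomly choose clusters, then split them into two equal groups."""
    rng = random.Random(seed)
    chosen = rng.sample(cluster_ids, clusters_needed)  # random selection without repeats
    half = clusters_needed // 2
    # rng.sample returns the clusters in random order, so cutting the list
    # in half is itself a random assignment to the two groups
    return chosen[:half], chosen[half:]


classrooms = list(range(1, 101))  # the 100 classrooms, numbered for illustration
experimental, control = assign_clusters(classrooms, 30, seed=42)
print(len(experimental), len(control))  # 15 15
```

Every student in a selected classroom then takes part, so whole classes stay together in either the experimental or the control group.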
4.4.4 Disadvantages
Fewer sampling points make it less likely that the sample is representative.
NON-PROBABILITY SAMPLING
Nonprobability sampling is any sampling method where some elements of the population have
no chance of selection (these are sometimes referred to as 'out of coverage'/'undercovered'), or
where the probability of selection can't be accurately determined. It involves the selection of
elements based on assumptions regarding the population of interest, which forms the criteria for
selection. Hence, because the selection of elements is nonrandom, nonprobability sampling does
not allow the estimation of sampling errors. These conditions place limits on how much
information a sample can provide about the population. Information about the relationship
between sample and population is limited, making it difficult to extrapolate from the sample to
the population.
It is the process of including whoever happens to be available at the time. It is also called accidental or
haphazard sampling.
5.1.1 Steps:
The researcher simply interviews people as they walk by, for example on the street. This is easy
because respondents are chosen without any random mechanism: the researcher picks whoever passes.
Some people may ignore the researcher, so the response depends on what is being surveyed.
5.1.2 Examples:
• In a class of 50 students, the teacher chooses the first 5 students who raise
their hands or who answer a question correctly.
5.1.3 Advantages:
1. Convenience sampling is often a preferred option to other methods of
sampling because it allows a researcher to pilot-test an experiment with
minimal resources and time.
3. It is perhaps the best way of getting some basic information quickly and
efficiently
5.1.4 Disadvantages:
1. The sample is not an accurate representation of the population
It is also called “judgement” sampling. Purposive sampling is non-random sampling in which the
selection of the sample is based on a person's expertise about the population. As purposive sampling is
not based on probability theory, there is no objective method for measuring the reliability of
the sample results. Because the technique is not random, it is always exposed to the likes and dislikes of the
enumerators. The method is useful only when the sample drawn is small, provided that the selection of the
sample is representative and that the investigator is thoroughly skilled, has experience in the field of
inquiry, and knows the drawbacks of deliberate selection.
5.2.1 Steps:
When taking the sample, reject people who do not fit a particular profile.
5.2.2 Example
A researcher wants to get opinions from non-working mothers. They go around an area knocking on
doors during the day when children are likely to be at school. They ask to speak to the 'woman of the
house'. Their first questions are then about whether there are children and whether the woman has a day
job.
5.2.3 Advantages:
5.2.4 Disadvantage
Potential for inaccuracy in the researcher’s criteria and resulting sample selections.
In a market research context, the most frequently adopted form of non-probability sampling is known as
quota sampling. In some ways this is similar to cluster sampling in that it requires the definition of key
subgroups. The main difference lies in the fact that quotas (i.e. the number of people to be surveyed)
within subgroups are set beforehand (e.g. 25% 16-24 yr olds, 30% 25-34 yr olds, 20% 35-55 yr olds, and
25% 56+ yr olds). Usually the proportions are set to match known population distributions. Interviewers then
select respondents according to these criteria rather than at random.
5.3.1 Steps:
Like stratified sampling, the researcher first identifies the strata and their proportions as they are
represented in the population. Then convenience or judgment sampling is used to select the required
number of subjects from each stratum. This differs from stratified sampling, where the strata are filled
by random sampling.
5.3.2 Example:
The student council at UOM wants to gauge student opinion on the quality of their extracurricular
activities. They decide to survey 100 of 1,000 students using the grade levels (7 to 12) as the
sub-populations.
The table below gives the number of students in each grade level.
The student council wants to make sure that the percentage of students in each grade level is reflected in
the sample. The formula is:
Since 15% of the school population is in Grade 10, 15% of the sample should contain Grade 10 students.
Therefore, use the following formula to calculate the number of Grade 10 students that should be included
in the sample:
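The Grade 10 quota described above can be worked out with a short sketch. The count of 150 Grade 10 students is inferred from the statement that 15% of the 1,000 students are in Grade 10; the function name `quota_for_group` is hypothetical.

```python
def quota_for_group(group_size, population_size, sample_size):
    """Number of respondents a group should contribute to the sample."""
    share = group_size / population_size  # the group's share of the population
    return round(sample_size * share)


# 15% of 1,000 students are in Grade 10, i.e. 150 students (inferred from the text)
print(quota_for_group(150, 1000, 100))  # 15
```

The same call would be repeated for each grade level once its enrolment is known.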
5.3.3 Advantages:
5.3.4 Disadvantages:
1. People who are less accessible (more difficult to contact, more reluctant to participate) are
underrepresented.
2. The subjective nature of this selection means that only a proportion of the population has a
chance of being selected in a typical quota sampling strategy.
3. It does not meet the basic requirement of randomness.
4. Some units may have no chance of selection or the chance of selection may
be unknown. Therefore, the sample may be biased.
SNOWBALL SAMPLING
In snowball sampling, someone who meets the criteria for inclusion in the study is identified. The person
is then asked to recommend others who they may know who also meet the criteria. Although this method
would hardly lead to representative samples, there are times when it may be the best method available.
5.4.1 Steps:
2. Ask them to refer you to other people who fit your study requirements, then
follow up with these new people.
3. Repeat this method of requesting referrals until you have studied enough
people.
5.4.2 Example:
If the homeless are being studied, one is unlikely to find good lists of homeless people within a specific
geographical area. However, if we go to that area and identify one or two, we may find that they know
very well who the other homeless people in their vicinity are and how we can find them.
5.4.3 Advantages
Snowball sampling is especially useful when we are trying to reach populations that are
inaccessible or hard to find.
5.4.4 Disadvantages:
2. The way that the sample is chosen by target people makes it liable to various
forms of bias. People tend to associate not only with people with the same
study selection characteristic but also with other characteristics.
Heterogeneity Sampling
Homogeneity Sampling
If all opinions or views need to be included, and there is no concern for representing these views
proportionately then heterogeneity sampling is performed. Another term for this is sampling for diversity.
In many brainstorming or nominal group processes (including concept mapping), some form of
heterogeneity sampling is used because the primary interest is in getting a broad spectrum of ideas, not
identifying the "average" or "modal instance" ones. In effect, what must be sampled is ideas and not
people. Here the universe is made up of all possible ideas relevant to some topic and a sampling of this
population is needed, not a sample of the people who have the ideas. Clearly, in order to get all of the
ideas, and especially the "outlier" or unusual ones, a broad and diverse range of participants must be
included. Heterogeneity sampling is, in this sense, almost the opposite of modal instance sampling.
An essential prerequisite is that any sample must be selected in such a way as to be representative of the
population from which it has been drawn. The fundamental consideration is that any sample should be a
random sample, i.e. every member of the population should have an equal chance of being selected.
However a representative sample does not mean that it is an exact replica, in miniature, of the population
parameters. The results are subject to sampling error.
The percentage of people who respond to your survey is considered the response rate. A high survey
response rate helps to ensure that the survey results are representative of the survey population.
Sufficient response rates are important for surveys. A survey that collects very little data may not contain
substantial information. In order to collect successful responses, researchers must take into consideration
the audience, the quantity of online surveys in circulation, and the potential for surveys reported as spam.
These factors may result in lower respondent interest and acceptance of survey invitations. But there are
ways to increase response rates!
Low response rates are a continuing problem for survey organizations. Some people simply refuse to
participate in surveys, while others, for a wide range of reasons, cannot participate. Still, a well-
designed survey, coupled with incentives and techniques to elicit response, can help guarantee a healthy
response rate.
There are many reasons why people might choose not to respond to a survey. Sometimes time is a
factor. People may feel they can't spare the time to participate in a survey. Others may see a survey as a
nuisance, particularly telephone and mail surveys. However, some factors that can cause non-response
lie in the hands of the surveyors themselves, and can thus be avoided. The following list includes some
of the pitfalls that can lead to non-response:
• If potential respondents have trouble understanding the questions, the chance that they
will choose not to participate increases. Survey questions must be clear and concise.
• The survey format must be unambiguous and consistent. Question formats should also
remain consistent and not jump randomly from type to type (i.e. multiple choice to short answer
and back again). Instructions should be as explicit as possible.
• People are much more likely to respond to a nicely designed survey. A form that looks
unprofessional or haphazardly constructed will undoubtedly lead to a lower response rate. Web
surveys that require too much scrolling or contain too many pages can also inhibit response.
• Telephone Surveys can occur any time during the day, but the incredible growth of the
telemarketing industry has led many people to screen their calls, especially during the dinner
hour. If telephone interviewers identify themselves and their purpose up front, instances where
people assume they are telemarketers and screen them out can be minimized.
Response Issues
Not only do survey researchers have to be concerned about non response rate errors, but they also have to
be concerned about the following potential response rate errors:
• Response bias occurs when respondents deliberately falsify their responses. This error greatly
jeopardizes the validity of a survey's measurements.
• Response order bias occurs when a respondent loses track of all options and picks one that
comes easily to mind rather than the most accurate.
• Response set bias occurs when respondents do not consider each question and just answer all the
questions with the same response. For example, they answer "disagree" or "no" to all questions.
These response errors can seriously distort a survey's results. Unfortunately, response bias is difficult to
eliminate; even if the same respondent is questioned repeatedly, he or she may continue to falsify
responses. Response order bias and response set errors, however, can be reduced through careful
development of the survey questionnaire.
• Incentives are perhaps the most effective method to ensure participation. Survey
organizations use many kinds of incentives to elicit response, such as offering to share the
survey's findings or awarding a certain number of 'points' for each survey taken that can then be
redeemed for prizes. Some survey organizations enter respondents in a sweepstakes or even pay
a modest stipend for participation.
• Although answering machines are generally viewed as a problem, they can also be used
to a survey organization's advantage. A simple message requesting a call back can be very
effective, especially if the organization uses an 800 number.
• Postcards or e-mails announcing upcoming surveys have been shown to increase
response.
• Successful survey organizations always follow up the initial invitation with a reminder to
those that have not yet responded.
• Establishing legitimacy can help convince potential respondents to participate in a
survey. A good survey tells potential respondents who is conducting the survey and what
credentials they hold. It also outlines procedures for asking questions and providing feedback.
• Surveying employees is a great way to gauge both opinion and workplace efficiency, but
these surveys only work if enough employees participate. Offering employees time to fill out a
survey not only ensures participation, it also sends a positive message that their opinions are
valued, leading to honest, more useful responses.
PART 2
Discuss how population parameters (mean, variance and proportion) are estimated from sample
parameters
ESTIMATION OF POPULATION PARAMETERS
Every member of a population cannot be examined so we use the data from a sample, taken from the
same population, to estimate some measure, such as the mean, of the population itself.
The sample will provide us with the best estimate of the exact 'truth' about the population. The method of
sampling depends on the data available but the ideal method, as every member of the population has an
equal chance of being selected, is random sampling.
We estimate limits within which we expect the 'truth' about the population to lie and state how
confident we are about this estimation.
b) The best estimate of the unknown population mean, µ, is the sample mean, $\bar{x} = \frac{\sum x}{n}$.
This estimate of µ is often written $\hat{\mu}$ and referred to as 'mu hat'.
µ (mu) is the symbol for the population mean.
d) The best estimate of the unknown population standard deviation, σ, is the sample standard deviation s,
where:
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
Often it is more useful to quote two limits between which the parameter is expected to lie, together with
the probability of it lying in that range.
The limits are called the confidence limits and the interval between them the confidence interval.
e.g. We are 95% confident that the mean male height lies between 5' 9" and 5' 11".
If the sample size is small (n < 30) and the population standard deviation is unknown, then the
t-tables are used.
These give a wider interval and so compensate for the probable error in estimating the value of the
population standard deviation from the sample standard deviation.
(If the sample size is large, either table gives a similar result.)
Confidence interval: $\mu = \bar{x} \pm t \frac{s}{\sqrt{n}}$, where t is from the t-table with (n-1) degrees of freedom
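The t-based interval above can be sketched in Python using only the standard library. The t-value is passed in by hand, as if read from a t-table; the heights are hypothetical data, and 2.262 is the two-sided 95% t-value for 9 degrees of freedom.

```python
import math
import statistics


def t_confidence_interval(sample, t_critical):
    """Return (lower, upper) limits: mean +/- t * s / sqrt(n)."""
    n = len(sample)
    if n < 2:
        raise ValueError("at least two observations are needed")
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    margin = t_critical * s / math.sqrt(n)
    return mean - margin, mean + margin


# Hypothetical heights in inches for a sample of n = 10 (so 9 degrees of freedom)
heights = [69, 70, 71, 70, 69, 72, 70, 71, 68, 70]
low, high = t_confidence_interval(heights, 2.262)
print(round(low, 2), round(high, 2))  # 69.17 70.83
```

Taking the t-value as an argument keeps the sketch free of extra libraries; a statistics package could look it up from the confidence level and degrees of freedom instead.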
In statistics, a confidence interval (CI) is a particular kind of interval estimate of a population parameter.
Instead of estimating the parameter by a single value, an interval likely to include the parameter is given.
Thus, confidence intervals are used to indicate the reliability of an estimate. How likely the interval is to
contain the parameter is determined by the confidence level or confidence coefficient. Increasing the
desired confidence level will widen the confidence interval.
A confidence interval is always qualified by a particular confidence level, usually expressed as a
percentage; thus one speaks of a "95% confidence interval". The end points of the confidence interval are
referred to as confidence limits. For a given estimation procedure in a given situation, the higher the
confidence level, the wider the confidence interval will be.
The calculation of a confidence interval generally requires assumptions about the nature of the estimation
process – it is primarily a parametric method – for example, it may depend on an assumption that the
distribution of the population from which the sample came is normal. As such, confidence intervals as
discussed below are not robust statistics, though modifications can be made to add robustness – see robust
confidence intervals.
The purpose of sampling is to draw inferences about a population parameter on the basis of sample
information.
Point estimators
A sample mean derived from a process of random sampling provides a good estimator of the population
mean in the sense that it is one that is near to the true population mean. A single sample mean may also be
regarded as a good estimator as it provides an unbiased estimate of the population mean. The probability
of a sample mean selected at random exceeding by certain amounts is exactly equal to the
We can say that $\hat{\mu} = \bar{X}$, where the hat (^) on µ indicates that it is an estimate of µ, the unknown population
parameter. Thus the sample mean $\bar{X}$ may be used as an estimator (an unbiased estimator) of the population
mean, µ.
Since the value of the estimator, $\bar{X}$, computed from a single sample is a single value, it is referred to as a
point estimate of the unknown population mean because it represents a single point on the scale of
possible values.
Interval estimators
The interval containing a population parameter is established by calculating that statistic from
values measured on a random sample taken from the population and by applying the knowledge
(derived from probability theory) of the fidelity with which the properties of a sample represent
those of the entire population.
The probability tells what percentage of the time the assignment of the interval will be correct but not
what the chances are that it is true for any given sample. Of the intervals computed from many samples, a
certain percentage will contain the true value of the parameter being sought
For example,
Suppose we want to estimate the mean summer income of a class of business students.
Qualities of Estimators
Qualities desirable in estimators include unbiasedness, consistency, and relative efficiency:
• An unbiased estimator is said to be consistent if the difference between the estimator and the
parameter grows smaller as the sample size grows larger.
• If there are two unbiased estimators of a parameter, the one whose variance is smaller is said to
be relatively efficient.
Unbiasedness
E.g. the sample mean $\bar{x}$ is an unbiased estimator of the population mean µ, since:
$E(\bar{x}) = \mu$
Consistency
An unbiased estimator is said to be consistent if the difference between the estimator and the
parameter grows smaller as the sample size grows larger.
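E.g. the sample mean $\bar{x}$ is a consistent estimator of µ, because its variance $V(\bar{x}) = \sigma^2 / n$
shrinks as n grows larger.
Relative efficiency
If there are two unbiased estimators of a parameter, the one whose variance is smaller is said to
be relatively efficient.
E.g. for a normally distributed population, both the sample median and the sample mean are unbiased
estimators of the population mean; however, the sample median has a greater variance than the sample
mean, so we choose the sample mean.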
As sample size increases, the sample means cluster more and more around the true population mean. Thus
the variance and standard deviation of the sampling distribution decline as sample size is increased. This
standard deviation is formally referred to as the standard error of the mean:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
The standard error declines as the sample size is increased, but not proportionately: it declines with
√n, not n.
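A small sketch makes the √n relationship concrete. The population standard deviation of 10 is a hypothetical figure chosen so that the results are round numbers.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sigma / math.sqrt(n)


# Hypothetical population standard deviation of 10
for n in (25, 100, 400):
    print(n, standard_error(10.0, n))
# 25 2.0
# 100 1.0
# 400 0.5
```

Quadrupling the sample size only halves the standard error, which is why large gains in precision become expensive.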