Roll No:
Learning Centre:
Assignment Set- 1
Q1. Why is it necessary to summarise data? Explain the approaches
available to summarise data distributions.
Answer:
If the number of values is finite, then the data is said to be discrete data. The
number of occurrences of each value of the data set is called the frequency of
that value. A systematic presentation of the values taken by a variable
together with the corresponding frequencies is called a frequency distribution of
the variable.
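The definition above can be illustrated with a minimal sketch. The data values are hypothetical and are not taken from the assignment; the sketch only shows how each distinct value is paired with its frequency.

```python
from collections import Counter

# Hypothetical discrete data: number of children in 10 households.
values = [2, 0, 1, 2, 3, 1, 2, 0, 2, 1]

# Count how often each distinct value occurs.
frequency = Counter(values)

# Present the values in ascending order with their frequencies.
distribution = sorted(frequency.items())
for value, count in distribution:
    print(f"{value}\t{count}")
```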
Median: The median of a set of values is the middlemost value when they
are arranged in ascending order of magnitude. The median is
denoted by ‘M’.
Mode: Mode is the value which has the highest frequency and is denoted by
Z.
Modal value is most useful for business people. For example, shoe and
readymade garment manufacturers will like to know the modal size of the
people to plan their operations. For discrete data with or without frequency, it
is that value corresponding to highest frequency.
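As a small illustration of the two definitions, the sketch below finds the median and the mode of a set of shoe sizes. The sizes are made-up values chosen only so that the two averages differ.

```python
import statistics

# Hypothetical shoe sizes sold in one day (illustrative values only).
sizes = [7, 9, 7, 10, 6, 8, 7, 9, 10]

# Median: the middle value once the data are sorted in ascending order.
median_size = statistics.median(sizes)

# Mode: the value with the highest frequency.
modal_size = statistics.mode(sizes)

print("Median M =", median_size)
print("Mode Z =", modal_size)
```

Here the manufacturer planning stock would look at the modal size, while the median only marks the middle of the ordered sizes.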
Appropriate Situations for the use of Various Averages
Positional Averages
Median is the mid-value of series of data. It divides the distribution into two
equal portions. Similarly, we can divide a given distribution into four, ten or
hundred or any other number of equal portions.
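The division into four, ten or a hundred parts (quartiles, deciles and percentiles) can be sketched as follows. The data are hypothetical, and the cut points depend on the interpolation convention; the sketch uses the default method of Python's `statistics.quantiles`, which is only one of several conventions in use.

```python
import statistics

# Hypothetical data: 11 observations (illustrative values only).
data = [12, 15, 17, 20, 22, 25, 28, 30, 33, 36, 40]

# n equal portions are separated by n - 1 cut points.
quartiles = statistics.quantiles(data, n=4)
deciles = statistics.quantiles(data, n=10)
percentiles = statistics.quantiles(data, n=100)

print("Quartiles:", quartiles)
print("Second quartile equals the median:", quartiles[1] == statistics.median(data))
```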
Answer:
d) Facilitate comparison
Tabulation presents the data in the form of figures, so that their
significance is apparent at first glance; this is not possible when the same
data are in a narrative form.
Marital Status   Sex      Educated   Non-Educated
Married          Male
                 Female
Unmarried        Male
                 Female
Q3. Give a brief note on the measures of central tendency together
with their merits and demerits. Which is the best measure of central
tendency and why?
Answer:
Arithmetic Mean
The arithmetic mean, or simply the mean, is the best known, most easily understood
and most frequently used average in any statistical analysis. It is defined as
the sum of all the values in the data divided by the number of values.
Mode: The word mode seems to have been derived from the French 'a la mode', which
means 'that which is in fashion'. It is defined as the value in the data which
occurs most frequently. For ungrouped data we form the array and then fix the
mode as the value which occurs most frequently. If all the values are distinct
from each other, the mode cannot be fixed. For a frequency distribution with just
one highest frequency (such data are called unimodal) or two highest
frequencies (such data are called bimodal), the mode is found by using the
formula,
Mode = l + (c × f2) / (f1 + f2)
where l is the lower limit of the modal class, c is its class interval, f1 is the
frequency preceding the highest frequency and f2 is the frequency
succeeding the highest frequency.
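A minimal sketch of the formula, reading it as l + c·f2 / (f1 + f2). The function name and the class figures are hypothetical and serve only to show the arithmetic.

```python
def grouped_mode(lower_limit, class_interval, f_preceding, f_succeeding):
    """Mode = l + c * f2 / (f1 + f2) for a grouped frequency distribution."""
    return lower_limit + class_interval * f_succeeding / (f_preceding + f_succeeding)

# Hypothetical modal class 20-30 (l = 20, c = 10), with frequency 12 in the
# class before it (f1) and frequency 8 in the class after it (f2).
mode = grouped_mode(20, 10, 12, 8)
print("Mode =", mode)
```

Because f2 / (f1 + f2) lies between 0 and 1, the result always falls inside the modal class.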
Mean: The mean is the most commonly and frequently used average. It is a
simple average, understandable even to a layman. It is based on all the
values in a given data set. It is easy to calculate and is basic to the calculation of
further statistical measures such as dispersion and correlation. Of all the averages,
it is the most stable one. However, it has some demerits. It gives undue
weight to extreme values; in other words, it is greatly influenced by
them. Moreover, it cannot be calculated for data with open-ended
classes at the extremes. It cannot be fixed graphically, unlike the median or
the mode. It is the most useful average when the analysis is made
with full reference to the nature of the individual values of the data. In spite of a
few shortcomings, it is the most satisfactory average.
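The influence of extreme values on the mean can be seen in a short sketch. The figures are hypothetical; the point is only that one extreme value shifts the mean far more than the median.

```python
import statistics

# Hypothetical monthly incomes (in thousands); illustrative values only.
incomes = [20, 22, 25, 27, 30]
with_extreme = incomes + [200]  # add one extreme value

mean_before = statistics.mean(incomes)
mean_after = statistics.mean(with_extreme)
median_before = statistics.median(incomes)
median_after = statistics.median(with_extreme)

print("Mean:", mean_before, "->", mean_after)
print("Median:", median_before, "->", median_after)
```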
Answer:
• = 0.04/5 = 0.008
Answer:
SD1 = 0.01kg
SD2 = 0.21kg
Z = 1.96
1.96 = [(X-X2)/SD1]/sqrt N
Q6. Find the probability that at most 5 defective bolts will be found
in a box of 200 bolts, if it is known that 2 per cent of such bolts are
expected to be defective. (You may take the distribution to be
Poisson; e^-4 = 0.0183).
Answer:
Poisson distribution
Given the mean number of successes (μ) that occur in a specified region, we
can compute the Poisson probability based on the following formula:
Poisson Formula. Suppose we conduct a Poisson experiment, in which the
average number of successes within a given region is μ. Then, the Poisson
probability is:
P(x; μ) = (e^-μ × μ^x) / x!
where x is the actual number of successes that result from the experiment,
and e is approximately equal to 2.71828.
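Mean μ = np = 200 × 0.02 = 4
"At most 5" means x = 0, 1, 2, 3, 4 or 5, so
P(X ≤ 5) = e^-4 × (4^0/0! + 4^1/1! + 4^2/2! + 4^3/3! + 4^4/4! + 4^5/5!)
= 0.0183 × (1 + 4 + 8 + 10.6667 + 10.6667 + 8.5333)
= 0.0183 × 42.8667
= 0.7845 (approximately)
Thus, the probability that at most 5 defective bolts will be found in a box of
200 bolts is approximately 0.7845.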
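The Poisson calculation for this question can be checked with a short sketch. The function names are illustrative. With mean μ = 200 × 0.02 = 4, "at most 5" requires summing the probabilities for x = 0 to 5; using the exact value of e^-4 rather than the rounded 0.0183 gives about 0.785.

```python
import math

def poisson_pmf(x, mu):
    """P(X = x) = e^-mu * mu^x / x! for a Poisson variable with mean mu."""
    return math.exp(-mu) * mu**x / math.factorial(x)

def poisson_cdf(k, mu):
    """P(X <= k): sum of the probabilities for x = 0, 1, ..., k."""
    return sum(poisson_pmf(x, mu) for x in range(k + 1))

mu = 200 * 0.02  # expected number of defective bolts in a box of 200
p_at_most_5 = poisson_cdf(5, mu)
print(f"P(X <= 5) = {p_at_most_5:.4f}")
```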
Assignment Set- 2
Q1. What do you mean by Statistical Survey? Differentiate between
“Questionnaire” and “Schedule”.
Answer:
• Planning and
• Execution.
The figure below shows the two broad stages of Statistical survey.
This method is generally adopted by research workers and other official and
non-official agencies. This method is used to cover large areas of
investigation. It is more economical and free from investigator’s bias.
However, it results in many “non-response” situations. The respondent may
be illiterate. The respondent may also provide wrong information due to
wrong interpretation of questions.
• The respondent should not take much time in completing the questionnaire.
It should be short.
• The task of completing the questionnaire should not involve much writing work.
• The completed questionnaire should be kept confidential and used only for
the purpose of the survey as mentioned in the investigation.
There are different types of questions that can be used in a questionnaire.
A questionnaire can have contingency questions, matrix questions, closed
ended questions and open ended questions. Let’s have a look at each one in
detail.
Matrix questions are questions which are placed one under the other,
forming a matrix. The response categories are placed along the top and a list
of questions is placed down the side. This makes efficient use of page
space and respondents’ time.
Closed ended questions are those where the respondents’ answers are
limited to a fixed set of responses. Usually scales are closed ended.
Yes/no questions – here the respondents answer with “yes” or “no”. Some
of the examples are:
Multiple choices – here the respondents have several options from which to
choose. For example:
Open ended questions are those questions for which the respondent
supplies their own answer without any fixed set of possible responses.
Examples of types of open ended questions include:
Items Expenditure
Food 4300
Clothing 1200
Education 700
Rent 2000
Others 600
Answer:
[Figure: chart of expenditure by item: Food 4300, Clothing 1200, Education 700,
Rent 2000, Others 600]
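The chart itself has not survived. If it was a pie chart (an assumption, since the question text is missing), the share and sector angle of each item would follow from the expenditure table as sketched below. The variable names are illustrative.

```python
# Expenditure figures from the table above.
expenditure = {
    "Food": 4300,
    "Clothing": 1200,
    "Education": 700,
    "Rent": 2000,
    "Others": 600,
}

total = sum(expenditure.values())

# Assumption: a pie chart, where each sector angle is the item's share of 360 degrees.
angles = {item: amount / total * 360 for item, amount in expenditure.items()}

for item, amount in expenditure.items():
    share = amount / total * 100
    print(f"{item:<10}{amount:>6}{share:>8.1f}%{angles[item]:>8.1f} deg")
```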
Q3. Average weight of 100 screws in box 'A' is 10.4 gms. It is mixed
with 150 screws of box 'B'. Average weight of mixed screws is 10.9
gms. Find the average weight of screws of box 'B'.
Answer:
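XAB = (NA × XA + NB × XB) / (NA + NB)
10.9 = (100 × 10.4 + 150 × XB) / (100 + 150)
10.9 × 250 = 1040 + 150 × XB
150 × XB = 2725 - 1040 = 1685
XB = 1685 / 150 = 11.23 gms.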
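The same combined-mean relation can be rearranged in a short sketch to recover the unknown group mean. The function name is illustrative; the numbers are those given in the question.

```python
def mean_of_second_group(n_a, mean_a, n_b, combined_mean):
    """Solve combined_mean = (n_a*mean_a + n_b*mean_b) / (n_a + n_b) for mean_b."""
    combined_total = combined_mean * (n_a + n_b)
    return (combined_total - n_a * mean_a) / n_b

# Box A: 100 screws averaging 10.4 gms; box B: 150 screws; mixture averages 10.9 gms.
mean_b = mean_of_second_group(100, 10.4, 150, 10.9)
print(f"Average weight of box B screws = {mean_b:.2f} gms")
```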
Managers very often come across situations where they have to take
decisions about implementing either course of action A or course of action B
or course of action C. Sometimes, they have to take decisions regarding the
implementation of both A and B.
For example, a sales manager may like to know the probability that he will
exceed the target for product A or product B. Sometimes, he would like to
know the probability that the sales of both products A and B will exceed the
target. The first type of probability is answered by the addition rule. The second
type of probability is answered by the multiplication rule.
Addition rule:
i) If ‘A’ and ‘B’ are any two events then the probability of the occurrence
of either ‘A’ or ‘B’ is given by:
P (A ∪ B) = P (A) + P (B) - P (A ∩ B)
ii) If ‘A’ and ‘B’ are two mutually exclusive events then the probability of
occurrence of either A or B is given by:
P (A ∪ B) = P (A) + P (B)
iii) If A, B and C are any three events then the probability of occurrence of
either A or B or C is given by:
P (A ∪ B ∪ C) = P (A) + P (B) + P (C) - P (A ∩ B) - P (B ∩ C) - P (A ∩ C)
+ P (A ∩ B ∩ C)
Multiplication rule:
If ‘A’ and ‘B’ are two independent events then the probability of occurrence of
‘A’ and ‘B’ is given by:
P (A ∩ B) = P (A) × P (B)
The conditional probability of occurrence of an event ‘A’ given that the
event ‘B’ has already occurred is denoted by P (A / B). Here, ‘A’ and ‘B’ are
dependent events. Therefore, we have the following rules.
If ‘A’ and ‘B’ are dependent events, then the probability of occurrence of ‘A
and B’ is given by:
P (A ∩ B) = P (A) × P (B / A) = P (B) × P (A / B)
It follows that:
P (A / B) = P (A ∩ B) / P (B) and P (B / A) = P (A ∩ B) / P (A)
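The addition and multiplication rules can be checked on a small sample space. The sketch below uses a single throw of a fair die; the events are hypothetical and chosen only to show one dependent pair, one independent pair and one mutually exclusive pair.

```python
from fractions import Fraction

sample_space = set(range(1, 7))  # outcomes of one throw of a fair die

def prob(event):
    """Probability of an event as (favourable outcomes) / (total outcomes)."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}  # an even number
B = {4, 5, 6}  # a number greater than 3 (dependent on A)
C = {1, 2}     # a number of at most 2 (independent of A)
D = {1, 3}     # mutually exclusive with A

# Addition rule for any two events.
addition_holds = prob(A | B) == prob(A) + prob(B) - prob(A & B)

# Addition rule for mutually exclusive events.
exclusive_holds = prob(A | D) == prob(A) + prob(D)

# Conditional probability P(A / B) and the multiplication rule for dependent events.
p_a_given_b = prob(A & B) / prob(B)
dependent_holds = prob(A & B) == prob(B) * p_a_given_b

# Multiplication rule for independent events.
independent_holds = prob(A & C) == prob(A) * prob(C)

print("P(A or B) =", prob(A | B))
print("P(A / B) =", p_a_given_b)
```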
For any bivariate distribution, there exist two marginal distributions and
‘m + n’ conditional distributions, where ‘m’ and ‘n’ are the number of
classifications/characteristics studied on two variables.
Q5. (a) What is meant by “Hypothesis Testing”? Give Examples
Answer:
5(a) In hypothesis testing, we must state the assumed or hypothesised value
of the population parameter before we begin sampling. The assumption we
wish to test is called the null hypothesis and is symbolised by ‘H0’.
The term ‘null hypothesis’ arises from earlier agricultural and medical
applications of statistics. In order to test the effectiveness of a new fertilizer
or drug, the tested hypothesis (the null hypothesis) was that it had no effect,
that is, there was no difference between treated and untreated samples. If we
use a hypothesised value of a population mean in a problem, we would
represent it symbolically as ‘µ H0’. This is read – ‘The hypothesised value of
the population mean’.
If our sample results fail to support the null hypothesis, we must conclude
that something else is true. Whenever we reject the hypothesis, the
conclusion we do accept is called the alternative hypothesis and is
symbolised H1 (“H sub-one”).
For the null hypothesis H0: µ = 200, we will consider three alternative
hypotheses:
H1: µ ≠ 200, H1: µ > 200 and H1: µ < 200
Example
The next step after stating the null and alternative hypotheses is to decide
what criterion to use for deciding whether to accept or reject the null
hypothesis. If we assume the hypothesis is correct, then the significance level
indicates the percentage of sample means that fall outside certain limits (in
estimation, the confidence level indicates the percentage of sample means
that fall within the defined confidence limits).
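The decision criterion can be sketched as a two-tailed z-test of H0: µ = 200. The sample mean, population standard deviation and sample size below are hypothetical, and the function name is illustrative. The sketch assumes a known population standard deviation and a sample large enough for the normal distribution to apply.

```python
import math
from statistics import NormalDist

def two_tailed_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (z, critical value, reject H0?) for H0: mu = mu0 against H1: mu != mu0."""
    standard_error = sigma / math.sqrt(n)
    z = (sample_mean - mu0) / standard_error
    # Limits outside which a fraction alpha of sample means falls when H0 is true.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return z, critical, abs(z) > critical

# Hypothetical sample: mean 203.5 from n = 64 observations, sigma = 16.
z, critical, reject = two_tailed_z_test(sample_mean=203.5, mu0=200, sigma=16, n=64)
print(f"z = {z:.2f}, critical value = {critical:.2f}, reject H0: {reject}")
```

At the 5 per cent significance level the limits are about ±1.96, so a sample mean this close to 200 does not lead to rejection of H0.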
Type II error:
Commodity   p0   q0    p1   q1
A           6    50    10   56
B           2    100   2    120
C           4    60    6    60
D           10   30    12   24
E           8    40    12   36
Answer:
Σ p0q0 = 1360
Σ p0q1 = 1344
Or
(L + P) / 2
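The two surviving totals can be reproduced from the table if its columns are read as p0, q0, p1, q1. That reading is an assumption, although it matches both 1360 and 1344. The sketch below also assumes that L and P stand for the Laspeyres and Paasche price indices, which the surviving text does not confirm; the question and most of the working are missing, so this is an illustration and not the original answer.

```python
# Assumed column order: commodity -> (p0, q0, p1, q1).
rows = {
    "A": (6, 50, 10, 56),
    "B": (2, 100, 2, 120),
    "C": (4, 60, 6, 60),
    "D": (10, 30, 12, 24),
    "E": (8, 40, 12, 36),
}

sum_p0q0 = sum(p0 * q0 for p0, q0, p1, q1 in rows.values())
sum_p0q1 = sum(p0 * q1 for p0, q0, p1, q1 in rows.values())
sum_p1q0 = sum(p1 * q0 for p0, q0, p1, q1 in rows.values())
sum_p1q1 = sum(p1 * q1 for p0, q0, p1, q1 in rows.values())

# Assumption: L and P denote the Laspeyres and Paasche price indices.
laspeyres = sum_p1q0 / sum_p0q0 * 100
paasche = sum_p1q1 / sum_p0q1 * 100
average_index = (laspeyres + paasche) / 2  # the (L + P) / 2 form

print(sum_p0q0, sum_p0q1, sum_p1q0, sum_p1q1)
print(f"L = {laspeyres:.2f}, P = {paasche:.2f}, (L + P)/2 = {average_index:.2f}")
```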