IGNOU MBA MS-08 Solved Assignment 2011

Course Code : MS 08

Course Title : Quantitative Analysis for Managerial Applications

Assignment No. : 08/TMA-1/SEM-I/2011

Coverage : All Blocks

Note: Answer all the questions and send them to the Coordinator of the Study Centre you are
attached with.

1. Calculate the mean, median and mode from the following data relating to production of a steel
mill for 60 days.

Production (in tons per day)    21-22   23-24   25-26   27-28   29-30
Number of days                  7       13      22      10      8

Solution :

Class        No. of days (f)   Mid value (x)   d = (x − A)/h   d²    fd     fd²   c.f.
20.5-22.5    7                 21.5            -2              4     -14    28    7
22.5-24.5    13                23.5            -1              1     -13    13    20
24.5-26.5    22                25.5 (= A)      0               0     0      0     42
26.5-28.5    10                27.5            1               1     10     10    52
28.5-30.5    8                 29.5            2               4     16     32    60
Total        ∑f = 60                                                 ∑fd = -1     ∑fd² = 83

a) Mean = A + [ (∑fd/∑f) × h]

Where A = assumed mean

h = class size

= 25.5 + [(-1/60) × 2]

= 25.467 (approx)

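b) Median (Q2) = l1 + [{(N/2 − p.c.f.)/f} × (l2 − l1)]

Where l1, l2 = lower and upper boundaries of the median class

p.c.f. = cumulative frequency of the class preceding the median class

f = frequency of the median class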

Median class = 24.5 – 26.5

Median = 24.5 + [{(60/2 – 20) /22} × 2]

= 25.409 (approx.)

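c) Mode = l + [{(f − f1)/(2f − f1 − f2)} × h]

Where l = lower boundary of the modal class

f = frequency of the modal class

f1, f2 = frequencies of the classes preceding and following the modal class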

Modal class = 24.5-26.5

Mode = 24.5 + [{ (22-13)/(2×22-13-10) } ×2]

= 25.357 (approx.)
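The three results can be cross-checked with a short script. This is only a sketch: the variable names are illustrative, and the mean is computed directly as ∑fx/∑f, which is algebraically equivalent to the assumed-mean shortcut used above.

```python
# Grouped-data mean, median and mode for the steel-mill table.
# Class boundaries and frequencies are taken from the solution table.
lower_bounds = [20.5, 22.5, 24.5, 26.5, 28.5]
freqs = [7, 13, 22, 10, 8]
h = 2.0  # class width

n = sum(freqs)
midpoints = [lb + h / 2 for lb in lower_bounds]

# Mean: sum(f * x) / sum(f), equivalent to A + (sum(fd) / sum(f)) * h.
mean = sum(f * x for f, x in zip(freqs, midpoints)) / n

# Median: find the first class whose cumulative frequency reaches n / 2.
cum = 0
for i, f in enumerate(freqs):
    if cum + f >= n / 2:
        median = lower_bounds[i] + (n / 2 - cum) / f * h
        break
    cum += f

# Mode: the class with the highest frequency, adjusted by its neighbours.
m = freqs.index(max(freqs))
f1 = freqs[m - 1] if m > 0 else 0
f2 = freqs[m + 1] if m < len(freqs) - 1 else 0
mode = lower_bounds[m] + (freqs[m] - f1) / (2 * freqs[m] - f1 - f2) * h

print(round(mean, 3), round(median, 3), round(mode, 3))
# prints 25.467 25.409 25.357
```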

2. A restaurant is experiencing discontentment among its customers. It analyses that there are three
factors responsible viz. food quality, service quality and interior décor. By conducting an analysis, it
assesses the probabilities of discontentment with the three factors as 0.40, 0.35 and 0.25 respectively.
By conducting a survey among the customers, it also evaluated the probabilities of a customer going
away discontented on account of these factors as 0.6, 0.8 and 0.5, respectively. With this information,
the restaurant wants to know that, if a customer is discontented, what are the probabilities that it is so
due to food, service or interior décor?

Solution : Let

“A” be the case of discontentment due to food.

“B” be the case of discontentment due to service.

& “C” be the case of discontentment due to interior décor.

Now, it’s given that

P(A) = 0.40

P(B) = 0.35

P(C) = 0.25

Let “E” be the event that a customer goes away discontented.

Then, it’s given that,

P(E/A) = 0.6 (customer going away discontented due to food)

P(E/B) = 0.8 (customer going away discontented due to service)

P(E/C) = 0.5 (customer going away discontented due to interior décor)

Now, given that a customer is discontented, the probability that it is

due to FOOD is given by P(A/E)

due to SERVICE is given by P(B/E)

due to INTERIOR DÉCOR is given by P(C/E)

According to Bayes’ theorem,

a) P(A/E) = {P(E/A) × P(A)} ÷ {P(E/A) × P(A) + P(E/B) × P(B) + P(E/C) × P(C)}

= {0.6 × 0.40} ÷ {0.6 × 0.40 + 0.8 × 0.35 + 0.5 × 0.25}

= 0.24 ÷ 0.645

= 0.372 (approx.)

b) P(B/E) = {P(E/B) × P(B)} ÷ {P(E/A) × P(A) + P(E/B) × P(B) + P(E/C) × P(C)}

= {0.8 × 0.35} ÷ {0.6 × 0.40 + 0.8 × 0.35 + 0.5 × 0.25}

= 0.28 ÷ 0.645

= 0.434 (approx.)

c) P(C/E) = {P(E/C) × P(C)} ÷ {P(E/A) × P(A) + P(E/B) × P(B) + P(E/C) × P(C)}

= {0.5 × 0.25} ÷ {0.6 × 0.40 + 0.8 × 0.35 + 0.5 × 0.25}

= 0.125 ÷ 0.645

= 0.194 (approx.)
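The same posterior probabilities can be reproduced in a few lines. This is a sketch only; the dictionary keys are illustrative labels for the three factors.

```python
# Bayes' theorem for the restaurant problem.
# priors: P(A), P(B), P(C); likelihoods: P(E/A), P(E/B), P(E/C).
priors = {"food": 0.40, "service": 0.35, "decor": 0.25}
likelihoods = {"food": 0.6, "service": 0.8, "decor": 0.5}

# Total probability of a discontented customer, P(E).
total = sum(priors[k] * likelihoods[k] for k in priors)

# Posterior P(factor/E) for each factor.
posteriors = {k: priors[k] * likelihoods[k] / total for k in priors}

for factor, prob in posteriors.items():
    print(factor, round(prob, 3))
```

The three posteriors sum to 1, which is a quick check that the common denominator was applied consistently.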

3. The monthly incomes of a group of 10,000 persons were found to be normally distributed with mean
equal to 15,000 and standard deviation equal to 1000. What is the lowest income among the
richest 250 persons?

Solution : Need to be checked……….
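One possible working, offered as a sketch rather than a checked solution: the richest 250 of 10,000 persons are the top 2.5% of the distribution, so the required income x satisfies P(X > x) = 0.025. From standard normal tables the corresponding z value is about 1.96, giving x ≈ 15,000 + 1.96 × 1,000 = 16,960. The script below repeats the calculation with Python's standard library; the variable names are illustrative.

```python
from statistics import NormalDist

# Monthly income modelled as Normal(mean = 15000, sd = 1000), as stated in the question.
income = NormalDist(mu=15000, sigma=1000)

# The richest 250 of 10,000 persons form the top 2.5% of the distribution.
top_fraction = 250 / 10000

# Lowest income among them: the (1 - 0.025) quantile.
threshold = income.inv_cdf(1 - top_fraction)
print(round(threshold))
```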

4. Write short notes on the following:

a. Test of goodness of fit
b. Critical Region of a test
c. Exponential Smoothing Method

Solution :

a. Test of goodness of fit

The goodness of fit of a statistical model describes how well it fits a set of observations. Measures
of goodness of fit typically summarize the discrepancy between observed values and the values
expected under the model in question. Such measures can be used in statistical hypothesis testing,
e.g. to test for normality of residuals, to test whether two samples are drawn from identical
distributions, or to test whether outcome frequencies follow a specified distribution. In the analysis of
variance, one of the components into which the variance is partitioned may be a lack-of-fit sum of
squares.

Fit of distributions

In assessing whether a given distribution is suited to a data-set, the following tests and their underlying
measures of fit can be used:

• Kolmogorov–Smirnov test;
• Cramér–von Mises criterion;
• Anderson–Darling test.

Regression analysis

In regression analysis, the following topics relate to goodness of fit:

• Coefficient of determination (the R-squared measure of goodness of fit);
• Lack-of-fit sum of squares.

One way in which a goodness-of-fit statistic can be constructed, in the case where the variance
of the measurement error is known, is to construct a weighted sum of squared errors:

χ² = ∑ (O − E)² / σ²

where σ² is the known variance of the observation, O is the observed data and E is the theoretical
(expected) data.

This definition is only useful when one has estimates for the error on the measurements, but it leads to a
situation where a chi-square distribution can be used to test goodness of fit, provided that the errors can
be assumed to have a normal distribution.

The reduced chi-squared statistic is simply the chi-squared divided by the number of degrees of freedom:

χ²_red = χ² / ν = (1/ν) ∑ (O − E)² / σ²

where ν is the number of degrees of freedom, usually given by N − n − 1, where N is the number of
observations, and n is the number of fitted parameters, assuming that the mean value is an additional
fitted parameter. The advantage of the reduced chi-squared is that it already normalizes for the number of
data points and model complexity.

As a rule of thumb, a large χ²_red (the reduced chi-squared) indicates a poor model fit. However,
χ²_red < 1 indicates that the model is 'over-fitting' the data (either the model is improperly fitting
noise, or the error variance has been over-estimated). A χ²_red > 1 indicates that the fit has not fully
captured the data (or that the error variance has been under-estimated). In principle, a value of
χ²_red = 1 indicates that the extent of the match between observations and estimates is in accord
with the error variance.
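As an illustration of the reduced chi-squared, the sketch below computes the weighted sum of squared errors divided by ν = N − n − 1. The function name and the data are hypothetical and chosen only to make the arithmetic easy to follow; a single known σ is assumed for every observation.

```python
def reduced_chi_squared(observed, expected, sigma, n_params):
    """Reduced chi-squared with nu = N - n_params - 1 degrees of freedom.

    Assumes every observation has the same known standard deviation sigma.
    """
    dof = len(observed) - n_params - 1
    if dof <= 0:
        raise ValueError("not enough observations for the number of fitted parameters")
    chi2 = sum(((o - e) / sigma) ** 2 for o, e in zip(observed, expected))
    return chi2 / dof


# Hypothetical data: six measurements compared with a one-parameter model.
observed = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
expected = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
result = reduced_chi_squared(observed, expected, sigma=0.1, n_params=1)
print(round(result, 3))
```

With these made-up numbers the statistic is about 2, which under the rule of thumb would suggest a fit that has not fully captured the data or an under-estimated error variance.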

Categorical data

The following are examples that arise in the context of categorical data.

Pearson's chi-square test

Pearson's chi-square test uses a measure of goodness of fit which is the sum of differences between
observed and expected outcome frequencies (that is, counts of observations), each squared and divided by
the expectation:

χ² = ∑_{i=1}^{n} (O_i − E_i)² / E_i

where the sum runs over the n bins and:

O_i = an observed frequency (i.e. count) for the ith bin

E_i = an expected (theoretical) frequency for the ith bin, asserted by the null hypothesis.

The resulting value can be compared to the chi-square distribution to determine the goodness of fit.
In order to determine the degrees of freedom of the chi-squared distribution, one takes the total
number of observed frequencies and subtracts one. For example, if there are eight different
frequencies, one would compare to a chi-squared with seven degrees of freedom.

Example: equal frequencies of men and women

For example, to test the hypothesis that a random sample of 100 people has been drawn from a population
in which men and women are equal in frequency, the observed number of men and women would be
compared to the theoretical frequencies of 50 men and 50 women. If there were 44 men in the sample and
56 women, then

χ² = (44 − 50)²/50 + (56 − 50)²/50 = 0.72 + 0.72 = 1.44

If the null hypothesis is true (i.e., men and women are chosen with equal probability in the sample),
the test statistic will be drawn from a chi-square distribution with one degree of freedom. Though
one might expect two degrees of freedom (one each for the men and women), we must take into
account that the total number of men and women is constrained (100), and thus there is only one
degree of freedom (2 − 1). Alternatively, if the male count is known the female count is determined,
and vice-versa.

Consultation of the chi-square distribution for 1 degree of freedom shows that the probability of
observing this difference (or a more extreme difference than this) if men and women are equally
numerous in the population is approximately 0.23. This probability is higher than conventional
criteria for statistical significance (0.001 to 0.05), so normally we would not reject the null hypothesis
that the number of men in the population is the same as the number of women (i.e. we would
consider our sample within the range of what we'd expect for a 50/50 male/female ratio).
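The example with 44 men and 56 women can be checked numerically. The sketch below uses only the standard library; it relies on the identity that, for one degree of freedom, the upper-tail probability of the chi-square distribution is erfc(√(x/2)), which is why no statistics package is needed here.

```python
import math

# Observed and expected counts for the men/women example.
observed = [44, 56]
expected = [50, 50]

# Pearson's statistic: sum of (O - E)^2 / E over the bins.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# With 1 degree of freedom, P(chi-square >= x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(round(chi2, 2), round(p_value, 2))
# prints 1.44 0.23
```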

Binomial case

A binomial experiment is a sequence of independent trials in which each trial can result in one of two
outcomes, success or failure. There are n trials, each with probability of success denoted by p. Provided
that n·p_i ≫ 1 for every i (where i = 1, 2, ..., k), then

χ² = ∑_{i=1}^{k} (N_i − n·p_i)² / (n·p_i)

where N_i is the observed count in cell i and n·p_i is the corresponding expected count.

This has approximately a chi-squared distribution with k − 1 df. The fact that df = k − 1 is a consequence
of the restriction ∑ N_i = n. We know there are k observed cell counts; however, once any k − 1 are
known, the remaining one is uniquely determined. Basically, one can say, there are only k − 1 freely
determined cell counts, thus df = k − 1.

Other measures of fit

The likelihood ratio test statistic is a measure of the goodness of fit of a model, judged by whether an
expanded form of the model provides a substantially improved fit.

b. Critical region of a test

A statistical hypothesis test is a method of making decisions using data, whether from a controlled
experiment or an observational study (not controlled). In statistics, a result is called statistically
significant if it is unlikely to have occurred by chance alone, according to a pre-determined threshold
probability, the significance level. The phrase "test of significance" was coined by Ronald Fisher:
"Critical tests of this kind may be called tests of significance, and when such tests are available we may
discover whether a second sample is or is not significantly different from the first." Hypothesis testing is
sometimes called confirmatory data analysis, in contrast to exploratory data analysis. In frequency
probability, these decisions are almost always made using null-hypothesis tests, i.e., tests that answer the
question: assuming that the null hypothesis is true, what is the probability of observing a value for the test
statistic that is at least as extreme as the value that was actually observed? One use of hypothesis testing
is deciding whether experimental results contain enough information to cast doubt on conventional
wisdom.

A result that was found to be statistically significant is also called a positive result; conversely, a result
whose probability under the null hypothesis exceeds the significance level is called a negative result or
a null result.

Statistical hypothesis testing is a key technique of frequentist statistical inference. The Bayesian approach
to hypothesis testing is to base rejection of the hypothesis on the posterior probability. Other approaches
to reaching a decision based on data are available via decision theory and optimal decisions.

The critical region of a hypothesis test is the set of all outcomes which, if they occur, will lead us to
decide that there is a difference; that is, they cause the null hypothesis to be rejected in favor of the
alternative hypothesis. The critical region is usually denoted by the letter C.

Clairvoyant card game

A person (the subject) is tested for clairvoyance. He is shown the reverse of a randomly chosen playing card
25 times and asked which suit it belongs to. The number of hits, or correct answers, is called X.

As we try to find evidence of his clairvoyance, for the time being the null hypothesis is that the person is
not clairvoyant. The alternative is, of course: the person is (more or less) clairvoyant.

If the null hypothesis is valid, the only thing the test person can do is guess. For every card, the
probability (relative frequency) of guessing correctly is 1/4. If the alternative is valid, the test subject will
predict the suit correctly with probability greater than 1/4. We will call the probability of guessing
correctly p. The hypotheses, then, are:

• null hypothesis H0: p = 1/4 (just guessing)

• alternative hypothesis H1: p > 1/4 (true clairvoyant).

When the test subject correctly predicts all 25 cards, we will consider him clairvoyant, and reject the null
hypothesis. Thus also with 24 or 23 hits. With only 5 or 6 hits, on the other hand, there is no cause to
consider him so. But what about 12 hits, or 17 hits? What is the critical number, c, of hits, at which point
we consider the subject to be clairvoyant? How do we determine the critical value c? It is obvious that
with the choice c=25 (i.e. we only accept clairvoyance when all cards are predicted correctly) we're more
critical than with c=10. In the first case almost no test subjects will be recognized to be clairvoyant, in the
second case, some number more will pass the test. In practice, one decides how critical one will be. That
is, one decides how often one accepts an error of the first kind - a false positive, or Type I error. With
c = 25 the probability of such an error is:

P(reject H0 | H0 is valid) = P(X = 25 | p = 1/4) = (1/4)^25 ≈ 10^−15

and hence, very small. The probability of a false positive is the probability of randomly guessing correctly
all 25 times.

Being less critical, with c = 10, gives:

P(reject H0 | H0 is valid) = P(X ≥ 10 | p = 1/4) = ∑_{k=10}^{25} P(X = k | p = 1/4) ≈ 0.07

Thus, c = 10 yields a much greater probability of false positive.

Before the test is actually performed, the desired probability of a Type I error is determined. Typically,
values in the range of 1% to 5% are selected. Depending on this desired Type I error rate, the critical
value c is calculated. For example, if we select an error rate of 1%, c is calculated thus:

P(reject H0 | H0 is valid) = P(X ≥ c | p = 1/4) ≤ 0.01

From all the numbers c with this property, we choose the smallest, in order to minimize the probability of
a Type II error, a false negative. For the above example, we select c = 13, since
P(X ≥ 13 | p = 1/4) ≈ 0.0034 while P(X ≥ 12 | p = 1/4) ≈ 0.0107 still exceeds 1%.

But what if the subject did not guess any cards correctly at all? Having zero correct answers is clearly an
oddity too. The probability of guessing a single card incorrectly is p' = (1 − p) = 3/4. Using the same
approach we can calculate that the probability of randomly calling all 25 cards wrong is:

P(X = 0 | H0 is valid) = (3/4)^25 ≈ 0.00075

This is highly unlikely (less than a 1 in 1000 chance). While the subject can't guess the cards correctly,
dismissing H0 in favour of H1 would be an error. In fact, the result would suggest a trait on the subject's
part of avoiding calling the correct card. A test of this could be formulated: for a selected 1% error rate
the subject would have to answer correctly at least twice for us to believe that card calling is based purely
on guessing.
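The tail probabilities in the card example follow from the binomial distribution with n = 25 and p = 1/4, so the critical value can be found by direct summation. The sketch below does this with the standard library; the function names are illustrative.

```python
from math import comb


def upper_tail(c, n=25, p=0.25):
    """P(X >= c) for X ~ Binomial(n, p), i.e. the Type I error rate for cut-off c."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c, n + 1))


def critical_value(alpha, n=25, p=0.25):
    """Smallest c with P(X >= c) <= alpha under the null hypothesis."""
    for c in range(n + 2):  # c = n + 1 gives an empty tail, so the loop always returns
        if upper_tail(c, n, p) <= alpha:
            return c


print(upper_tail(25))        # all 25 correct by pure guessing
print(upper_tail(10))        # the less critical cut-off c = 10
print(critical_value(0.01))  # cut-off for a 1% Type I error rate
```

Taking the smallest admissible c keeps the rejection region as large as the error rate allows, which is what minimizes the chance of a Type II error.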

c. Exponential Smoothing Method

Exponential smoothing is a technique that can be applied to time series data, either to produce
smoothed data for presentation, or to make forecasts. The time series data themselves are a
sequence of observations. The observed phenomenon may be an essentially random process, or it
may be an orderly, but noisy, process. Whereas in the simple moving average the past
observations are weighted equally, exponential smoothing assigns exponentially decreasing
weights over time.

Exponential smoothing is commonly applied to financial market and economic data, but it can be
used with any discrete set of repeated measurements. The raw data sequence is often represented
by {x_t}, and the output of the exponential smoothing algorithm is commonly written as {s_t},
which may be regarded as a best estimate of what the next value of x will be. When the sequence
of observations begins at time t = 0, the simplest form of exponential smoothing is given by the
formulae

s_1 = x_0
s_t = α x_{t−1} + (1 − α) s_{t−1},  t > 1

where α is the smoothing factor, and 0 < α < 1.

The simple moving average

Intuitively, the simplest way to smooth a time series is to calculate a simple, or unweighted, moving
average. The smoothed statistic s_t is then just the mean of the last k observations:

s_t = (1/k) ∑_{n=0}^{k−1} x_{t−n} = (x_t + x_{t−1} + ... + x_{t−k+1}) / k

where the choice of an integer k > 1 is arbitrary. A small value of k will have less of a smoothing effect
and be more responsive to recent changes in the data, while a larger k will have a greater smoothing
effect, and produce a more pronounced lag in the smoothed sequence. One disadvantage of this technique
is that it cannot be used on the first k −1 terms of the time series.
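A minimal sketch of the simple moving average follows. The function name and the sample series are illustrative; the output starts at t = k − 1 because the first k − 1 terms have no full window behind them.

```python
def simple_moving_average(xs, k):
    """Return s_t for t = k-1 .. len(xs)-1, each the mean of the last k observations."""
    if k < 1 or k > len(xs):
        raise ValueError("k must be between 1 and the length of the series")
    return [sum(xs[t - k + 1 : t + 1]) / k for t in range(k - 1, len(xs))]


series = [2, 4, 6, 8, 10]
print(simple_moving_average(series, 3))
# prints [4.0, 6.0, 8.0]
```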

The weighted moving average

A slightly more intricate method for smoothing a raw time series {x_t} is to calculate a weighted moving
average by first choosing a set of weighting factors

{w_1, w_2, ..., w_k} such that ∑_{n=1}^{k} w_n = 1

and then using these weights to calculate the smoothed statistics {s_t}:

s_t = ∑_{n=1}^{k} w_n x_{t+1−n} = w_1 x_t + w_2 x_{t−1} + ... + w_k x_{t−k+1}

In practice the weighting factors are often chosen to give more weight to the most recent terms in the time
series and less weight to older data. Notice that this technique has the same disadvantage as the simple
moving average technique (i.e., it cannot be used until at least k observations have been made), and that it
entails a more complicated calculation at each step of the smoothing procedure. In addition to this
disadvantage, if the data from each stage of the averaging is not available for analysis, it may be difficult
if not impossible to reconstruct a changing signal accurately (because older samples may be given less
weight). If the number of stages missed is known, however, the weighting of values in the average can be
adjusted to give equal weight to all missed samples to avoid this issue.
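The weighted moving average can be sketched in the same way. Here the first weight is assumed to apply to the most recent observation, and the weights and series are hypothetical values chosen for easy checking.

```python
def weighted_moving_average(xs, weights):
    """Return s_t for t = k-1 .. len(xs)-1.

    weights[0] multiplies the most recent observation x_t, weights[1] multiplies
    x_{t-1}, and so on. The weights must sum to 1.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    k = len(weights)
    return [
        sum(w * xs[t - n] for n, w in enumerate(weights))
        for t in range(k - 1, len(xs))
    ]


series = [2, 4, 6, 8, 10]
smoothed = weighted_moving_average(series, [0.5, 0.3, 0.2])
print([round(s, 2) for s in smoothed])
# prints [4.6, 6.6, 8.6]
```

Because the largest weight sits on the newest value, the smoothed series lags the raw data less than the unweighted average over the same window.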


The exponential moving average

The simplest form of exponential smoothing is given by the formulae

s_1 = x_0
s_t = α x_{t−1} + (1 − α) s_{t−1} = s_{t−1} + α (x_{t−1} − s_{t−1}),  t > 1

where α is the smoothing factor, and 0 < α < 1. In other words, the smoothed statistic s_t is a simple
weighted average of the previous observation x_{t−1} and the previous smoothed statistic s_{t−1}. The
term smoothing factor applied to α here is something of a misnomer, as larger values of α actually
reduce the level of smoothing. In the limiting case with α = 1 the output series is just the same as the
original series, delayed by one time step. Simple exponential smoothing is easily applied, and it
produces a smoothed statistic as soon as two observations are available.

Values of α close to one have less of a smoothing effect and give greater weight to recent changes in
the data, while values of α closer to zero have a greater smoothing effect and are less responsive to
recent changes. There is no formally correct procedure for choosing α. Sometimes the statistician's
judgment is used to choose an appropriate factor. Alternatively, a statistical technique may be used
to optimize the value of α. For example, the method of least squares might be used to determine the
value of α for which the sum of the quantities (s_{n−1} − x_{n−1})² is minimized.

Unlike the moving averages, this technique does not have to wait for a minimum number of
observations before it can be used, though in practice a "good average" will not be achieved
until several samples have been averaged together (a constant signal will take
approximately 3/α stages to reach 95% of the actual value). To accurately reconstruct the original
signal without information loss all stages of the exponential moving average must also be available
(because older samples decay in weighting exponentially). In the simple moving average some
samples can be skipped without as much loss of information, due to the constant weighting of
samples within the average. If a known number of samples will be missed, a weighted average can
be adjusted for this as well, by giving equal weight to the new sample and all those to be skipped.

This simple form of exponential smoothing is also known as an exponentially weighted moving
average (EWMA). Technically it can also be classified as an Autoregressive integrated moving
average (ARIMA) (0,1,1) model with no constant term.

By direct substitution of the defining equation for simple exponential smoothing back into itself we find that

s_t = α x_{t−1} + (1 − α) s_{t−1}
    = α x_{t−1} + α (1 − α) x_{t−2} + (1 − α)² s_{t−2}
    = α [x_{t−1} + (1 − α) x_{t−2} + (1 − α)² x_{t−3} + ... + (1 − α)^{t−2} x_1] + (1 − α)^{t−1} x_0

In other words, as time passes the smoothed statistic s_t becomes the weighted average of a greater and
greater number of the past observations x_{t−n}, and the weights assigned to previous observations are in
general proportional to the terms of the geometric progression {1, (1 − α), (1 − α)², (1 − α)³, …}.
A geometric progression is the discrete version of an exponential function, so this is where the name for
this smoothing method originated.
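Simple exponential smoothing can be sketched as a short recursion. The code assumes the lagged convention described above, in which the smoothed statistic is a weighted average of the previous observation and the previous smoothed value, starting from s_1 = x_0; the function name and the sample series are illustrative.

```python
def exponential_smoothing(xs, alpha):
    """Return [s_1, ..., s_n] with s_1 = x_0 and
    s_t = alpha * x_{t-1} + (1 - alpha) * s_{t-1} for t > 1.

    s_t uses observations up to x_{t-1}, so it acts as a forecast of x_t.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must satisfy 0 < alpha <= 1")
    if not xs:
        return []
    s = [xs[0]]  # s_1 = x_0
    for x in xs[1:]:
        s.append(alpha * x + (1 - alpha) * s[-1])
    return s


series = [10, 12, 14, 16]
print(exponential_smoothing(series, 0.5))
# prints [10, 11.0, 12.5, 14.25]
```

The last value can be checked against the geometric weights: 0.5 × (16 + 0.5 × 14 + 0.25 × 12) + 0.125 × 10 = 14.25.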

Double exponential smoothing

Simple exponential smoothing does not do well when there is a trend in the data.

In such situations, double exponential smoothing can be used.

Again, the raw data sequence of observations is represented by {x_t}, beginning at time t = 0. We use {s_t}
to represent the smoothed value for time t, and {b_t} is our best estimate of the trend at time t. The output
of the algorithm is now written as F_{t+m}, an estimate of the value of x at time t + m, m > 0, based on the raw
data up to time t. Double exponential smoothing is given by the formulas

s_0 = x_0
s_t = α x_t + (1 − α)(s_{t−1} + b_{t−1}),  t > 0
b_t = β (s_t − s_{t−1}) + (1 − β) b_{t−1},  t > 0
F_{t+m} = s_t + m b_t

where α is the data smoothing factor, 0 < α < 1, β is the trend smoothing factor, 0 < β < 1, and b_0 is taken
as (x_{n−1} − x_0)/(n − 1) for some n > 1. Note that F_0 is undefined (there is no estimation for time 0), and
according to the definition F_1 = s_0 + b_0, which is well defined, thus further values can be evaluated.
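Double exponential smoothing can be sketched as follows, assuming the usual Holt recursions s_0 = x_0, s_t = α x_t + (1 − α)(s_{t−1} + b_{t−1}), b_t = β (s_t − s_{t−1}) + (1 − β) b_{t−1} and the forecast F_{t+m} = s_t + m b_t. The function names and the n_init parameter (the n used to initialise b_0) are illustrative choices.

```python
def double_exponential_smoothing(xs, alpha, beta, n_init=2):
    """Return (s, b): smoothed levels and trend estimates for t = 0 .. len(xs)-1.

    b_0 is initialised as (x_{n-1} - x_0) / (n - 1) with n = n_init.
    """
    if n_init < 2 or n_init > len(xs):
        raise ValueError("n_init must be between 2 and the length of the series")
    s = [xs[0]]
    b = [(xs[n_init - 1] - xs[0]) / (n_init - 1)]
    for t in range(1, len(xs)):
        s_t = alpha * xs[t] + (1 - alpha) * (s[-1] + b[-1])
        b_t = beta * (s_t - s[-1]) + (1 - beta) * b[-1]
        s.append(s_t)
        b.append(b_t)
    return s, b


def forecast(s, b, t, m):
    """F_{t+m} = s_t + m * b_t, the estimate of x at time t + m."""
    return s[t] + m * b[t]


# A perfectly linear series: the level tracks the data and the trend stays at 2.
series = [1, 3, 5, 7, 9]
levels, trends = double_exponential_smoothing(series, alpha=0.5, beta=0.5)
print(levels, trends, forecast(levels, trends, 4, 1))
```

On a series with a constant slope the recursion reproduces the data exactly, which illustrates why the trend term helps where simple exponential smoothing would lag behind.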


