
Drive: Fall 2015
Program: MBA
Semester: 1st
Subject Code: MB0040
Subject Name: Statistics for Management

Q1- Statistics plays a vital role in almost every facet of human life. Describe the
functions of Statistics. Explain the applications of statistics.
Ans:
Meaning of Statistics: The practice or science of collecting and analysing numerical data in
large quantities, especially for the purpose of inferring proportions in a whole from those in a
representative sample.
Functions of statistics:
1. Presents facts in simple form: Statistics presents facts and figures in a definite form. That
makes the statement more logical and convincing than a mere description. It condenses a whole
mass of figures into a single figure, which makes the problem intelligible.
2. Reduces the Complexity of data: Statistics simplifies the complexity of data. The raw
data are unintelligible. We make them simple and intelligible by using different statistical
measures. Some commonly used measures are graphs, averages, dispersion, skewness,
kurtosis, correlation and regression. These measures help in interpretation and drawing
inferences. Therefore, statistics enables one to enlarge the horizon of one's knowledge.
3. Facilitates comparison: Comparison between different sets of observations is an important
function of statistics. Comparison is necessary to draw conclusions, as Professor Boddington
rightly points out: the object of statistics is to enable comparison between past and present
results, to ascertain the reasons for changes which have taken place and the effect of such
changes in the future. So, to determine the efficiency of any measure, comparison is necessary.
Statistical devices like averages, ratios and coefficients are used for the purpose of
comparison.
4. Testing hypothesis: Formulating and testing of hypothesis is an important function of
statistics. This helps in developing new theories. So statistics examines the truth and helps in
innovating new ideas.
5. Formulation of Policies: Statistics helps in formulating plans and policies in different
fields. Statistical analysis of data forms the beginning of policy formulations. Hence,
statistics is essential for planners, economists, scientists and administrators to prepare
different plans and programmes.
6. Forecasting: The future is uncertain. Statistics helps in forecasting the trend and
tendencies. Statistical techniques are used for predicting the future values of a variable. For
example, a producer forecasts his future production on the basis of the present demand
conditions and his past experiences. Similarly, the planners can forecast the future population
etc. considering the present population trends.
7. Derives valid inferences: Statistical methods mainly aim at deriving inferences from an
enquiry. Statistical techniques are often used by scholars, planners and scientists to evaluate
different projects. These techniques are also used to draw inferences regarding population
parameters on the basis of sample information.

Applications of statistics:
Actuarial science is the discipline that applies mathematical and statistical methods to
assess risk in the insurance and finance industries.
Astrostatistics is the discipline that applies statistical analysis to the understanding of
astronomical data.
Biostatistics is a branch of biology that studies biological phenomena and
observations by means of statistical analysis, and includes medical statistics.
Business analytics is a rapidly developing business process that applies statistical
methods to data sets (often very large) to develop new insights and understanding of
business performance and opportunities.
Chemometrics is the science of relating measurements made on a chemical system or
process to the state of the system via application of mathematical or statistical
methods.
Demography is the statistical study of all populations. It can be a very general science
that can be applied to any kind of dynamic population, that is, one that changes over
time or space.
Econometrics is a branch of economics that applies statistical methods to the
empirical study of economic theories and relationships.
Environmental statistics is the application of statistical methods to environmental
science. Weather, climate, air and water quality are included, as are studies of plant
and animal populations.
Epidemiology is the study of factors affecting the health and illness of populations,
and serves as the foundation and logic of interventions made in the interest of public
health and preventive medicine.
Geostatistics is a branch of geography that deals with the analysis of data from
disciplines such as petroleum geology, hydrogeology, hydrology, meteorology,
oceanography, geochemistry and geography.
Machine learning is the discipline concerned with algorithms that improve
automatically through experience, built largely on statistical methods.
Operations research (or Operational Research) is an interdisciplinary branch of
applied mathematics and formal science that uses methods such as mathematical
modelling, statistics, and algorithms to arrive at optimal or near optimal solutions to
complex problems.
Population ecology is a sub-field of ecology that deals with the dynamics of species
populations and how these populations interact with the environment.
Psychometric is the theory and technique of educational and psychological
measurement of knowledge, abilities, attitudes, and personality traits.
Quality control reviews the factors involved in manufacturing and production; it can
make use of statistical sampling of product items to aid decisions in process control or
in accepting deliveries.
Quantitative psychology is the science of statistically explaining and changing mental
processes and behaviours in humans.
Reliability Engineering is the study of the ability of a system or component to perform
its required functions under stated conditions for a specified period of time.
Statistical finance, an area of econophysics, is an empirical attempt to shift finance
from its normative roots to a positivist framework using exemplars from statistical
physics with an emphasis on emergent or collective properties of financial markets.

Statistical mechanics is the application of probability theory, which includes
mathematical tools for dealing with large populations, to the field of mechanics,
which is concerned with the motion of particles or objects when subjected to a force.
Statistical physics is one of the fundamental theories of physics, and uses methods
of probability theory in solving physical problems.

Q2- a) Explain the approaches to define probability.
b) State the addition and multiplication rules of probability giving an example of
each case.

Ans:
a):
1. The classical definition: Let the sample space (denoted by S) be the set of all possible
distinct outcomes of an experiment. The probability of an event A is
P(A) = (number of outcomes in A) / (total number of outcomes in S),
provided all points in S are equally likely. For example, when a die is rolled, the probability
of getting a 2 is 1/6, because one of the six faces is a 2.

2. The relative frequency definition: The probability of an event is the proportion (or
fraction) of times the event occurs in a very long (theoretically infinite) series of repetitions
of an experiment or process. For example, this definition could be used to argue that the
probability of getting a 2 from a rolled die is 1/6, since in a very long series of rolls a 2
appears in about one-sixth of the trials.

3. The subjective probability definition: The probability of an event is a measure of how
sure the person making the statement is that the event will happen. For example, after
considering all available data, a weather forecaster might say that the probability of rain
today is 30% or 0.3.
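To illustrate the relative frequency definition, here is a minimal Python sketch (the number of trials and the seed are arbitrary choices of ours, not from the text) showing the observed proportion of 2s approaching the classical value of 1/6:

```python
import random

random.seed(42)          # arbitrary seed, only for reproducibility
trials = 100_000         # a "very long" series of repetitions
count_twos = sum(1 for _ in range(trials) if random.randint(1, 6) == 2)

relative_frequency = count_twos / trials
print(f"Relative frequency of rolling a 2: {relative_frequency:.4f}")
print(f"Classical probability:             {1/6:.4f}")
# The two values agree closely, illustrating how the relative frequency
# definition converges to the classical one for equally likely outcomes.
```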
b):
The Addition Rule of Probability is a rule for finding the probability of the union of two
events, whether they are mutually exclusive or non-mutually exclusive. Consider two
scenarios from a board game: Cheyenne must roll a die and get either a 3 or a 6, and later
pick a card from a deck that is black or a seven. Each of these scenarios represents an event
in probability. The first event, rolling the die on either a 3 or a 6, involves mutually
exclusive events, which are events that cannot happen at the same time. The second event,
picking a card out of a deck that is black or a seven, is an example of non-mutually
exclusive events, which are events that can happen separately or at the same time.
To find the probability of mutually exclusive events, follow these steps:
1. Find the total of possible outcomes
2. Find the desired outcomes
3. Create a ratio for each event
4. Add the ratios, or fractions, of each event
First, find the total of possible outcomes. A six-sided die has six faces, so there are six
possible outcomes.
Second, find the desired outcomes. Cheyenne needs to roll a 3 or a 6, so a 3 or a 6 would be
the desired outcome. A 3 appears on a six-sided die once, and so does a 6.
Third, create a ratio for each event. The first event, rolling a 3, would be a ratio of 1/6,
because the die has only one side with three dots. The second event, rolling a 6, would also
be a ratio of 1/6, because the die has only one side with six dots.
Fourth, add the ratios, or fractions, of each event. This step gives the probability of rolling
a die and getting a 3 or a 6:
1/6 + 1/6 = 2/6 or 1/3
Therefore, Cheyenne has a 1 in 3 chance of rolling a 3 or a 6. Once she rolls a 3 or a 6,
Cheyenne can land on a space that allows her to pick a card. She needs to pick a black card,
or a seven.
Non-Mutually Exclusive Events: Picking a black card or a seven out of a deck of regular
playing cards is an example of non-mutually exclusive events. The probability of both
events happening at the same time is called the intersection of the two events. The addition
rule in this case states that the probability of event A or B is equal to the probability of
event A plus the probability of event B minus the probability of events A and B together:
P(A or B) = P(A) + P(B) - P(A and B)
To find the probability of non-mutually exclusive events, follow these steps:
1. Find the total of possible outcomes
2. Find the desired outcomes
3. Create a ratio for each event
4. Add the ratios, or fractions, of each event
5. Subtract the overlap of the two events
First, the total number of possible outcomes of a deck of regular playing cards is 52, since
there are 52 cards in a regular deck.
Second, find the desired outcomes. Cheyenne needs to select a black card or a seven card.
Therefore, a black or a 7 card would be the desired outcome. There are two suits that are
black cards: spades and clubs. There are 13 cards for each suit. Therefore, the desired
outcome of possibilities for a black card is 26. There are four sevens in a regular deck of
playing cards, one seven for each suit. Therefore, the desired outcome possibilities for a
seven card are 4.
Third, create a ratio for each event. The first event, selecting a black card, would be a ratio of
26/52. The second event, selecting a seven card, would be a ratio of 4/52. I got these ratios by
using the desired outcome number as the numerator and the total possible outcomes as the
denominator.
Fourth, add the ratios, or fractions, of each event like this:
26/52 + 4/52 = 30/52
Fifth, subtract the overlap of the two events. Two cards are counted twice, since the seven
of spades and the seven of clubs are both black and sevens, so subtract 2/52:
30/52 - 2/52 = 28/52 = 7/13
Therefore, Cheyenne has a 7 in 13 chance of picking a black card or a seven.
The Multiplication Rule of Probability gives the probability that two events both occur. For
independent events, P(A and B) = P(A) × P(B). For example, the probability of rolling a 3
on one roll of a die and then a 6 on a second roll is 1/6 × 1/6 = 1/36. (For dependent events,
the rule becomes P(A and B) = P(A) × P(B given A).)
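Both rules can be checked numerically. Below is a minimal Python sketch (the event definitions are our own illustrations of the examples above, not from the text) verifying the addition rule on the card and die examples and the multiplication rule on two independent rolls:

```python
from fractions import Fraction

# Addition rule, non-mutually exclusive events: P(black or seven)
deck_size = 52
p_black = Fraction(26, deck_size)            # 26 black cards
p_seven = Fraction(4, deck_size)             # 4 sevens
p_black_and_seven = Fraction(2, deck_size)   # seven of spades, seven of clubs

p_black_or_seven = p_black + p_seven - p_black_and_seven
print(p_black_or_seven)   # 7/13, i.e. 28/52

# Addition rule, mutually exclusive events: P(roll a 3 or a 6)
p_3_or_6 = Fraction(1, 6) + Fraction(1, 6)
print(p_3_or_6)           # 1/3

# Multiplication rule, independent events: P(3 on first roll and 6 on second)
p_3_then_6 = Fraction(1, 6) * Fraction(1, 6)
print(p_3_then_6)         # 1/36
```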
Q3- a) The procedure of testing a hypothesis requires a researcher to adopt several steps.
Describe in brief all such steps.
b) Explain the components of time series.

Ans:
a):
Step 1: State the Null Hypothesis.
The null hypothesis can be thought of as the opposite of the "guess" the researcher made (in
this example the biologist thinks the plant height will be different for the fertilizers). So the
null would be that there will be no difference among the groups of plants. Specifically, in
more statistical language, the null for an ANOVA is that the means are the same. We state
the null hypothesis as:
H0: μ1 = μ2 = ... = μk
Step 2: State the Alternative Hypothesis.
HA: treatment level means not all equal
The reason we state the alternative hypothesis this way is that if the null is rejected, there are
many possibilities. For example, μ1 ≠ μ2 = ... = μk is one possibility, as
is μ1 = μ2 ≠ μ3 = ... = μk. Many people make the mistake of stating the alternative
hypothesis as μ1 ≠ μ2 ≠ ... ≠ μk, which says that every mean differs from every other mean.
This is a possibility, but only one of many. To cover all alternative outcomes, we state the
alternative simply as "the treatment level means are not all equal".
Step 3: Set α.
α is the significance level: the probability of a Type I error we are willing to accept. If we
look at what can happen in a hypothesis test, we can construct the following contingency
table:

                        In Reality
Decision      H0 is TRUE                          H0 is FALSE
Accept H0     OK                                  Type II Error
                                                  (β = probability of Type II Error)
Reject H0     Type I Error                        OK
              (α = probability of Type I Error)
Step 4: Collect Data


Remember the importance of recognizing whether the data were collected through an
experimental design or an observational study.
Step 5: Calculate a test statistic.
For categorical treatment level means, we use an F statistic, named after R.A. Fisher. The F
value we get from the data is labelled Fcalculated.

Step 6: Construct Acceptance / Rejection regions.

As with all other test statistics, a threshold (critical) value of F is established. This F value
can be obtained from statistical tables and is referred to as Fcritical or Fα. As a reminder,
this critical value is the minimum value of the test statistic (in this case the F statistic) for us
to be able to reject the null. The rejection region lies in the right tail of the F distribution,
beyond Fα; values below Fα fall in the acceptance region.

Step 7: Based on steps 5 and 6, draw a conclusion about H0.

If the Fcalculated from the data is larger than Fα, then you are in the rejection region and you
can reject the null hypothesis with a (1 − α) level of confidence.
Note that modern statistical software condenses steps 6 and 7 by providing a p-value. The
p-value here is the probability of getting an Fcalculated even greater than what you observe. If,
by chance, Fcalculated = Fα, then the p-value would exactly equal α. With
larger Fcalculated values, we move further into the rejection region and the p-value becomes less
than α. So the decision rule is as follows:
If the p-value obtained from the ANOVA is less than α, then reject H0 and accept HA.
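As a minimal sketch of steps 5 to 7 (assuming SciPy is available; the degrees of freedom, α and the calculated F shown here are placeholder values of ours, not from the text), the critical value and p-value can be read off the F distribution:

```python
from scipy import stats

alpha = 0.05                     # step 3: significance level (placeholder choice)
df_between, df_within = 2, 12    # placeholder degrees of freedom

# Step 6: critical value F_alpha from the F distribution
f_critical = stats.f.ppf(1 - alpha, df_between, df_within)

# Step 7: compare the calculated statistic with F_alpha,
# or equivalently compare the p-value with alpha
f_calculated = 4.0               # placeholder value of the test statistic
p_value = stats.f.sf(f_calculated, df_between, df_within)

print(f"F critical = {f_critical:.2f}, p-value = {p_value:.4f}")
if p_value < alpha:
    print("Reject H0 and accept HA")
else:
    print("Fail to reject H0")
```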
b): The four components of time series are:
1. Secular trend
2. Seasonal variation
3. Cyclical variation
4. Irregular variation
Secular trend: A time series may show an upward or downward trend over a period of
years, due to factors like increase in population, technological progress, or large-scale shifts
in consumers' demands. For example, population increases over a period of time, prices
increase over a period of years, and the production of goods in the capital market of the
country increases over a period of years.
Seasonal variation: Seasonal variations are short-term fluctuations in a time series which
occur periodically within a year and repeat year after year. The major factors responsible
for this repetitive pattern are weather conditions and the customs of people. More woollen
clothes are sold in winter than in summer; regardless of the trend, we can observe that in
each year more ice creams are sold in summer and very few in winter. Sales in departmental
stores are higher during festive seasons than on normal days.
Cyclical variation: Cyclical variations are recurrent upward or downward movements in a
time series, but the period of the cycle is greater than a year. These variations are also not
as regular as seasonal variations. There are different types of cycles, varying in length and
size. The ups and downs in business activities are the effects of cyclical variation. A
business cycle showing these oscillatory movements passes through four phases: prosperity,
recession, depression and recovery. In a business, these four phases are completed by
passing from one to another in this order.
Irregular variation: Irregular variations are fluctuations in a time series that are short in
duration, erratic in nature, and follow no regular pattern of occurrence. They are also
referred to as residual variations, since by definition they represent what is left in a time
series after trend, cyclical and seasonal variations have been accounted for. Irregular
fluctuations result from the occurrence of unforeseen events.
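These components can be separated in practice. Below is a minimal sketch using statsmodels' seasonal_decompose (the monthly sales figures are invented for illustration; note that classical decomposition folds the cyclical component into the trend):

```python
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Invented monthly sales series: upward trend plus a yearly seasonal bump
index = pd.date_range("2010-01", periods=48, freq="MS")
sales = [100 + 2 * t + 15 * ((t % 12) in (4, 5, 6)) for t in range(48)]
series = pd.Series(sales, index=index)

# Additive decomposition: series = trend + seasonal + residual (irregular)
result = seasonal_decompose(series, model="additive", period=12)
print(result.trend.dropna().head())    # secular trend (with any cycle folded in)
print(result.seasonal.head(12))        # repeating seasonal pattern
print(result.resid.dropna().head())    # irregular (residual) variation
```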
Q4- a) What is a Chi-square test? Point out its applications. Under what conditions is
this test applicable?
b) Discuss the types of measurement scales with examples.
Ans:
a): Chi-square: A statistical method assessing the goodness of fit between a set of observed
values and those expected theoretically.
Applications:
The Chi-square test for categorical variables determines whether there is a difference in the
population proportions between two or more groups. In the medical literature, the Chi-square
test is used most commonly to compare the incidence (or proportion) of a characteristic in one
group to the incidence (or proportion) of a characteristic in other group(s).
For example, you might use the Chi-square test to compare the incidence of PONV
(postoperative nausea and vomiting) between patients that received ondansetron, patients
that received droperidol, and patients that received a placebo.
This approach consists of four steps:
1. State the hypotheses: Every hypothesis test requires the analyst to state a null
hypothesis (H0) and an alternative hypothesis (Ha). The hypotheses are stated in such
a way that they are mutually exclusive. That is, if one is true, the other must be false;
and vice versa.
2. Formulate an analysis plan: The analysis plan describes how to use sample data to
accept or reject the null hypothesis. The plan should specify the following elements:
Significance level. Often, researchers choose significance levels equal to 0.01, 0.05,
or 0.10; but any value between 0 and 1 can be used.
Test method. Use the chi-square goodness of fit test to determine whether observed
sample frequencies differ significantly from the expected frequencies specified in the
null hypothesis.

3. Analyze sample data: Using sample data, find the degrees of freedom, expected
frequency counts, test statistic, and the P-value associated with the test statistic.
Degrees of freedom. The degrees of freedom (DF) is equal to the number of
levels (k) of the categorical variable minus 1: DF = k - 1
Expected frequency counts. The expected frequency counts at each level of the
categorical variable are equal to the sample size times the hypothesized
proportion from the null hypothesis Ei = npi
Where Ei is the expected frequency count for the ith level of the categorical
variable, n is the total sample size, and pi is the hypothesized proportion of
observations in level i.
Test statistic. The test statistic is a chi-square random variable (χ2) defined by
the following equation:
χ2 = Σ [ (Oi − Ei)2 / Ei ]
where Oi is the observed frequency count for the ith level of the categorical
variable, and Ei is the expected frequency count for the ith level of the
categorical variable.
P-value. The P-value is the probability of observing a sample statistic as
extreme as the test statistic. Since the test statistic is a chi-square, use a
chi-square distribution table or calculator, with the degrees of freedom
computed above, to assess the probability associated with the test statistic.
4. Interpret results: If the sample findings are unlikely given the null hypothesis, the
researcher rejects the null hypothesis. Typically, this involves comparing the P-value
to the significance level, and rejecting the null hypothesis when the P-value is less
than the significance level.
Conditions for applicability: the Chi-square test applies to categorical (frequency or count)
data; the observations should be independent and drawn by random sampling; the categories
should be mutually exclusive; and the expected frequency in each cell should generally be at
least 5.
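As a minimal sketch of the goodness-of-fit computation (assuming SciPy is available; the observed counts and the fair-die hypothesis are invented for illustration):

```python
from scipy.stats import chisquare

# Invented example: 60 die rolls, testing H0 that the die is fair
observed = [12, 8, 9, 11, 6, 14]      # Oi: observed counts per face (sum = 60)
expected = [60 * (1 / 6)] * 6         # Ei = n * pi = 10 per face

# chi2 = sum((Oi - Ei)^2 / Ei), with DF = k - 1 = 5
statistic, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {statistic:.2f}, p-value = {p_value:.4f}")
# Reject H0 at significance level alpha when p_value < alpha.
```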
b): Types of measurement scales:
There are four measurement scales (or types of data): nominal, ordinal, interval and ratio.
These are simply ways to categorize different types of variables
Nominal- Nominal scales could simply be called labels. The categories are mutually
exclusive (no overlap) and none of them has any numerical significance. A good way to
remember this is that "nominal" sounds a lot like "name", and nominal scales are kind of
like names or labels.
Ordinal- With ordinal scales, it is the order of the values that is important and significant,
but the differences between them are not really known. In each case, we know that a #4 is
better than a #3 or #2, but we don't know, and cannot quantify, how much better it is. For
example, is the difference between "OK" and "Unhappy" the same as the difference between
"Very Happy" and "Happy"? We can't say. Ordinal scales are typically measures of
non-numeric concepts like satisfaction, happiness, discomfort, etc. Ordinal is easy to
remember because it sounds like "order", and that's the key to remember with ordinal
scales: it is the order that matters, but that's all you really get from these.
Interval- Interval scales are numeric scales in which we know not only the order, but also the
exact differences between the values. The classic example of an interval scale
is Celsius temperature because the difference between each value is the same. For example,
the difference between 60 and 50 degrees is a measurable 10 degrees, as is the difference
between 80 and 70 degrees. Time is another good example of an interval scale in which
the increments are known, consistent, and measurable. Interval scales are nice because the

realm of statistical analysis on these data sets opens up. For example, central tendency can be
measured by mode, median, or mean; standard deviation can also be calculated.
Ratio- Ratio scales are the ultimate nirvana when it comes to measurement scales, because
they tell us about the order, they tell us the exact value between units, and they also have an
absolute zero, which allows a wide range of both descriptive and inferential statistics to be
applied. At the risk of repeating myself, everything above about interval data applies to ratio
scales, and in addition ratio scales have a clear definition of zero. Good examples of ratio
variables include height and weight.
Q5- Business forecasting acquires an important place in every field of the economy.
Explain the objectives and theories of Business forecasting.
Ans: Business forecasting refers to the analysis of past and present economic conditions with
the object of drawing inferences about probable future business conditions. The process of
making definite estimates of the future course of events is referred to as forecasting, and the
figures or statements obtained from the process are known as forecasts; the future course of
events is rarely known with certainty.
The Objectives of Business Forecasting:
In the narrow sense, the objective of forecasting is to produce better forecasts. But in the
broader sense, the objective is to improve organizational performance: more revenue, more
profit, increased customer satisfaction. Better forecasts, by themselves, are of no inherent
value if they are ignored by management or otherwise not used to improve organizational
performance.
Theories of Business Forecasting:
1. Sequence or time-lag theory: This is the most important theory of business forecasting.
It is based on the assumption that most business data exhibit a lead-lag relationship,
i.e., changes in business are successive and not simultaneous. There is a time-lag
between different movements; for example, expenditure on advertisement may not at
once lead to an increase in sales. Similarly, when the government makes use of deficit
financing, it leads to inflationary pressure: the purchasing power of people goes up,
then wholesale prices rise, and then retail prices start increasing. With the rise in
retail prices, the cost of living goes up, and with it comes a demand for increased
wages. Thus one factor, more money in circulation, has affected different fields of
economic activity not simultaneously but successively. Similarly, when excise duties
are increased by the government, they result in increases in prices, which lead to
higher demand for wages.
2. Action and reaction theory: This theory is based on two assumptions: every action has
a reaction, and the magnitude of the original action influences the reaction. Thus, if
the price of rice has gone up above a certain level in a certain time period, there is a
likelihood that after some time it will go down below the normal level. According to
this theory, a certain level of business activity is normal; sub-normal or abnormal
conditions cannot remain so forever, and there is bound to be a reaction to them.
Hence we find four phases of a business cycle: prosperity, decline, depression and
improvement.
3. Economic rhythm theory: The basic assumption of this theory is that history repeats
itself, and hence the exponents of this theory believe that economic phenomena
behave in a rhythmic order: cycles of nearly similar intensity and duration tend to
recur. Thus, the available historical data have to be analyzed into their component
parts, and the various types of fluctuations influencing them have to be segregated. A
trend is then obtained that represents the long-term tendency of growth or decline.
This trend line is projected a number of years into the future, either by the freehand
technique or by the mathematical technique (see the sketch after this list), on the
assumption that the trend line represents the normal growth or decline of the series.
4. Specific historical analogy: This theory is based on a more realistic assumption: that
all business cycles are not uniform in amplitude or duration. As such, use of history
is made not by projecting any economic rhythm into the future, but by selecting some
specific previous situation which has many of the earmarks of the present, and
concluding that what happened in the earlier situation will happen in the present one
also. A time series relating to the data in question is thoroughly scrutinized, and from
it a period is selected in which conditions were similar to those prevailing at the time
of making the forecast, to get an idea of the likely course the phenomenon in question
would follow. For example, after a world war many people forecast a depression,
because a depression had followed the earlier world war.
5. Cross-section analysis: This theory is based on the knowledge and interpretation of
current forces rather than the projection of past trends. The theory assumes that no
two cycles are similar, but that like causes always give like results. All the factors
bearing upon a given condition are assembled and, relying upon the knowledge of
economic processes, the forecaster concludes whether the condition is favourable or
not. Immediate recognition is given to the fact that business conditions are shaped by
simultaneous inflationary and deflationary forces: the predominance of inflationary
forces results in booms, whereas the predominance of deflationary forces leads to
depression. The forecaster who utilizes this technique enumerates the inflationary,
deflationary and stable forces separately and weighs them on the basis of judgment.
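The "mathematical technique" of trend projection mentioned under the economic rhythm theory can be sketched with a least-squares fit (the yearly figures below are invented for illustration, not from the text):

```python
import numpy as np

# Invented yearly production figures showing a long-term upward tendency
years = np.array([2008, 2009, 2010, 2011, 2012, 2013, 2014])
values = np.array([110, 118, 121, 130, 138, 141, 150])

# Fit a straight-line (secular) trend: value = slope * year + intercept
slope, intercept = np.polyfit(years, values, deg=1)

# Project the trend a number of years into the future
for year in (2015, 2016, 2017):
    print(f"{year}: forecast = {slope * year + intercept:.1f}")
# This assumes the fitted trend line represents the normal growth of the series.
```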

Q6- a) What is analysis of variance? What are the assumptions of this technique?
b) Three samples below have been obtained from normal populations with equal
variances. Test the hypothesis at 5% level that the population means are equal.
A     B     C
8     7     12
10    5     9
7     10    13
14    9     12
11    9     14

[The table value of F at 5% level of significance for v1 = 2 and v2 = 12 is 3.88]

Ans:
a):
Analysis of variance (ANOVA): It is a collection of statistical models used to analyze the
differences among group means and their associated procedures (such as "variation" among
and between groups)
Assumptions for ANOVA:
1. Each group sample is drawn from a normally distributed population.
2. All populations have a common variance.
3. All samples are drawn independently of each other.
4. Within each sample, the observations are sampled randomly and independently of
each other.
5. The effects of the various components are additive.
b): Solution:
Let H0: There is no significant difference in the means of the three samples.

X1           X2           X3
8            7            12
10           5            9
7            10           13
14           9            12
11           9            14
ΣX1 = 50     ΣX2 = 40     ΣX3 = 60

T = Sum of all observations = 150

Correction Factor = T² / N = 150² / 15 = 1500

SST (Total Sum of Squares) = Sum of squares of all observations − T² / N

= (8² + 7² + 12² + 10² + ... + 14²) − 1500 = 1600 − 1500 = 100

Sum of squares of error between the columns (samples):

SSC = (ΣX1)² / n1 + (ΣX2)² / n2 + (ΣX3)² / n3 − T² / N

= 50² / 5 + 40² / 5 + 60² / 5 − 1500 = 1540 − 1500 = 40

Sum of squares of error within the columns (samples):

SSE = SST − SSC = 100 − 40 = 60

Variance between samples:

MSC = SSC / (k − 1) = 40 / (3 − 1) = 40 / 2 = 20

Variance within samples:

MSE = SSE / (n − k) = 60 / (15 − 3) = 60 / 12 = 5

Degrees of freedom = (k − 1, n − k) = (2, 12)

[k is the number of columns and n is the total number of observations]

ANOVA Table:

Source of Variation   Sum of Squares   df   Mean Square   F-value
Between               SSC = 40         2    MSC = 20      Fcal = 20/5 = 4
Within                SSE = 60         12   MSE = 5
Total                 SST = 100        14
The F table value for degrees of freedom (2, 12) [v1 = 2, v2 = 12] at the 5% level of
significance is 3.88. Since the calculated F value (4) is greater than the table value (3.88),
we reject the null hypothesis and conclude that the sample means are not all equal.
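The calculation above can be cross-checked in Python, assuming SciPy is available; scipy.stats.f_oneway reproduces Fcal = 4 from the three samples:

```python
from scipy.stats import f_oneway

sample_a = [8, 10, 7, 14, 11]   # X1, sum = 50
sample_b = [7, 5, 10, 9, 9]     # X2, sum = 40
sample_c = [12, 9, 13, 12, 14]  # X3, sum = 60

f_calculated, p_value = f_oneway(sample_a, sample_b, sample_c)
print(f"F = {f_calculated:.2f}, p-value = {p_value:.4f}")
# F = 4.00 > 3.88 (table value at 5%, df = (2, 12)), so H0 is rejected.
```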
