
STATISTICS FOR MANAGEMENT

MB0024
SET – 2

MBA – 1 SEM

Name Mohammed Roohul Ameen


Roll Number
Learning Center SMU Riyadh (02543)
Subject Statistics For Management
Date of Submission 15th August 2009
Assignment Number MB0024

1. What do you mean by a sample survey? What are the different sampling methods? Briefly describe
them.

Sampling is that part of statistical practice concerned with the selection of individual observations
intended to yield some knowledge about a population of concern, especially for the purposes of
statistical inference. Each observation measures one or more properties (weight, location, etc.) of an
observable entity enumerated to distinguish objects or individuals. Survey weights often need to be
applied to the data to adjust for the sample design. Results from probability theory and statistical
theory are employed to guide practice. In business, sampling is widely used for gathering information
about a population.

Types of Sampling
 Simple random sampling
 Systematic sampling
 Stratified sampling
 Probability proportional to size sampling
 Cluster sampling
 Matched random sampling
 Quota sampling
 Mechanical sampling
 Convenience sampling
 Line-intercept sampling
 Panel sampling
 Event Sampling Methodology

Simple random sampling In a simple random sample ('SRS') of a given size, all subsets of the
frame of that size are given an equal probability of selection. Each element of the frame thus has an equal
probability of selection: the frame is not subdivided or partitioned. Furthermore, any given pair of elements has the
same chance of selection as any other such pair (and similarly for triples, and so on). This minimizes
bias and simplifies analysis of results. In particular, the variance between individual results within the
sample is a good indicator of variance in the overall population, which makes it relatively easy to
estimate the accuracy of results.
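
A minimal sketch of drawing a simple random sample with Python's standard library; the frame and sample size below are hypothetical, purely for illustration:

# Simple random sampling: every subset of size n from the frame
# has the same chance of being selected.
import random

frame = list(range(1, 101))    # hypothetical sampling frame of 100 units
n = 10                         # hypothetical sample size

random.seed(42)                # fixed seed so the illustration is reproducible
srs = random.sample(frame, n)  # draws n distinct units, each equally likely
print(sorted(srs))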

Systematic sampling relies on arranging the target population according to some ordering scheme and
then selecting elements at regular intervals through that ordered list. Systematic sampling involves a
random start and then proceeds with the selection of every kth element from then onwards. In this
case, k=(population size/sample size). It is important that the starting point is not automatically the first
in the list, but is instead randomly chosen from within the first to the kth element in the list. A simple
example would be to select every 10th name from the telephone directory (an 'every 10th' sample,
also referred to as 'sampling with a skip of 10').
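
A rough sketch of the same 'every k-th' procedure in Python; the frame, population size and sample size are illustrative assumptions:

# Systematic sampling: random start within the first k units, then
# every k-th unit thereafter, where k = population size / sample size.
import random

frame = list(range(1, 101))   # hypothetical ordered frame of N = 100 units
n = 10                        # hypothetical sample size
k = len(frame) // n           # sampling interval (the "skip"), here k = 10

random.seed(1)
start = random.randrange(k)   # random starting point within the first k elements
sample = frame[start::k]      # every k-th element from the start onwards
print(sample)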

Stratified sampling Where the population embraces a number of distinct categories, the frame can be
organized by these categories into separate "strata." Each stratum is then sampled as an independent
sub-population, out of which individual elements can be randomly selected.
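
A minimal sketch of this idea, assuming a hypothetical frame already grouped into strata and an equal, purely illustrative allocation per stratum:

# Stratified sampling: draw an independent simple random sample
# from each stratum of the frame.
import random

strata = {                             # hypothetical strata of the frame
    "north": list(range(1, 41)),
    "south": list(range(41, 71)),
    "east": list(range(71, 101)),
}
per_stratum = 5                        # illustrative allocation per stratum

random.seed(7)
sample = {name: random.sample(units, per_stratum)
          for name, units in strata.items()}
print(sample)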

Probability proportional to size sampling In some cases the sample designer has access to an "auxiliary
variable" or "size measure", believed to be correlated to the variable of interest, for each element in
the population. This data can be used to improve accuracy in sample design. One option is to use the
auxiliary variable as a basis for stratification, as discussed above.

Cluster sampling Sometimes it is cheaper to 'cluster' the sample in some way e.g. by selecting
respondents from certain areas only, or certain time-periods only. (Nearly all samples are in some
sense 'clustered' in time - although this is rarely taken into account in the analysis.)

Cluster sampling is an example of 'two-stage sampling' or 'multistage sampling': in the first stage a
sample of areas is chosen; in the second stage a sample of respondents within those areas is selected.

Matched random sampling A method of assigning participants to groups in which pairs of participants
are first matched on some characteristic and then individually assigned randomly to groups.

The procedure for matched random sampling can be summarized in the following two contexts:

1. Two samples in which the members are clearly paired, or are matched explicitly by the
researcher. For example, IQ measurements or pairs of identical twins.
2. Those samples in which the same attribute, or variable, is measured twice on each subject,
under different circumstances. Commonly called repeated measures. Examples include the
times of a group of athletes for 1500m before and after a week of special training; the milk
yields of cows before and after being fed a particular diet.

In quota sampling, the population is first segmented into mutually exclusive sub-groups, just as in
stratified sampling. Then judgment is used to select the subjects or units from each segment based on a
specified proportion. For example, an interviewer may be told to sample 200 females and 300 males
between the ages of 45 and 60.

It is this second step which makes the technique one of non-probability sampling. In quota sampling
the selection of the sample is non-random. For example, interviewers might be tempted to interview
those who look most helpful. The problem is that these samples may be biased because not everyone
gets a chance of selection. This non-random element is its greatest weakness, and quota versus
probability sampling has been a matter of controversy for many years.

Mechanical sampling is typically used in sampling solids, liquids and gases, using devices such as grabs,
scoops, thief probes, the COLIWASA and riffle splitter. Care is needed in ensuring that the sample is
representative of the frame. Much work in the theory and practice of mechanical sampling was
developed by Pierre Gy and Jan Visman.

Convenience sampling (sometimes known as grab or opportunity sampling) is a type of non-probability
sampling which involves the sample being drawn from that part of the population which is close to
hand. That is, a sample population selected because it is readily available and convenient. The
researcher using such a sample cannot scientifically make generalizations about the total population
from this sample because it would not be representative enough.

Panel sampling is the method of first selecting a group of participants through a random sampling
method and then asking that group for the same information again several times over a period of time.
Therefore, each participant is given the same survey or interview at two or more time points; each
period of data collection is called a "wave". This sampling methodology is often chosen for large scale
or nation-wide studies in order to gauge changes in the population with regard to any number of
variables from chronic illness to job stress to weekly food expenditures.

Event Sampling Methodology (ESM) is a new form of sampling method that allows researchers to
study ongoing experiences and events that vary across and within days in their naturally occurring
environment. Because of the frequent sampling of events inherent in ESM, it enables researchers to
measure the typology of activity and detect the temporal and dynamic fluctuations of work
experiences.

2. What is the difference between correlation and regression? What do you understand by Rank
Correlation? When do we use rank correlation and when do we use Pearson's correlation coefficient? Fit
a linear regression line to the following data –
X 12 15 18 20 27 34 28 48
Y 123 150 158 170 180 184 176 130

Correlation
In statistics, correlation (often measured as a correlation coefficient, ρ) indicates the strength and
direction of a linear relationship between two random variables. That is in contrast with the usage of
the term in colloquial speech, which denotes any relationship, not necessarily linear. In general
statistical usage, correlation or co-relation refers to the departure of two random variables from
independence. In this broad sense there are several coefficients, measuring the degree of correlation,
adapted to the nature of the data.

Correlation analysis deals with:

1) Measuring the relationship between variables.
2) Testing the relationship for its significance.
3) Giving a confidence interval for the population correlation measure.

Regression
In statistics, linear regression refers to any approach to modeling the relationship between a variable
denoted y and one or more variables denoted X, such that the model depends linearly on the
unknown parameters to be estimated from the data. Such a model is called a "linear model." Most
commonly, linear regression refers to a model in which the conditional mean of y given the value of X is
an affine function of X. Less commonly, linear regression could refer to a model in which the median, or
some other quantile of the conditional distribution of y given X is expressed as a linear function of X.
Like all forms of regression analysis, linear regression focuses on the conditional probability distribution
of y given X, rather than on the joint probability distribution of y and X, which is the domain of
multivariate analysis.

Difference between correlation and regression

In probability theory and statistics, correlation, (often measured as a correlation coefficient), indicates
the strength and direction of a linear relationship between two random variables. In general statistical
usage, correlation or co-relation refers to the departure of two variables from independence. In this
broad sense there are several coefficients, measuring the degree of correlation, adapted to the nature
of data.

A number of different coefficients are used for different situations. The best known is the Pearson
product-moment correlation coefficient, which is obtained by dividing the covariance of the two
variables by the product of their standard deviations. Despite its name, it was first introduced by
Francis Galton.
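
As a small illustration of this definition, Pearson's r can be computed directly as the covariance divided by the product of the standard deviations; the sketch below uses only Python's standard library and the X and Y data from the question:

# Pearson's r = cov(X, Y) / (std(X) * std(Y))
import math

x = [12, 15, 18, 20, 27, 34, 28, 48]
y = [123, 150, 158, 170, 180, 184, 176, 130]
n = len(x)

mx, my = sum(x) / n, sum(y) / n
cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
print(round(cov / (sx * sy), 3))       # Pearson product-moment coefficient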

Linear regression is a form of regression analysis in which observational data are modeled by a least
squares function which is a linear combination of the model parameters and depends on one or more
independent variables. In simple linear regression the model function represents a straight line. The
results of data fitting are subject to statistical analysis.

Spearman's rank correlation coefficient or Spearman's rho, named after Charles Spearman and often
denoted by the Greek letter ρ (rho) or as rs, is a nonparametric measure of correlation – that is, it
assesses how well an arbitrary monotonic function could describe the relationship between two
variables, without making any other assumptions about the particular nature of the relationship
between the variables. Certain other measures of correlation are parametric in the sense of being
based on possible relationships of a parameterized form, such as a linear relationship.

In principle, ρ is simply a special case of the Pearson product-moment coefficient in which two sets of
data Xi and Yi are converted to rankings xi and yi before calculating the coefficient. In practice,
however, a simpler procedure is normally used to calculate ρ. The raw scores are converted to ranks,
and the differences di between the ranks of each observation on the two variables are calculated.
If there are no tied ranks, then ρ is given by:

ρ = 1 − (6 Σ di²) / (n (n² − 1))

Where:
di = xi − yi = the difference between the ranks of corresponding values Xi and Yi, and
n = the number of values in each data set (same for both sets).

If tied ranks exist, classic Pearson's correlation coefficient between ranks has to be used instead of this
formula.

In that case, one assigns the same rank to each of the equal values, namely the average of their
positions in the ascending order of the values.
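
A brief sketch of this procedure, assuming no tied ranks; the data are the X and Y values from the question, which happen to contain no ties:

# Spearman's rho: convert raw scores to ranks, then apply
# rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))   (valid when there are no ties)
def ranks(values):
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r

x = [12, 15, 18, 20, 27, 34, 28, 48]
y = [123, 150, 158, 170, 180, 184, 176, 130]

rx, ry = ranks(x), ranks(y)
n = len(x)
d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
rho = 1 - 6 * d_squared / (n * (n ** 2 - 1))
print(round(rho, 3))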

Conditions under which the probable error (P.E.) can be used:
1. Samples should be drawn from a normal population.
2. The value of “r” must be determined from sample values.
3. Samples must have been selected at random.

X 12 15 18 20 27 34 28 48
Y 123 150 158 170 180 184 176 130

The linear regression line for the above data can be fitted as follows:

Total numbers (n): 8
Slope (b): 0.167
Y-intercept (a): 154.66

Regression equation: y = a + bx
Regression equation: y = 154.66 + 0.167x

Suppose we want to know the approximate y value for x = 20.
Substituting this value in the above equation:

y = 154.66 + 0.167 × 20
  = 158.00 (approximately)
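
The estimates above can be reproduced with a short least-squares calculation; the sketch below is a minimal Python version using only built-in functions:

# Least-squares fit of y = a + b*x for the given data
x = [12, 15, 18, 20, 27, 34, 28, 48]
y = [123, 150, 158, 170, 180, 184, 176, 130]
n = len(x)

mean_x, mean_y = sum(x) / n, sum(y) / n
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)

b = sxy / sxx                  # slope, approximately 0.167
a = mean_y - b * mean_x        # intercept, approximately 154.66
print(a, b)
print(a + b * 20)              # predicted y at x = 20, approximately 158.0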

3. What do you mean by business forecasting? What are the different methods of business
forecasting? Describe the effectiveness of time-series analysis as a mode of business forecasting.
Describe the method of moving averages.

Business forecasting has always been one component of running an enterprise. However, forecasting
traditionally was based less on concrete and comprehensive data than on face-to-face meetings and
common sense. In recent years, business forecasting has developed into a much more scientific
endeavor, with a host of theories, methods, and techniques designed for forecasting certain types of
data. The rise of information technologies and the Internet propelled this development into
overdrive, as companies adopted such technologies not only into their business practices but also into
their forecasting schemes.

Business forecasting involves a wide range of tools, including simple electronic spreadsheets,
enterprise resource planning (ERP) and electronic data interchange (EDI) networks, advanced supply
chain management systems, and other Web-enabled technologies. The practice attempts to pinpoint
key factors in business production and extrapolate from given data sets to produce accurate
projections for future costs, revenues, and opportunities. This normally is done with an eye toward
adjusting current and near-future business practices to take maximum advantage of expectations.

 Time Series Analysis is also used for the purpose of business forecasting. Forecasting
through time series analysis is possible only when business data for various years
are available which reflect a definite trend and seasonal variation.

 Extrapolation is the simplest method of business forecasting. By extrapolation, a businessman
finds out the possible trend of demand for his goods and also their likely future price trends.
The accuracy of extrapolation depends on two factors: i) knowledge about the fluctuations of
the figures, ii) knowledge about the course of events relating to the problem under
consideration.

 Regression Analysis The regression approach offers many valuable contributions to the solution
of the forecasting problem. It is the means by which we select, from among the many possible
relationships between variables in a complex economy, those which will be useful for
forecasting. A regression relationship may involve one dependent (predicted) variable and one
independent variable (simple regression), or it may involve relationships between the variable to
be forecast and several independent variables (multiple regression). Statistical techniques
to estimate the regression equations are often fairly complex and time-consuming, but there are
many computer programs now available that estimate simple and multiple regressions quickly.

 Modern Econometric Methods Econometric techniques, which originated in the eighteenth
century, have recently gained in popularity for forecasting. The term econometrics refers to the
application of mathematical economic theory and statistical procedures to economic data in
order to verify economic theorems. Models take the form of a set of simultaneous equations.
The values of the constants in such equations are supplied by a study of statistical time series.

 Exponential Smoothing Method This method is often regarded as one of the best methods of
business forecasting. Exponential smoothing is a special kind of weighted average and is found
extremely useful in short-term forecasting of inventories and sales (a brief sketch is given after
this list).

 Choice of a Method of Forecasting The selection of an appropriate method depends on many
factors – the context of the forecast, the relevance and availability of historical data, the degree
of accuracy desired, the time period for which forecasts are required, the cost benefit of the
forecast to the company, and the time available for making the analysis.
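
As referenced above under the Exponential Smoothing Method, here is a brief sketch of simple exponential smoothing; the sales series and the smoothing constant alpha are illustrative assumptions:

# Simple exponential smoothing:
# smoothed_t = alpha * actual_t + (1 - alpha) * smoothed_(t-1)
def exponential_smoothing(series, alpha):
    smoothed = [series[0]]                 # initialise with the first observation
    for actual in series[1:]:
        smoothed.append(alpha * actual + (1 - alpha) * smoothed[-1])
    return smoothed

sales = [120, 132, 128, 140, 152, 149, 160]   # hypothetical monthly sales
print(exponential_smoothing(sales, alpha=0.3))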

Effectiveness of Time Series Analysis:

The term time series analysis is used to distinguish a problem, firstly, from more ordinary data analysis
problems (where there is no natural ordering of the individual observations), and secondly, from spatial
data analysis, where the observations typically relate to geographical locations. There are additional
possibilities in the form of space-time models (often called spatial-temporal analysis). A time series
model will generally reflect the fact that observations close together in time will be more closely
related than observations further apart. In addition, time series models will often make use of the
natural one-way ordering of time, so that values in a series for a given time will be expressed as
deriving in some way from past values rather than from future values.

Methods for time series analyses are often divided into two classes: frequency-domain methods and
time-domain methods. The former centre around spectral analysis and recently wavelet analysis, and
can be regarded as model-free analyses well-suited to exploratory investigations. Time-domain
methods have a model-free subset consisting of the examination of auto-correlation and cross-
correlation analysis, but it is here that partially and fully specified time series models make their
appearance.

Merits:
i) It is an easy method of forecasting.
ii) By this method a comparative study of variations can be made.
iii) Reliable results of forecasting are obtained as this method is based on a mathematical model.

Moving Average

In statistics, a moving average, also called rolling average, rolling mean or running average, is a type of
finite impulse response filter used to analyze a set of data points by creating a series of averages of
different subsets of the full data set. A moving average is not a single number but a set of
numbers, each of which is the average of the corresponding subset of a larger set of data points. A
moving average may also use unequal weights for each data value in the subset to emphasize particular
values in the subset.

A moving average is commonly used with time series data to smooth out short-term fluctuations and
highlight longer-term trends or cycles. The threshold between short-term and long-term depends on
the application, and the parameters of the moving average will be set accordingly. For example, it is
often used in technical analysis of financial data, like stock prices, returns or trading volumes. It is also
used in economics to examine gross domestic product, employment or other macroeconomic time
series. Mathematically, a moving average is a type of convolution and so it is also similar to the low-
pass filter used in signal processing. When used with non-time series data, a moving average simply
acts as a generic smoothing operation without any specific connection to time, although typically some
kind of ordering is implied.
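
A minimal sketch of an equally weighted (simple) moving average; the data series and window size are illustrative assumptions:

# Simple moving average: the mean of each consecutive window of k data points
def moving_average(series, k):
    return [sum(series[i:i + k]) / k for i in range(len(series) - k + 1)]

prices = [10, 12, 11, 13, 15, 14, 16, 18]   # hypothetical data points
print(moving_average(prices, k=3))          # smoothed series of k-point averages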

4. What is the definition of Statistics? What are the different characteristics of statistics? What are the
different functions of Statistics? What are the limitations of Statistics?

Statistics is a mathematical science pertaining to the collection, analysis, interpretation or explanation,
and presentation of data. Statisticians improve the quality of data with the design of experiments and
survey sampling. Statistics also provides tools for prediction and forecasting using data and statistical
models. Statistics is applicable to a wide variety of academic disciplines, including natural and social
sciences, government, and business.

Characteristic of Statistics

a. Statistics deals with aggregates of facts: A single figure cannot be statistically analyzed.

b. Statistics are affected to a marked extent by a multiplicity of causes: The statistics of the yield of paddy are
the result of factors such as fertility of soil, amount of rainfall, quality of seed used, quality and quantity
of fertilizer used, etc.

c. Statistics are numerically expressed: Only numerical facts can be statistically analyzed. Therefore,
facts such as ‘price decreases with increasing production’ cannot be called statistics.

d. Statistics are enumerated or estimated according to reasonable standards of accuracy: The facts
should be enumerated (collected from the field) or estimated (computed) with required degree of
accuracy. The degree of accuracy differs from purpose to purpose. In measuring the length of screws,
an accuracy up to a millimeter may be required, whereas, while measuring the heights of students in a
class, accuracy up to a centimeter is enough.

e. Statistics are collected in a systematic manner: The facts should be collected according to planned
and scientific methods. Otherwise, they are likely to be wrong and misleading.

f. Statistics are collected for a pre-determined purpose: There must be a definite purpose for collecting
facts, e.g., the movement of wholesale prices of commodities.

g. Statistics are placed in relation to each other: The facts must be placed in such a way that a
comparative and analytical study becomes possible. Thus, only related facts which are arranged in
logical order can be called statistics.

Functions of Statistics

1. It simplifies mass data
2. It makes comparison easier
3. It brings out trends and tendencies in the data
4. It brings out hidden relations between variables.
5. Decision making process becomes easier.

Major limitations of Statistics are:

 Statistical laws are true only on average. Statistics are aggregates of facts, so a single observation is
not a statistic; statistics deals with groups and aggregates only.
 Statistical methods are best applicable to quantitative data.
 Statistical methods cannot be applied to heterogeneous data.
 If sufficient care is not exercised in collecting, analyzing and interpreting the data, statistical
results might be misleading.
 Some errors are possible in statistical decisions. In particular, inferential statistics involves
certain errors, and we do not know whether an error has been committed or not.
 Statistics can be misused and misinterpreted: increasing misuse of statistics has led to
increasing distrust in statistics.

5. What are the different stages of planning a statistical survey? Describe the various methods for
collecting data in a statistical survey.

The planning stage consists of the following sequence of activities.

 Nature of the problem to be investigated should be clearly defined in an unambiguous manner.

 Objectives of investigation should be stated at the outset. Objectives could be to obtain certain
estimates, to establish a theory, to verify an existing statement, or to find relationships between
characteristics, etc.

 The scope of investigation has to be made clear. It refers to the area to be covered, identification of
units to be studied, nature of characteristics to be observed, accuracy of measurements,
analytical methods, and the time, cost and other resources required.

 Whether to use data collected from primary or secondary source should be determined in
advance.

 The organization of investigation is the final step in the process. It encompasses the
determination of the number of investigators required, their training, the supervision needed,
the funds required, etc.

Collection of primary data can be done by any one of the following methods:

1) Direct personal observation
2) Indirect oral interview
3) Information through agencies
4) Information through mailed questionnaires
5) Information through schedules filled in by investigators

6. What are the functions of classification? What are the requisites of a good classification? What is a
table? Describe the usefulness of a table as a mode of presentation of data.

The functions of classification are:

1) It reduces the bulk of the data
2) It simplifies the data and makes it more comprehensible
3) It facilitates comparison of characteristics
4) It renders the data ready for any statistical analysis

Requisites of good classification are:

1. Unambiguous: It should not lead to any confusion
2. Exhaustive: every unit should be allotted to one and only one class
3. Mutually exclusive: There should not be any overlapping.
4. Flexibility: It should be capable of being adjusted to changing situation.
5. Suitability: It should be suitable to objectives of survey.
6. Stability: It should remain stable throughout the investigation
7. Homogeneity: Similar units are placed in the same class.
8. Revealing: Should bring out essential features of the collected data.

A table is nothing but a logical listing of related data in rows and columns.
Objectives of tabulation are:
 To simplify complex data
 To highlight important characteristics
 To present data in minimum space
 To facilitate comparison
 To bring out trends and tendencies
 To facilitate further analysis
