
DATA ANALYSIS METHODS

UNIT I
Introduction
Statistics is the study of the collection, analysis, interpretation, presentation and organization of data.
Importance of statistics for managers:
Modern business management is more a science than an art. Ever-increasing global
competition requires business managers to address uncertainty with scientific methods and to be
objective decision makers. Forecasting, planning, organizing and decision making, some of the key
activities of a manager, are all aimed at a better future for the business. The only certainty about the
future is its uncertainty. Even though one cannot eliminate uncertainty, it is possible to measure it using
statistics: a manager can make informed decisions by using statistical methods and statistical thinking.
This calls for unlocking the power of statistics for managers.
Broadly, knowledge of statistics helps a manager to describe the problem, identify and evaluate
alternative courses of action, estimate error, monitor processes and take appropriate corrective actions to
achieve optimum results.
Applications of Statistics for Managers
Both descriptive and inferential statistical methods have an important place in business management. To
quote a few of the many applications across functions:
1. A Marketing manager needs to gather and analyze a large amount of data pertaining to market
dynamics and target customers. Ideally, marketing strategy depends upon the outcomes of market
research, which involves statistical methods for collecting and analyzing data, applying sampling
techniques and evaluating the effect of various marketing strategies.
2. A Production manager would ideally use Statistical Process Control techniques to improve
productivity and quality. Knowledge and application of Control Charts, Sampling techniques and
Probability Distributions ensure better processes and products. This also leads to a reduction in
production cost and to higher profits.
3. An HR manager would be interested in identifying the best approach to train employees and
evaluating the impact of training. There is also a need to measure attrition and understand the
underlying factors.
4. For a Finance manager, crunching financial data and using financial techniques is an integral part
of the day-to-day job. Knowledge of Statistics enhances the competency and proficiency of a manager
as a researcher and therefore provides an edge.
In an era where Total Quality Management [TQM], Lean organization and Six Sigma are some of the
buzzwords, it is essential for a manager to be conversant with Statistics.

Data:
Data is a collection of "facts or figures from which conclusions can be drawn".

Data can take various forms, but are often numerical. As such, data can relate to an enormous variety of
aspects, for example:

the daily weight measurements of each individual in your classroom;

the number of movie rentals per month for each household in your neighbourhood;

the city's temperature (measured every hour) for a one-week period.

Other forms of data exist, such as radio signals, digitized images and laser patterns on compact discs.
Qualitative vs Quantitative
Data can be qualitative or quantitative.

Qualitative data is descriptive information (it describes something).

Quantitative data is numerical information (numbers).

Qualitative data is a categorical measurement expressed not in terms of numbers, but rather by means of
a natural language description. In statistics, it is often used interchangeably with "categorical" data.
For example: favorite color = "blue"
height = "tall"
Although we may have categories, the categories may have a structure to them. When there is no
natural ordering of the categories, we call these nominal categories. Examples might be gender, race,
religion, or sport.
Quantitative data is a numerical measurement expressed not by means of a natural language description,
but rather in terms of numbers. However, not all numbers are continuous and measurable. For example,
the social security number is a number, but not something that one can add or subtract.

For example: molecule length = "450 nm"
height = "1.8 m"
Quantitative data are always associated with a scale of measurement.

Collection of data
Primary and Secondary Data:
Primary data are those which are collected for the first time; they are raw and original in
character. These data need the application of statistical methods for the purpose of analysis and
interpretation. Secondary data are those which have already been collected by someone and
have already gone through statistical processing. They are usually a refined form of the raw
material: when statistical methods are applied to primary data, the data lose their original
shape and become secondary data.
Methods of Collection of Primary Data:
The primary data are collected by the following methods.
1. Direct personal investigation.
2. Indirect personal investigation.
3. Investigation through questionnaires.
4. Investigation through questionnaires in the charge of enumerators.
5. Investigation through local reports.
1. Direct Personal Investigation: According to this method, the investigator has to collect
the information personally from the source concerned. This means the investigator
should be at the spot where the enquiry is being conducted. It is also expected that the
investigator should be very polite and courteous. Further, he should acquaint himself with the
surrounding situation and must know the local customs and traditions.
Advantages:
1. The information collected by this method is reliable and accurate.
2. It is a good method for intensive investigation.
3. This method gives a satisfactory result provided the scope of inquiry is narrow.
Disadvantages:
1. This method is not suitable for extensive inquiry.
2. It requires a lot of expense and time.
3. Bias on the part of the investigator can damage the whole inquiry.
4. Sometimes the informant may be reluctant to answer the questions.
2. Indirect Personal Investigation: This method is used when the informants are reluctant
to give definite information, e.g., if a government servant is asked to give the
INFORMATION:
Information is "data that have been recorded, classified, organized, related, or interpreted within a
framework so that meaning emerges".
Information, like data, can take various forms. Some examples of the different types of information that
can be derived from data include:

the number of persons in a group in each weight category (20 to 25 kg, 26 to 30 kg, etc.);

the total number of households that did not rent a movie during the last month; and

the number of days during the week where the temperature went above 20 °C.
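
As a minimal sketch of how information is derived from data, the Python snippet below turns a week of temperature readings into one summary figure: the number of days on which the temperature went above 20 °C. The readings and the names `daily_max_temps`, `THRESHOLD_C` and `days_above` are hypothetical, chosen only for illustration.

```python
# Hypothetical daily maximum temperatures (°C) for one week: the raw data.
daily_max_temps = {
    "Mon": 18.5, "Tue": 21.0, "Wed": 23.4, "Thu": 19.9,
    "Fri": 20.0, "Sat": 24.1, "Sun": 17.2,
}

THRESHOLD_C = 20.0  # assumed threshold, taken from the "above 20 °C" example

# Information: the days on which the temperature went strictly above the threshold.
days_above = [day for day, temp in daily_max_temps.items() if temp > THRESHOLD_C]

print(len(days_above), "days above", THRESHOLD_C, "°C:", days_above)
```

The raw readings are the data; the count of qualifying days is the information that emerges once the readings are classified against a threshold.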

Types of Data & Measurement Scales: Nominal, Ordinal, Interval and Ratio
There are four measurement scales (or types of data):
1. Nominal,
2. Ordinal,
3. Interval and
4. Ratio
Nominal
Nominal scales are used for labeling variables, without any quantitative value. Nominal scales could
simply be called labels.

Examples of Nominal Scales

Ordinal
Ordinal scales are typically measures of non-numeric concepts like satisfaction, happiness, discomfort,
etc.
"Ordinal" is easy to remember because it sounds like "order", and that is the key to remember with
ordinal scales: it is the order that matters, but that is all you really get from these.
Advanced note: The best way to determine central tendency on a set of ordinal data is to use the mode or
median; the mean cannot be defined from an ordinal set.

Example of Ordinal Scales


The statistics which can be used with nominal scales are in the non-parametric group. The most
likely ones would be:
mode
crosstabulation - with chi-square
There are also highly sophisticated modelling techniques available for nominal data.
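
A cross-tabulation with a chi-square statistic can be sketched in plain Python as follows. The twelve (gender, sport) responses are hypothetical, and the sketch computes only the chi-square statistic itself, not a p-value or a significance decision.

```python
from collections import Counter

# Hypothetical nominal data: (gender, preferred sport) for 12 respondents.
responses = [
    ("F", "tennis"), ("F", "tennis"), ("F", "tennis"), ("F", "tennis"),
    ("F", "football"), ("F", "football"),
    ("M", "tennis"), ("M", "tennis"),
    ("M", "football"), ("M", "football"), ("M", "football"), ("M", "football"),
]

# Cross-tabulation: count each (row category, column category) pair.
observed = Counter(responses)
row_totals = Counter(gender for gender, _ in responses)
col_totals = Counter(sport for _, sport in responses)
n = len(responses)

# Chi-square statistic: sum over all cells of (observed - expected)^2 / expected,
# where expected = row total * column total / grand total.
chi_square = 0.0
for gender in row_totals:
    for sport in col_totals:
        expected = row_totals[gender] * col_totals[sport] / n
        chi_square += (observed[(gender, sport)] - expected) ** 2 / expected

print(dict(observed))
print(round(chi_square, 4))
```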
Interval
Interval scales are numeric scales in which we know not only the order, but also the exact differences
between the values. The classic example of an interval scale is Celsius temperature because the
difference between each value is the same. For example, the difference between 60 and 50 degrees is a
measurable 10 degrees, as is the difference between 80 and 70 degrees. Time is another good example of
an interval scale in which the increments are known, consistent, and measurable.
Ordinal data would use non-parametric statistics. These would include:
median and mode
rank order correlation
non-parametric analysis of variance
Modelling techniques can also be used with ordinal data.
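
As an illustration of ordinal statistics, the sketch below finds the median and mode of a set of satisfaction ratings and computes a rank order (Spearman) correlation between two rankings. All data are hypothetical, the helper `spearman_rho` is an illustrative name, and the formula used, 1 - 6 * sum(d^2) / (n(n^2 - 1)), assumes there are no tied ranks.

```python
import statistics

# Ordered categories of a hypothetical satisfaction scale (lowest to highest).
LEVELS = ["very unsatisfied", "unsatisfied", "neutral", "satisfied", "very satisfied"]
ratings = ["satisfied", "neutral", "very satisfied", "satisfied", "unsatisfied"]

# Median: convert each rating to its position in the order, take the middle one.
positions = [LEVELS.index(r) for r in ratings]
median_level = LEVELS[statistics.median_low(positions)]
mode_level = statistics.mode(ratings)  # most frequent category


def spearman_rho(rank_x, rank_y):
    """Rank order correlation for two rankings without ties."""
    n = len(rank_x)
    d_squared = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical rankings of five products by two judges.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]
rho = spearman_rho(judge_a, judge_b)

print(median_level, mode_level, round(rho, 2))
```

Only the order of the categories is used here; no arithmetic is done on the category labels themselves.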
Ratio

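Ratio scales have all the properties of interval scales (known order and exact differences between
values) and, in addition, a true zero point. Because of the true zero, ratios of values are meaningful:
a weight of 80 kg is twice a weight of 40 kg.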
Good examples of ratio variables include height and weight.
Central tendency can be measured by mode, median, or mean; measures of dispersion, such as standard
deviation and coefficient of variation can also be calculated from ratio scales.
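
The measures named above can be computed with Python's standard `statistics` module. This is a minimal sketch on hypothetical height data; the coefficient of variation is taken here as the sample standard deviation divided by the mean.

```python
import statistics

# Hypothetical heights in metres (ratio-scale data with a true zero).
heights_m = [1.60, 1.70, 1.70, 1.80, 1.90]

mode = statistics.mode(heights_m)        # most frequent value
median = statistics.median(heights_m)    # middle value of the sorted data
mean = statistics.mean(heights_m)        # arithmetic average
std_dev = statistics.stdev(heights_m)    # sample standard deviation (n - 1)
coeff_var = std_dev / mean               # coefficient of variation (unitless)

print(f"mode={mode}, median={median}, mean={mean:.2f}")
print(f"std dev={std_dev:.4f}, CV={coeff_var:.2%}")
```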

Presentation of data in tables and charts


Frequency Distribution
After collecting data, the first task for a researcher is to organize and simplify the
data so that it is possible to get a general overview of the results. This is the goal
of descriptive statistical techniques. One method for simplifying and organizing
data is to construct a frequency distribution.
A frequency distribution is the organization of raw data in table form, using classes
and frequency.
Formation of a Discrete frequency distribution
The frequency distribution is prepared by counting the number of times a particular
value is repeated, which is called the frequency of that class. To make the counting
easier, prepare a column of tallies. Finally, count the number of tally bars to get the
frequencies.
Example: The pupils in Mr. Middleton's class take a Maths test and get scores out of
10, which are listed below. Prepare the frequency distribution table.
3

10

Formation of Continuous frequency distribution:
(1) Find the range of the data: The range is the difference between the largest and
the smallest values.
R = L - S
(2) Decide the approximate number of classes into which the data are to be grouped: There are no
hard and fast rules for the number of classes. In most cases we have … to … classes. H.A.
Sturges has given a formula for determining the approximate number of classes:
K = 1 + 3.322 log N
Where
K = Number of classes
log N = Logarithm (base 10) of the total number of observations
For example: if the total number of observations is …, the number of classes would be
K = 1 + 3.322 log(…), or … classes approximately.
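
Sturges' rule, K = 1 + 3.322 log10(N), can be checked numerically with a short Python sketch. The value N = 50 is an assumed number of observations chosen for illustration, and rounding K to the nearest whole number is one common convention rather than a fixed rule.

```python
import math


def sturges_classes(n_observations: int) -> float:
    """Approximate number of classes by Sturges' rule: K = 1 + 3.322 * log10(N)."""
    return 1 + 3.322 * math.log10(n_observations)


n = 50  # hypothetical total number of observations
k = sturges_classes(n)

print(round(k, 3), "->", round(k), "classes approximately")
```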

(3) Determine the approximate class interval size: The size of the class interval, denoted by I, is
obtained by dividing the range of the data by the number of classes.
I = Range / Number of classes
In case of fractional results, the next higher whole number is taken as the size of the class interval.
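
Steps (1) to (3) can be combined into a short Python sketch that builds a continuous frequency distribution. The 20 observations are hypothetical, and starting the first class at the smallest value with classes of the form "lower limit up to, but not including, upper limit" is an assumption made for this illustration.

```python
import math

# Hypothetical raw data: 20 observations.
data = [12, 15, 21, 24, 27, 29, 31, 33, 35, 36,
        38, 40, 41, 44, 47, 49, 52, 55, 58, 61]

largest, smallest = max(data), min(data)
data_range = largest - smallest                          # (1) R = L - S
num_classes = round(1 + 3.322 * math.log10(len(data)))   # (2) Sturges' rule
interval = math.ceil(data_range / num_classes)           # (3) next higher whole number

# Count the observations falling in each class [lower, lower + interval).
frequencies = {}
lower = smallest
while lower <= largest:
    upper = lower + interval
    frequencies[(lower, upper)] = sum(lower <= x < upper for x in data)
    lower = upper

for (low, high), count in frequencies.items():
    print(f"{low}-{high}: {count}")
```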
