
Managing Information

Part 2: Statistics

KUSOM | Managing Information | PGDM

Learning Objectives

- To use summary statistics to describe collections of data.
- To understand summary statistics such as central tendency, dispersion, skewness and kurtosis.
- To use the mean, median and mode to describe how data bunch up.
- To use the range, variance and standard deviation to describe how data spread out.
- To explore computer-based software (SPSS) to analyze data and to see other useful ways to summarize data.


Standard Notation

Measure               Sample    Population
Mean                  x̄         μ
Standard deviation    S         σ
Variance              S²        σ²
Size                  n         N


Numerical Data Properties & Measures

Numerical Data Properties
- Central Tendency: Mean, Median, Mode
- Variation: Range, Variance, Standard Deviation
- Shape: Skew

Arithmetic Mean

Refers to the average of something. Examples:
- Average winter temperature of KTM
- Average life of a flashlight battery

Arithmetic Mean = (7 + 23 + 4 + 8 + 2 + 12 + 6 + 13 + 9 + 4) / 10 = 8.8 days

A single measure that summarizes the behavior of all the data.
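The arithmetic above can be checked in a few lines. This is a minimal sketch that uses the ten observations from the slide; the variable names are illustrative only.

```python
from statistics import mean

# The ten observations (in days) from the slide's example.
days = [7, 23, 4, 8, 2, 12, 6, 13, 9, 4]

# Arithmetic mean = sum of the observations / number of observations.
manual_mean = sum(days) / len(days)

print(manual_mean)  # 8.8
print(mean(days))   # 8.8, same result from the standard library
```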


Arithmetic Mean

Mean formula for ungrouped data:

Population mean: μ = Σx / N
Sample mean: x̄ = Σx / n

Mean Calculation from Grouped Data

In a frequency distribution, data are grouped by classes. Each observation falls into exactly one class.

What to do to find the A.M. of grouped data?
- Calculate the midpoint of each class.
- Round the midpoint to a whole number.
- Multiply each midpoint by the frequency of its class.
- Sum all the results.
- Divide by the total number of observations in the sample (the sum of the frequencies).
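The steps above can be sketched as a short function. The class limits and frequencies below are hypothetical (they are not from the slides) and were chosen so that the midpoints are already whole numbers, which makes the rounding step a no-op.

```python
def grouped_mean(classes, frequencies):
    """Estimate the mean of grouped data from class limits and frequencies."""
    # Step 1: midpoint of each class = (lower limit + upper limit) / 2.
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    # Step 2: multiply each midpoint by the frequency of its class.
    products = [f * x for f, x in zip(frequencies, midpoints)]
    # Steps 3 and 4: sum the products, then divide by n (sum of frequencies).
    return sum(products) / sum(frequencies)


# Hypothetical frequency distribution, for illustration only.
classes = [(10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [4, 10, 5, 1]

print(grouped_mean(classes, frequencies))  # 26.5
```

The result is an estimate, because every observation in a class is treated as if it sat at the class midpoint.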


Mean Calculation from Grouped Data

x̄ = Σ(f × x) / n

Where,
x̄ = sample mean
f = frequency in each class
x = midpoint for each class in the sample
n = number of observations in the sample (the sum of the frequencies)

Mean Calculation from Grouped Data


Calculate the arithmetic mean from grouped data.


Arithmetic Mean of Grouped Data Using Coding

- Simplifies the calculation of the mean for grouped data.
- Eliminates the problem of large and inconvenient midpoints.

- Assign small whole-number codes (consecutive integers) to the midpoints.
- Assign zero to the middle midpoint, or to the one nearest the middle of the frequency distribution.
- Assign negative integers to the midpoints smaller than that one and positive integers to the larger ones.
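A minimal sketch of the coding method follows. It assumes the usual coded-mean formula, x̄ = x0 + w × Σ(u × f) / n, where x0 is the midpoint given code 0, w is the class width, u is the code and f is the class frequency; the formula itself is not shown in the slide text, so treat it as an assumption. The midpoints and frequencies are hypothetical.

```python
def coded_mean(midpoints, frequencies, zero_index):
    """Mean of grouped data using integer codes (assumes equal class widths)."""
    # Class width, taken from the spacing of the (equally spaced) midpoints.
    width = midpoints[1] - midpoints[0]
    # Consecutive integer codes: 0 at zero_index, negative below, positive above.
    codes = [i - zero_index for i in range(len(midpoints))]
    n = sum(frequencies)
    coded_sum = sum(u * f for u, f in zip(codes, frequencies))
    # Translate the coded mean back to the original scale.
    return midpoints[zero_index] + width * coded_sum / n


# Hypothetical grouped data, for illustration only.
midpoints = [15, 25, 35, 45]
frequencies = [4, 10, 5, 1]

print(coded_mean(midpoints, frequencies, zero_index=1))  # 26.5
```

The answer does not depend on which midpoint receives code 0; choosing one near the middle only keeps the intermediate numbers small.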


Example:


Disadvantages of Arithmetic Mean

- Affected by extreme values.
- Cannot be computed for a data set that has open-ended classes. Example: classes in minutes of 4.2-4.5, 4.6-4.9, 5.0-5.3 and 5.4-Above, where the last class has no upper limit and therefore no midpoint.
- Tedious to compute, because every data point is used in the calculation. Example: what if there are 600 data points?


Weighted Mean


Example:

Q. Bob goes to the Buy the Weigh Nut store and creates his own bridge mix. He combines 1 pound of raisins, 2 pounds of chocolate covered peanuts, and 1.5 pounds of cashews. The raisins cost $1.25 per pound, the chocolate covered peanuts cost $3.25 per pound, and the cashews cost $5.40 per pound. What is the cost per pound of this mix?
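One way to work this example is to treat the pounds as weights and the prices as the values being averaged, using the standard weighted mean, Σ(w × x) / Σw. The sketch below uses the numbers from the question; the function and variable names are illustrative.

```python
def weighted_mean(values, weights):
    """Weighted mean = sum(weight * value) / sum(weights)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Prices ($ per pound) and quantities (pounds) from the question.
prices_per_pound = [1.25, 3.25, 5.40]  # raisins, peanuts, cashews
pounds = [1.0, 2.0, 1.5]

cost_per_pound = weighted_mean(prices_per_pound, pounds)
print(round(cost_per_pound, 2))  # 3.52
```

The total cost is $15.85 for 4.5 pounds, so the mix costs about $3.52 per pound. A simple average of the three prices ($3.30) would ignore how much of each ingredient was bought.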

Geometric Mean

Why? To measure the mean of data that change over a period of time, such as a growth rate.


Geometric Mean

Growth factors: 1.07, 1.08, 1.10, 1.12, 1.18
A.M. of the growth factors: (1.07 + 1.08 + 1.10 + 1.12 + 1.18) / 5 = 1.11

This implies an interest rate of 11 percent per year. If the bank paid interest at a constant rate of 11 percent for 5 years, then $100 × 1.11 × 1.11 × 1.11 × 1.11 × 1.11 = $168.51, which differs from the $168 in the previous table.


Geometric Mean

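G.M. = (1.07 × 1.08 × 1.10 × 1.12 × 1.18)^(1/5) = 1.1093

The correct mean is 10.93 percent per year, which is close to the 11 percent given by the (incorrect) arithmetic mean.
In some situations the A.M. and G.M. are very close, but even a small difference can lead to a poor decision.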
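The comparison between the two means can be reproduced with a short script. This sketch assumes the usual definition of the geometric mean as the n-th root of the product of the n growth factors, and uses the five growth factors and the $100 deposit from the slides.

```python
from math import prod

# Growth factors for the five years, from the slide.
growth_factors = [1.07, 1.08, 1.10, 1.12, 1.18]
n = len(growth_factors)
deposit = 100

arithmetic_mean = sum(growth_factors) / n
# Geometric mean: n-th root of the product of the growth factors.
geometric_mean = prod(growth_factors) ** (1 / n)

actual_balance = deposit * prod(growth_factors)
balance_using_am = deposit * arithmetic_mean ** n
balance_using_gm = deposit * geometric_mean ** n

print(round(arithmetic_mean, 4))   # 1.11
print(round(geometric_mean, 4))    # 1.1093
print(round(actual_balance, 2))    # 168.0
print(round(balance_using_am, 2))  # 168.51
print(round(balance_using_gm, 2))  # 168.0
```

Compounding at the geometric mean reproduces the actual balance, while compounding at the arithmetic mean overstates it.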


Geometric Mean
In highly inflationary economies, banks must pay high interest rates to attract savings. Suppose that over 5 years in an unbelievably inflationary economy, banks pay interest at annual rates of 100, 200, 250, 300 and 400 percent, which correspond to growth factors of 2, 3, 3.5, 4 and 5. The initial deposit is $100, held for 5 years. Calculate the A.M. and the G.M. of the growth factors, and find the error that results from using the A.M.
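One way to check an answer to this exercise is sketched below. It assumes that "the error" means the difference between the balance projected with the A.M. and the actual balance after 5 years; that reading is an assumption, and the variable names are illustrative.

```python
from math import prod

# Growth factors and deposit from the exercise.
growth_factors = [2, 3, 3.5, 4, 5]
n = len(growth_factors)
deposit = 100

am = sum(growth_factors) / n
gm = prod(growth_factors) ** (1 / n)

actual_balance = deposit * prod(growth_factors)
balance_using_am = deposit * am ** n
# Assumed meaning of "error": overstatement caused by using the A.M.
error = balance_using_am - actual_balance

print(am)                # 3.5
print(round(gm, 3))      # 3.347
print(actual_balance)    # 42000.0
print(balance_using_am)  # 52521.875
print(error)             # 10521.875
```

With growth factors this far apart, the gap between the two means is no longer small: the A.M. overstates the final balance by roughly a quarter.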


Median


Median : Example
Class in $         Frequency    Cumulative Frequency
0-49.99                 78               78
50.00-99.99            123              201
100.00-149.99          187              388
150.00-199.99           82              470
200.00-249.99           51              521
250.00-299.99           47              568
300.00-349.99           13              581
350.00-399.99            9              590
400.00-449.99            6              596
450.00-499.99            4              600
TOTAL                  600

Position of the median = (n + 1) / 2 = (600 + 1) / 2 = 300.5, i.e. the average of the 300th and 301st items, both of which fall in the $100.00-149.99 class.

Solution: Median = $126.35
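The median can be estimated from the table by interpolating inside the median class. The sketch below assumes the interpolation formula median ≈ L + (((n + 1) / 2 - (F + 1)) / f) × w, where L is the lower limit of the median class, F is the cumulative frequency before it, f is its frequency and w is the class width. The slide text shows only the result, so the formula is an assumption here.

```python
def grouped_median(lower_limits, frequencies, width):
    """Estimate the median of grouped data by linear interpolation."""
    n = sum(frequencies)
    position = (n + 1) / 2  # position of the median in the ordered data
    cumulative = 0          # cumulative frequency before the current class
    for lower, f in zip(lower_limits, frequencies):
        if cumulative + f >= position:
            # The median falls in this class: interpolate within it.
            return lower + (position - (cumulative + 1)) / f * width
        cumulative += f
    raise ValueError("frequencies must contain at least one observation")


# Lower class limits ($), frequencies and class width from the table.
lower_limits = [0, 50, 100, 150, 200, 250, 300, 350, 400, 450]
frequencies = [78, 123, 187, 82, 51, 47, 13, 9, 6, 4]

median_estimate = grouped_median(lower_limits, frequencies, width=50)
print(round(median_estimate, 2))  # 126.34
```

This gives about $126.34, which agrees with the $126.35 on the slide up to rounding.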


Median : Pros and Cons


Pros:
- Extreme values do not affect the median.
- Can be calculated from grouped data with open-ended classes, unless the median falls in an open-ended class.
- Can be calculated even when the data are qualitative descriptions, such as color or sharpness.

Cons:
- Time consuming for a large data set.

Tip: Use common sense to select the statistical tool.

Mode

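- The value that is repeated most often in the data set.
- Using the mode of ungrouped data is risky.

Example: delivery trips per day made by the Redix concrete plant in a 20-day period. Mean: 6.7, Mode: 15. Here the mode is far from the mean, so it is a poor guide to the center of the data.

[Table: frequency distribution of daily trips]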


Mode

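- Group the data into a frequency distribution.
- Assume the mode is in the class with the most items (i.e. the class with the highest frequency), called the modal class.
- Use the formula below to get a single value from the modal class:

Mo = Lmo + (d1 / (d1 + d2)) × w

Where,
Lmo = lower limit of the modal class
d1 = frequency of the modal class minus the frequency of the class directly below it
d2 = frequency of the modal class minus the frequency of the class directly above it
w = width of the modal class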

Mode: Example

4-7 is the modal class: 8 is the highest frequency.

Then Lmo = 4; d1 = 8 - 1 = 7; d2 = 8 - 6 = 2; w = 3.
Mo = 4 + (7 / (7 + 2)) × 3 = 6.33 is the estimate of the mode.

A data set can have more than one mode. A data set with two modes is called bimodal.
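The modal-class calculation can be sketched as a one-line function. It assumes the formula Mo = Lmo + (d1 / (d1 + d2)) × w, which matches the quantities listed in the example, and plugs in the slide's values.

```python
def grouped_mode(lower_limit, d1, d2, width):
    """Estimate the mode from the modal class of a frequency distribution."""
    # d1: modal frequency minus the frequency of the class directly below.
    # d2: modal frequency minus the frequency of the class directly above.
    return lower_limit + d1 / (d1 + d2) * width


# Values from the slide: modal class 4-7 with frequency 8,
# neighbouring class frequencies 1 (below) and 6 (above), width 3.
mode_estimate = grouped_mode(lower_limit=4, d1=8 - 1, d2=8 - 6, width=3)
print(round(mode_estimate, 2))  # 6.33
```

Because d1 is larger than d2, the estimate is pulled toward the upper end of the modal class, that is, toward the neighbouring class with the higher frequency.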


Mode : Pros and Cons


Pros:
- Can be used as a central location for both qualitative and quantitative data.
- Not affected by extreme values.
- Can be used with open-ended classes.

Cons:
- Not useful when there is no mode, i.e. when every value or class occurs equally often.
- A data set with more than one mode is difficult to interpret.

HINTS

- If you are averaging a small group of factory wages that are fairly near each other, the A.M. is accurate and fast.
- If there are 500 new houses in a development, all within $10,000 of each other in value, and the data are skewed, the median is much quicker and still accurate.
- The effects of inflation and interest require the G.M.
- Common sense: although the average number of children per family is 1.65, a children's park manager will make better decisions by using the modal value of 2 kids.

Dispersion: Why important?

- Reliability of the central location: in widely dispersed data, the mean is less reliable, as in Curve C.
- Widely dispersed data bring more problems, so identify them early, before further analysis.

Dispersion: Useful Measures


Dispersion Measures
- Range
- Interfractile range
- Interquartile range

Dispersion : Range
Range = value of the highest observation - value of the lowest observation

- Ignores the nature of the variation among all observations other than the highest and lowest.
- Heavily influenced by extreme values.
- The range is likely to vary widely from one sample to another drawn from the same population, because it depends only on the highest and lowest values.
- Open-ended classes have no range, because they have no highest or lowest value.

Interfractile Range

In a frequency distribution, a given fraction or proportion of the data lies at or below a fractile. The interfractile range is a measure of the spread between two fractiles in a frequency distribution, i.e. the difference between the two fractiles.

Example: the interfractile range between the 1/3 and 2/3 fractiles is found by subtracting 1,041 from 1,624: 1,624 - 1,041 = 583.

Type of Fractiles
- Deciles: 10 equal parts
- Quartiles: 4 equal parts
- Percentiles: 100 equal parts

Interquartile Range : Measure variability

Quartiles divide a rank-ordered data set into four equal parts. The values that divide the parts are called the first, second, and third quartiles: Q1, Q2, and Q3, respectively.

- Q1 is the "middle" value in the first half of the rank-ordered data set.
- Q2 is the median value in the set.
- Q3 is the "middle" value in the second half of the rank-ordered data set.

Interquartile Range = Q3 - Q1
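The "middle of each half" definition above can be sketched directly. The data are the ten observations from the arithmetic mean example earlier in the deck; the function name is illustrative, and other quartile conventions (including some used by statistical software) can give slightly different values.

```python
from statistics import median


def quartiles_by_halves(data):
    """Q1, Q2, Q3 as medians of the lower half, whole set and upper half."""
    ordered = sorted(data)          # rank-order the data first
    half = len(ordered) // 2        # needs at least two observations
    lower_half = ordered[:half]
    upper_half = ordered[-half:]    # leaves out the middle item when n is odd
    return median(lower_half), median(ordered), median(upper_half)


# The ten observations (in days) from the arithmetic mean example.
days = [7, 23, 4, 8, 2, 12, 6, 13, 9, 4]

q1, q2, q3 = quartiles_by_halves(days)
print(q1, q2, q3)  # 4 7.5 12
print(q3 - q1)     # 8
```

The extreme value 23 stretches the range of these data to 21, but it has no effect on the interquartile range of 8.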


Deviation Measure : Variance


Population Variance: σ² = Σ(x - μ)² / N

Where x = an observation, μ = population mean, N = number of items in the population.

Deviation Measure : Standard Deviation

Population Standard Deviation: σ = √σ² = √( Σ(x - μ)² / N )

Variance and Standard Deviation Using Group Data


Example:


Sample Variance and Deviation
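The difference between the population and sample measures can be sketched with the ten observations from the arithmetic mean example. This assumes the standard definitions: the population variance divides the sum of squared deviations by N, and the sample variance divides it by n - 1. Treating the same ten values as both a population and a sample is for illustration only.

```python
from statistics import pstdev, pvariance, stdev, variance

# The ten observations (in days) from the arithmetic mean example.
days = [7, 23, 4, 8, 2, 12, 6, 13, 9, 4]
n = len(days)
mean = sum(days) / n

# Sum of squared deviations from the mean.
sum_squares = sum((x - mean) ** 2 for x in days)

population_variance = sum_squares / n    # divide by N
sample_variance = sum_squares / (n - 1)  # divide by n - 1
population_sd = population_variance ** 0.5
sample_sd = sample_variance ** 0.5

print(round(population_variance, 2))  # 33.36
print(round(sample_variance, 2))      # 37.07
print(round(population_sd, 2))        # 5.78
print(round(sample_sd, 2))            # 6.09
```

The standard library functions `pvariance`, `variance`, `pstdev` and `stdev` compute the same four quantities and can be used to cross-check the manual calculation.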


Chebyshev's Theorem

No matter what the shape of the distribution, at least 75% of the values will fall within +/- 2 standard deviations from the mean of the distribution, and at least 89% of the values will lie within +/- 3 standard deviations from the mean.

[Figure: bell-shaped frequency distribution curve]
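Both percentages come from the general form of the theorem: for any k greater than 1, at least 1 - 1/k² of the values lie within k standard deviations of the mean. The sketch below evaluates that bound and, as an illustration, checks it against the ten observations from the arithmetic mean example, treated here as a population.

```python
from statistics import pstdev


def chebyshev_lower_bound(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2


print(chebyshev_lower_bound(2))            # 0.75
print(round(chebyshev_lower_bound(3), 4))  # 0.8889

# Check against the ten observations (in days) from the mean example.
days = [7, 23, 4, 8, 2, 12, 6, 13, 9, 4]
mean = sum(days) / len(days)
sd = pstdev(days)
within_2_sd = sum(1 for x in days if abs(x - mean) <= 2 * sd) / len(days)
print(within_2_sd)  # 0.9, which is at least 0.75 as the theorem guarantees
```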


Coefficient of Variation

- A relative measure of dispersion.
- Independent of the unit in which the measurement has been taken.
- Expressed as a percent.
- Used for comparison between data sets with different units or widely different means.
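A minimal sketch of the calculation follows, assuming the standard definition: coefficient of variation = (standard deviation / mean) × 100. It reuses the ten observations from the arithmetic mean example, treated as a population, and also shows that a change of unit leaves the result unchanged.

```python
from statistics import pstdev


def coefficient_of_variation(data):
    """Population standard deviation as a percentage of the mean."""
    mean = sum(data) / len(data)
    return pstdev(data) / mean * 100


# The ten observations (in days) from the arithmetic mean example.
days = [7, 23, 4, 8, 2, 12, 6, 13, 9, 4]
cv_days = coefficient_of_variation(days)
print(round(cv_days, 1))  # 65.6

# The same observations expressed in hours: the unit changes, the CV does not.
hours = [24 * d for d in days]
cv_hours = coefficient_of_variation(hours)
print(round(cv_hours, 1))  # 65.6
```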


Chapter Review

- Dispersion
- Skewness
- Mean, Median, Mode
- Application
