
Measures of Dispersion

• Prof G R C Nair
Need

• A central value alone does not give a true picture of the distribution.
• Dispersion measures how the values are
scattered around the central value.
• A low value shows they are all clustered
close to the central value.
• A high value shows they are all scattered
away from the central value
Range

• Range = Largest value – Smallest value = (L – S)
• Simple, easy, quick
• Not an accurate or reliable measure: it is not based on all the data and is influenced by extreme values
Percentiles & Quartiles

Given any set of numerical observations,
order them according to magnitude.


The P th percentile in the ordered set is
that value below which lie P % (P percent)
of the observations in the set.


The position of the P th percentile is given
by (n+1)P/100, where n is the total number
of observations in the set.
A large department store collects data on sales made by each of its salespeople. The number of sales made on a given day by each of 20 salespeople is shown in the table that follows, together with the same data sorted by magnitude.
Sales Sorted Sales
9 6
6 9
12 10
10 12
13 13
15 14
16 14
14 15
14 16
16 16
17 16
16 17
24 17
21 18
22 18
18 19
19 20
18 21
20 22
17 24
Percentiles


To find the 90th percentile, determine the data point in position (n+1)P/100 = (20+1)(90/100) = 18.9.

Thus, the percentile is located at the 18.9th position.

The 18th observation is 21, and the 19th observation is 22.

The 90th percentile is a point lying 0.9 of the way from 21 to 22 and is thus 21.9.
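The position rule above can be sketched in a few lines of Python. The function name `percentile` and the clamping of out-of-range positions are assumptions made for this illustration; the slides only define the (n+1)P/100 position and the linear interpolation between neighbouring observations.

```python
def percentile(sorted_values, p):
    """P-th percentile at position (n + 1) * p / 100, interpolating linearly."""
    n = len(sorted_values)
    position = (n + 1) * p / 100           # 1-based position, may be fractional
    position = min(max(position, 1), n)    # assumption: clamp to the data range
    lower = int(position)                  # whole part of the position
    fraction = position - lower            # how far to move toward the next value
    if lower == n:                         # already at the largest observation
        return float(sorted_values[-1])
    below = sorted_values[lower - 1]       # observation at the whole position
    above = sorted_values[lower]           # next observation
    return below + fraction * (above - below)


sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]
p90 = percentile(sorted(sales), 90)
print(round(p90, 2))
```

For the sales data, the position is 18.9, so the function moves 0.9 of the way from the 18th observation to the 19th.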
Quartiles
Quartiles are the percentage points that break
down the ordered data set into quarters.

The first quartile is the 25th percentile. It is
the point below which lie 1/4 of the data.


The second quartile is the 50th percentile. It
is the point below which lie 1/2 of the data. This
is also called the median.


The third quartile is the 75th percentile. It is
the point below which lie 3/4 of the data.


The interquartile range is the difference
between the third and the first quartiles.
Quartile Deviation
• Quartile Deviation = (Q3 – Q1)/2
• For grouped data, Q1 = L1 + (0.25N – C)h/f and Q3 = L3 + (0.75N – C)h/f, where:
• L1 = lower boundary of the first quartile class
• L3 = lower boundary of the third quartile class
• N = total frequency; h = class width
• C = cumulative frequency up to the lower boundary of the quartile class concerned
• f = frequency of the quartile class concerned
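As a rough illustration of the grouped-data formulas, the sketch below applies them to the bulb-life frequency table that appears in the homework example of this deck. The helper name `grouped_quantile` and the tuple layout are assumptions for this example, and the resulting quartiles are not worked out in the slides.

```python
def grouped_quantile(classes, fraction):
    """Quartile for grouped data: L + (fraction * N - C) * h / f.

    classes: ordered list of (lower_boundary, upper_boundary, frequency).
    """
    total = sum(freq for _, _, freq in classes)   # N
    target = fraction * total                     # 0.25N or 0.75N
    cumulative = 0                                # C, frequency below the class
    for lower, upper, freq in classes:
        if cumulative + freq >= target:           # this is the quartile class
            width = upper - lower                 # h
            return lower + (target - cumulative) * width / freq
        cumulative += freq
    raise ValueError("fraction must lie between 0 and 1")


bulbs = [(500, 700, 5), (700, 900, 11), (900, 1100, 26),
         (1100, 1300, 10), (1300, 1500, 8)]
q1 = grouped_quantile(bulbs, 0.25)
q3 = grouped_quantile(bulbs, 0.75)
quartile_deviation = (q3 - q1) / 2
print(round(q1, 2), round(q3, 2), round(quartile_deviation, 2))
```

Here N = 60, so Q1 falls in the 700-900 class (C = 5, f = 11) and Q3 in the 1100-1300 class (C = 42, f = 10).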
Example
Quartiles of the 20 sorted sales values, with position = (n+1)P/100:

Measure          Position               Value
First Quartile   (20+1)(1/4) = 5.25     13 + (.25)(1) = 13.25
Median           (20+1)(1/2) = 10.5     16 + (.5)(0) = 16
Third Quartile   (20+1)(3/4) = 15.75    18 + (.75)(1) = 18.75
Range and Interquartile Range
Sales   Sorted Sales   Rank
9       6              1     Minimum
6       9              2
12      10             3
10      12             4
13      13             5
15      14             6
16      14             7
14      15             8
14      16             9
16      16             10
17      16             11
16      17             12
24      17             13
21      18             14
22      18             15
18      19             16
19      20             17
18      21             18
20      22             19
17      24             20    Maximum

Range = Maximum – Minimum = 24 – 6 = 18
First Quartile: Q1 = 13 + (.25)(1) = 13.25
Third Quartile: Q3 = 18 + (.75)(1) = 18.75
Interquartile Range: Q3 – Q1 = 18.75 – 13.25 = 5.5
Quartile Deviation: (Q3 – Q1)/2 = 2.75
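The range, interquartile range and quartile deviation for the sales data can be reproduced with a short Python sketch. The helper `value_at` is a hypothetical name introduced here; it simply reads off the value at a fractional 1-based position in the sorted list.

```python
def value_at(sorted_values, position):
    """Value at a fractional 1-based position, by linear interpolation."""
    lower = int(position)
    fraction = position - lower
    below = sorted_values[lower - 1]
    above = sorted_values[min(lower, len(sorted_values) - 1)]
    return below + fraction * (above - below)


sales = sorted([9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
                17, 16, 24, 21, 22, 18, 19, 18, 20, 17])
n = len(sales)

data_range = sales[-1] - sales[0]            # largest minus smallest
q1 = value_at(sales, (n + 1) * 25 / 100)     # position 5.25
q3 = value_at(sales, (n + 1) * 75 / 100)     # position 15.75
iqr = q3 - q1                                # interquartile range
quartile_deviation = iqr / 2

print(data_range, q1, q3, iqr, quartile_deviation)
```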
Mean deviation

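Mean deviation = $\frac{\sum |x - \bar{x}|}{N}$

For grouped data, mean deviation = $\frac{\sum f\,|x - \bar{x}|}{N}$

Simple and easy to compute, and it considers all the data.

But it is less reliable because it ignores the sign of the deviations, and it is not conducive to further mathematical treatment.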
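A minimal Python sketch of both forms of the mean deviation follows. The function names are assumptions for this illustration, and the four ages used as input are borrowed from the family example that appears later in this deck; the slides do not compute a mean deviation for them.

```python
def mean_deviation(values):
    """Average absolute deviation from the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


def grouped_mean_deviation(midpoints, frequencies):
    """Frequency-weighted version, using class mid values as x."""
    total = sum(frequencies)
    mean = sum(f * m for m, f in zip(midpoints, frequencies)) / total
    return sum(f * abs(m - mean) for m, f in zip(midpoints, frequencies)) / total


ages = [2, 18, 34, 42]
md = mean_deviation(ages)
# With every frequency equal to 1, the grouped form reduces to the simple form.
md_grouped = grouped_mean_deviation(ages, [1, 1, 1, 1])
print(md, md_grouped)
```

The absolute deviations from the mean of 24 are 22, 6, 10 and 18, so both calls give 56/4.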
Variance

The variance is the average of the squared deviations from the population mean.

All values are used in the calculation.

Unlike the range, it does not depend only on the two extreme values, although outliers still inflate it because deviations are squared.

The units are awkward: the square of the original units.
Variance

• Population variance: $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$

• For grouped data: $\sigma^2 = \frac{\sum f\,(X - \mu)^2}{N}$

• Sample variance: $s^2 = \frac{\sum (X - \bar{x})^2}{n - 1}$

• For grouped data: $s^2 = \frac{\sum f\,(X - \bar{x})^2}{n - 1}$
• The ages of a family are: 2, 18, 34, 42. What is the variance?

$\mu = \frac{\sum X}{N} = \frac{96}{4} = 24$

$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{(2 - 24)^2 + \dots + (42 - 24)^2}{4} = \frac{944}{4} = 236$
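The family-ages calculation can be checked with a short Python sketch. It computes the population variance (divide by N) as in the example, and also the sample variance (divide by n - 1) for comparison; the sample figure is not part of the slide's example. The cross-check uses `statistics.pvariance` and `statistics.variance` from the standard library.

```python
import statistics

ages = [2, 18, 34, 42]
n = len(ages)
mu = sum(ages) / n                                   # 96 / 4

squared_deviations = [(x - mu) ** 2 for x in ages]   # 484, 36, 100, 324
population_variance = sum(squared_deviations) / n    # divide by N
sample_variance = sum(squared_deviations) / (n - 1)  # divide by n - 1

print(mu, population_variance, round(sample_variance, 2))
# Cross-check against the standard library.
print(statistics.pvariance(ages), statistics.variance(ages))
```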
Standard Deviation

• Standard deviation is the square root of the variance.

Find the standard deviation for the last problem.

$\sigma = \sqrt{\sigma^2} = \sqrt{236} \approx 15.36$
Short cut Formula

• Std deviation: $\sigma = i \times \left[ \frac{\sum f d^2}{N} - \left( \frac{\sum f d}{N} \right)^2 \right]^{1/2}$

• Where, d = (m - A) / i
• i = Class interval
• m = mid value of class
• A = Assumed Mean
Example (HW, answer hidden)

• A factory produces bulbs, whose length of life was found to be as given in the table below. Find the mean life and the standard deviation by the normal method and by the short-cut method.

Life         No. of lamps
500-700      5
700-900      11
900-1100     26
1100-1300    10
1300-1500    8
• Mean = 1016.67
• Assumed mean = 1000

X       d     d²    f     fd     fd²
600     -2    4     5     -10    20
800     -1    1     11    -11    11
1000    0     0     26    0      0
1200    1     1     10    10     10
1400    2     4     8     16     32
Total               60    5      73

Std deviation = 200 {(73/60) - (5/60)²}^(1/2) ≈ 219.97
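The bulb-life example can be worked both ways in Python as a check. This sketch assumes the usual companion formula for the mean under the short-cut method, mean = A + i × Σfd / N, which the slides use implicitly; variable names are chosen for this illustration.

```python
import math

# (lower boundary, upper boundary, number of lamps)
classes = [(500, 700, 5), (700, 900, 11), (900, 1100, 26),
           (1100, 1300, 10), (1300, 1500, 8)]
A = 1000   # assumed mean
i = 200    # class interval

mids = [(lo + hi) / 2 for lo, hi, _ in classes]   # m, the class mid values
freqs = [f for _, _, f in classes]
N = sum(freqs)

# Short-cut method: work with d = (m - A) / i.
d = [(m - A) / i for m in mids]
sum_fd = sum(f * di for f, di in zip(freqs, d))
sum_fd2 = sum(f * di ** 2 for f, di in zip(freqs, d))
mean_short = A + i * sum_fd / N
std_short = i * math.sqrt(sum_fd2 / N - (sum_fd / N) ** 2)

# Normal (direct) method: deviations of mid values from the mean.
mean_direct = sum(f * m for f, m in zip(freqs, mids)) / N
std_direct = math.sqrt(
    sum(f * (m - mean_direct) ** 2 for f, m in zip(freqs, mids)) / N
)

print(round(mean_short, 2), round(std_short, 2))
print(round(mean_direct, 2), round(std_direct, 2))
```

The two routes are algebraically equivalent, so the printed pairs should agree up to floating-point rounding.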
Coefficient of Variation

• The coefficient of variation is the ratio of the standard deviation to the arithmetic mean, expressed as a percentage:

$CV = \frac{\sigma}{\mu} \times 100\%$
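As a small illustration, the coefficient of variation for the family-ages data used earlier in the deck (2, 18, 34, 42) can be computed as follows. The slides do not work this number out; the function name and the use of the population standard deviation are assumptions of this sketch.

```python
import math


def coefficient_of_variation(values):
    """Population standard deviation as a percentage of the mean."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return std / mean * 100


ages = [2, 18, 34, 42]
cv = coefficient_of_variation(ages)
print(round(cv, 2))
```

Because the result is a pure percentage, it allows the spread of data sets measured in different units, or with very different means, to be compared.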
Combined Values

• Combined Mean of 2 groups

$\mu = \frac{\mu_1 N_1 + \mu_2 N_2}{N_1 + N_2}$

Combined Variance

$\sigma^2 = \frac{N_1 \sigma_1^2 + N_2 \sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}$

where $d_1 = \mu_1 - \mu$ and $d_2 = \mu_2 - \mu$
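One way to sanity-check the combined formulas is to split a data set in two, combine the group statistics, and compare with the statistics of the whole. The sketch below does this for the family ages (2, 18, 34, 42); the split into two groups and the function names are assumptions made purely for illustration.

```python
def mean_and_variance(values):
    """Mean and population variance of one group."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    return mean, variance


def combine(n1, mean1, var1, n2, mean2, var2):
    """Combined mean and variance of two groups from their summaries."""
    mean = (mean1 * n1 + mean2 * n2) / (n1 + n2)
    d1 = mean1 - mean          # shift of group 1 from the combined mean
    d2 = mean2 - mean          # shift of group 2 from the combined mean
    variance = (n1 * var1 + n2 * var2 + n1 * d1 ** 2 + n2 * d2 ** 2) / (n1 + n2)
    return mean, variance


group1, group2 = [2, 18], [34, 42]        # hypothetical split of the ages
m1, v1 = mean_and_variance(group1)
m2, v2 = mean_and_variance(group2)
combined_mean, combined_var = combine(len(group1), m1, v1, len(group2), m2, v2)
whole_mean, whole_var = mean_and_variance(group1 + group2)

print(combined_mean, combined_var)
print(whole_mean, whole_var)
```

The d terms account for the spread between the group means, which the within-group variances alone would miss.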
Chebyshev’s Rule

• Irrespective of the shape of the distribution curve, at least 75% of values will fall within ±2 standard deviations of the mean, and at least 88.9% (about 89%) within ±3 standard deviations.
• In general, the percentage of data within ±k standard deviations (k times s) of the mean will be at least (1 - 1/k²) × 100, for k > 1.
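The bound is easy to tabulate, and it can be compared with an actual data set. The sketch below uses the 20 sales figures from earlier in the deck and the population standard deviation; the function names are assumptions for this illustration.

```python
import math


def chebyshev_lower_bound(k):
    """Minimum percentage of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return (1 - 1 / k ** 2) * 100


def percent_within(values, k):
    """Observed percentage of values within k population std devs of the mean."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    inside = sum(1 for x in values if abs(x - mean) <= k * std)
    return inside / n * 100


sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]

for k in (2, 3):
    print(k, round(chebyshev_lower_bound(k), 1), percent_within(sales, k))
```

The observed percentages can exceed the bound by a wide margin, since the rule is a worst-case guarantee that holds for any distribution shape.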
