
Chapter Three

Measures of Central Tendency and Location


3.1 Objectives of Measuring Central Tendency
The most important aspect of studying the distribution of a sample of measurements is the position of the
central value, that is, a representative value about which the measurements are distributed. It is often
convenient to have one figure that is representative of each group; this figure is known as the average of
the group. If the numbers in the group are arranged in order of magnitude, the averages tend to fall around
the central position in the group, so averages are called measures of central tendency. In short, any
measure intended to represent the center of a data set is called a measure of central tendency.
The most important objectives of measuring central tendency are:
- to determine a single value around which the other data concentrate,
- to summarize or reduce the volume of the data,
- to facilitate comparison within one group or between groups of data.
Desirable properties of a measure of central tendency
We say a measure of central tendency is best if it possesses most of the following properties. It should:
- be simple to understand and easy to calculate and interpret,
- exist and be unique,
- be rigidly defined by a mathematical formula,
- be based on all observations,
- not be seriously affected by extreme observations,
- be capable of further statistical analysis and/or algebraic manipulation.
3.2 The Summation Notation (Σ)
Let a data set consist of a number of observations, represented by $x_1, x_2, \ldots, x_n$, where $n$ (the last subscript) denotes the number of observations in the data and $x_i$ is the $i$th observation. Then the sum is written

$$x_1 + x_2 + \cdots + x_n = \sum_{i=1}^{n} x_i$$

For instance, a data set consisting of the six measurements 21, 13, 54, 46, 32 and 37 is represented by $x_1, x_2, x_3, x_4, x_5$ and $x_6$, where $x_1 = 21$, $x_2 = 13$, $x_3 = 54$, $x_4 = 46$, $x_5 = 32$ and $x_6 = 37$.
Their sum becomes

$$\sum_{i=1}^{6} x_i = 21 + 13 + 54 + 46 + 32 + 37 = 203.$$

Similarly,

$$x_1^2 + x_2^2 + \cdots + x_n^2 = \sum_{i=1}^{n} x_i^2$$

Some Properties of the Summation Notation

1. $\sum_{i=1}^{n} c = n \cdot c$, where $c$ is a constant number.

2. $\sum_{i=1}^{n} b\,x_i = b \sum_{i=1}^{n} x_i$, where $b$ is a constant number.

3. $\sum_{i=1}^{n} (a + b\,x_i) = n \cdot a + b \sum_{i=1}^{n} x_i$, where $a$ and $b$ are constant numbers.

4. $\sum_{i=1}^{n} (x_i + y_i) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i$
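The four summation properties can be checked numerically. The following is a minimal sketch using the six measurements from the example; the second series `y` and the constants `a`, `b` and `c` are hypothetical values chosen only for illustration.

```python
# Data from the summation example: x_1, ..., x_6
x = [21, 13, 54, 46, 32, 37]
y = [1, 2, 3, 4, 5, 6]   # hypothetical second series, used for property 4 only
a, b, c = 2, 3, 5        # hypothetical constants
n = len(x)

total = sum(x)  # the sum of x_i for i = 1, ..., n

# Property 1: summing a constant c over n terms gives n * c
assert sum(c for _ in range(n)) == n * c
# Property 2: a constant factor can be taken outside the sum
assert sum(b * xi for xi in x) == b * total
# Property 3: sum of (a + b * x_i) equals n * a + b * sum of x_i
assert sum(a + b * xi for xi in x) == n * a + b * total
# Property 4: the sum of a term-by-term sum is the sum of the sums
assert sum(xi + yi for xi, yi in zip(x, y)) == sum(x) + sum(y)

print(total)
```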
3.3 Types of Measures of Central Tendency
Several types of averages, or measures of central tendency, can be defined. The most common are
- the arithmetic mean or the mean
- the geometric mean
- the harmonic mean
- the mode
- the median
The choice of average (measure of central tendency) depends on which one best represents the property
under discussion.
3.3.1. The Arithmetic Mean (The Mean)
The arithmetic mean is defined as the sum of the measurements of the items divided by the total number of
items. For raw data $x_1, x_2, \ldots, x_n$ this is $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} x_i$.
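Arithmetic Mean for Ungrouped Frequency Distribution
When the data are arranged or given in the form of an ungrouped frequency distribution, in which the values $x_1, x_2, \ldots, x_k$ occur with frequencies $f_1, f_2, \ldots, f_k$, the formula for the mean is

$$\bar{X} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_k x_k}{f_1 + f_2 + \cdots + f_k} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$$

Note that $\sum_{i=1}^{k} f_i = n$, the total number of observations.
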
Example 1: You measure the body lengths (in inches) of 10 full-term infants at birth and record the
following:
17.5, 19.5, 17.5, 19, 20, 21, 18, 19.5, 18, 10.75
Compute the sample mean length of the infants for these data.
Example 2: Monthly incomes of fourth year regular students are given in the following frequency
distribution.

Monthly income (birr)   54.5   64.5   74.5   84.5   94.5   104.5   114.5
Number of students      6      15     25     13     ...

Compute the mean for these data.

Arithmetic Mean for Grouped Frequency Distribution

If data are given in the form of a continuous frequency distribution, the sample mean can be computed as

$$\bar{X} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_k x_k}{f_1 + f_2 + \cdots + f_k} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$$

where $x_i$ = the class mark of the $i$th class, $i = 1, 2, \ldots, k$; $f_i$ = the frequency of the $i$th class; and $k$ = the number of classes.
Note that $\sum_{i=1}^{k} f_i = n$ = the total number of observations.
Example: The following table gives the daily wages of laborers. Calculate the average daily wage paid to
a laborer.

Wages in birr        11-13   13-15   15-17   17-19   19-21   21-23   23-25
Number of laborers   3       4       5       6       6       4       3
Properties of the Arithmetic Mean


- The sum of the deviations of the items from their arithmetic mean is zero. This means the algebraic sum of the deviations of a set of numbers $x_1, x_2, \ldots, x_n$ from their mean $\bar{x}$ is zero. That is,

$$\sum_{i=1}^{n} (x_i - \bar{x}) = 0$$

- The sum of the squares of the deviations of a set of observations from any number, say $A$, is the least only when $A = \bar{X}$. That is,

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 \le \sum_{i=1}^{n} (x_i - A)^2$$

- When a set of observations is divided into $k$ groups and $\bar{x}_1$ is the mean of $n_1$ observations of group 1, $\bar{x}_2$ is the mean of $n_2$ observations of group 2, ..., $\bar{x}_k$ is the mean of $n_k$ observations of group $k$, then the combined mean, denoted by $\bar{X}_c$, of all observations taken together is given by

$$\bar{X}_c = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \cdots + n_k \bar{x}_k}{n_1 + n_2 + \cdots + n_k} = \frac{\sum_{i=1}^{k} n_i \bar{x}_i}{\sum_{i=1}^{k} n_i}$$


- If a wrong figure has been used in calculating the mean, we can correct the mean if we know the correct figure
that should have been used. Let
$X_{wr}$ denote the wrong figure used in calculating the mean,
$X_c$ be the correct figure that should have been used, and
$\bar{X}_{wr}$ be the wrong mean calculated using $X_{wr}$;
then the correct mean, $\bar{X}_{correct}$, is given by

$$\bar{X}_{correct} = \frac{n\bar{X}_{wr} + X_c - X_{wr}}{n}$$

- If the mean of $x_1, x_2, \ldots, x_n$ is $\bar{x}$, then
a) the mean of $x_1 \pm k, x_2 \pm k, \ldots, x_n \pm k$ will be $\bar{x} \pm k$;
b) the mean of $kx_1, kx_2, \ldots, kx_n$ will be $k\bar{x}$.
Example 1: Last year there were three sections taking the Stat 273 course in a certain university. At the end
of the semester, the three sections got average marks of 80, 83 and 76. There were 28, 32 and 35 students
in the sections respectively. Find the mean mark for all the students.
Solution:

$$\bar{x}_c = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + n_3 \bar{x}_3}{n_1 + n_2 + n_3} = \frac{28(80) + 32(83) + 35(76)}{28 + 32 + 35} = \frac{7556}{95} \approx 79.54$$
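The combined-mean calculation for the three sections can be reproduced with a short function. This is only an illustrative sketch; the name `combined_mean` is an assumption, not notation from the notes.

```python
def combined_mean(sizes, means):
    """Combined mean of k groups with sizes n_i and group means xbar_i."""
    # Weighted total of the group means divided by the total number of observations
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)


section_sizes = [28, 32, 35]
section_means = [80, 83, 76]

overall = combined_mean(section_sizes, section_means)
print(round(overall, 2))  # 7556 / 95, about 79.54
```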
Example 2: The average weight of 10 students was calculated to be 65 kg, but later it was discovered that
one measurement was misread as 40 kg instead of 80 kg. Calculate the corrected average weight.
Solution:

$$\bar{X}_{correct} = \frac{n\bar{X}_{wr} + X_c - X_{wr}}{n} = \frac{10(65) + 80 - 40}{10} = 69 \text{ kg}$$

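The correction formula translates directly into code. The sketch below is a hedged illustration; the function and parameter names are hypothetical.

```python
def corrected_mean(n, wrong_mean, wrong_value, correct_value):
    """Correct a mean of n observations after one value was misrecorded."""
    # n * wrong_mean recovers the (wrong) total; swap the wrong value for the correct one
    return (n * wrong_mean + correct_value - wrong_value) / n


corrected = corrected_mean(n=10, wrong_mean=65, wrong_value=40, correct_value=80)
print(corrected)  # 69.0
```

The same idea, adjusting the total and then dividing by the new count, also handles cases where an observation is removed rather than replaced.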

Exercise: The average score on the mid-term examination of 25 students was 75.8 out of 100. After the
mid-term exam, however, a student whose score was 41 out of 100 dropped the course. What is the
average/mean score among the 24 students?
Weighted Arithmetic Mean
In finding the arithmetic mean, all items were assumed to be of equal importance. When different items
must be given different importance, we use a weighted average: weights are assigned to each item in
proportion to its relative importance.
If $x_1, x_2, \ldots, x_k$ represent values of the items and $w_1, w_2, \ldots, w_k$ are the corresponding weights, then the
weighted mean, $\bar{X}_w$, is given by

$$\bar{X}_w = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_k x_k}{w_1 + w_2 + \cdots + w_k} = \frac{\sum_{i=1}^{k} w_i x_i}{\sum_{i=1}^{k} w_i}$$


Example: A student's final marks in Mathematics, Physics, Chemistry and Biology are 82, 80,
90 and 70 respectively. If the respective credits received for these courses are 3, 5, 3 and 1, determine the
approximate average mark the student has got for one course.
Solution: We use a weighted arithmetic mean, the weight associated with each course being the
number of credits received for that course.

$x_i$   82   80   90   70
$w_i$   3    5    3    1

$$\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i} = \frac{(3 \times 82) + (5 \times 80) + (3 \times 90) + (1 \times 70)}{3 + 5 + 3 + 1} = \frac{986}{12} \approx 82.17$$

Therefore, the average mark of the student for one course is approximately 82.
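The weighted mean of the four marks can be checked with a few lines of code. This is a minimal sketch; `weighted_mean` is a hypothetical helper name.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w_i * x_i) / sum(w_i)."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


marks = [82, 80, 90, 70]   # Mathematics, Physics, Chemistry, Biology
credits = [3, 5, 3, 1]     # credit hours used as weights

average_mark = weighted_mean(marks, credits)
print(round(average_mark, 2))  # 986 / 12, about 82.17
```

With equal weights the function reduces to the ordinary arithmetic mean, which for these marks is 80.5.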
Merits of Arithmetic Mean
- The arithmetic mean is rigidly defined by a mathematical formula, so its value is always definite.
- It is calculated based on all observations.
- The arithmetic mean is simple to calculate and easy to understand. It does not need arraying (arranging in
increasing or decreasing order) of the data.
- The arithmetic mean is also capable of further algebraic treatment.
- It affords a good standard of comparison.
Drawbacks of Arithmetic Mean
- It is highly affected by extreme (abnormal) observations in the series. For instance, the monthly
incomes of three boys are 37 birr, 53 birr and 48 birr and that of their father is 1026 birr. The average
income for one of these four people is (37 + 53 + 48 + 1026)/4 = 291 birr, which is not at all a
representative figure.
- It can be a number which does not exist in the series.
- It sometimes gives results which appear almost absurd. For example, we may get an
average of 3.6 children per family.
- It gives greater importance to bigger items of a series and lesser importance to smaller items. That
means it is an upward-biased measure.
- It cannot be calculated for open-ended classes.
3.3.2 Geometric Mean (G.M)
The geometric mean is the $n$th root of the product of $n$ positive values. If $X_1, X_2, \ldots, X_n$ are $n$ positive
values, then their geometric mean is

$$G.M = (X_1 X_2 \cdots X_n)^{1/n}$$

The geometric mean is usually used for:
- average rates of change
- ratios
- percentage distributions
- logarithmic distributions.
When the number of observations is more than two, extracting the $n$th root directly may be tedious. In
that case the calculation can be simplified by taking common logarithms (base ten):

$$G.M = \sqrt[n]{x_1 x_2 \cdots x_n} = (x_1 x_2 \cdots x_n)^{1/n}$$

Taking logarithms of both sides,

$$\log(G.M) = \frac{1}{n}\log(x_1 x_2 \cdots x_n) = \frac{1}{n}(\log x_1 + \log x_2 + \cdots + \log x_n) = \frac{1}{n}\sum_{i=1}^{n} \log x_i$$

$$G.M = \text{antilog}\left(\frac{1}{n}\sum_{i=1}^{n} \log x_i\right)$$

This shows that the logarithm of the G.M is the mean of the logarithms of the individual observations.
Example 1: The ratios of prices in 1999 to those in 2000 for 4 commodities were 0.92, 1.25, 1.75 and 0.85.
Find the average price ratio by means of the geometric mean.
Solution:

$$G.M = \text{antilog}\left(\frac{\sum \log X}{n}\right) = \text{antilog}\left(\frac{\log 0.92 + \log 1.25 + \log 1.75 + \log 0.85}{4}\right)$$

$$= \text{antilog}\left(\frac{(0.9638 - 1) + 0.0969 + 0.2430 + (0.9294 - 1)}{4}\right) = \text{antilog}(0.0583) \approx 1.14$$

What is the arithmetic mean of the above values?

$$A.M = \frac{0.92 + 1.25 + 1.75 + 0.85}{4} = \frac{4.77}{4} \approx 1.19$$
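The logarithmic route to the geometric mean can be sketched in code. The example below uses the four ratios as they appear in the worked solution (0.92, 1.25, 1.75 and 0.85); the function name `geometric_mean` is an illustrative assumption.

```python
import math


def geometric_mean(values):
    """Geometric mean computed as the antilog of the mean of base-10 logs."""
    if any(v <= 0 for v in values):
        raise ValueError("the geometric mean requires strictly positive values")
    mean_log = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log  # antilog of the mean logarithm


ratios = [0.92, 1.25, 1.75, 0.85]
gm = geometric_mean(ratios)
am = sum(ratios) / len(ratios)

print(round(gm, 2))  # about 1.14
print(round(am, 4))  # about 1.1925
```

For these ratios the geometric mean is slightly smaller than the arithmetic mean, as expected for positive values that are not all equal.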


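Note that
1. When the observed values $x_1, x_2, \ldots, x_k$ have the corresponding frequencies $f_1, f_2, \ldots, f_k$
respectively, the geometric mean is obtained by

$$G.M = \left(x_1^{f_1} x_2^{f_2} \cdots x_k^{f_k}\right)^{1/n} = \text{antilog}\left(\frac{1}{n}\sum_{i=1}^{k} f_i \log x_i\right), \quad \text{where } n = \sum_{i=1}^{k} f_i$$

2. Whenever the frequency distribution is grouped (continuous), the class marks of the class
intervals are taken as the $x_i$ and the above formula can be used, that is,

$$G.M = \left(m_1^{f_1} m_2^{f_2} \cdots m_k^{f_k}\right)^{1/n} = \text{antilog}\left(\frac{1}{n}\sum_{i=1}^{k} f_i \log m_i\right)$$

where $n = \sum_{i=1}^{k} f_i$ and $m_i$ is the class mark of the $i$th class.
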

Properties of geometric mean


a. Its calculation is not easy.
b. It involves all observations in its computation.
c. It may not be defined if even a single observation is negative.
d. If the value of one observation is zero, the geometric mean becomes zero.
3.3.3 Harmonic mean (H.M)
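The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of the individual values. If
$X_1, X_2, X_3, \ldots, X_n$ are $n$ values, then their harmonic mean is

$$H.M = \frac{n}{\dfrac{1}{X_1} + \dfrac{1}{X_2} + \cdots + \dfrac{1}{X_n}} = \frac{n}{\sum_{i=1}^{n} \dfrac{1}{X_i}}$$
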

Example
Find the harmonic mean of the values 2, 3 and 6.

$$H.M = \frac{3}{\frac{1}{2} + \frac{1}{3} + \frac{1}{6}} = \frac{3}{\frac{3 + 2 + 1}{6}} = \frac{3 \times 6}{6} = 3$$


The harmonic mean is used to average rates rather than simple values. It is usually appropriate in
averaging kilometers per hour.
Example: A driver covers a distance of 300 km at an average speed of 60 km/hr and makes the return trip at an
average speed of 50 km/hr. What is his average speed for the total distance?
Solution

Trip    Distance   Average speed   Time taken
1st     300 km     60 km/hr        5 hrs
2nd     300 km     50 km/hr        6 hrs
Total   600 km     ---------       11 hrs

Average speed for the whole distance = Total distance / Total time taken = 600 km / 11 hrs = 54.55 km/hr.

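Using the harmonic mean formula,

$$H.M = \frac{2}{\frac{1}{60} + \frac{1}{50}} = \frac{6000}{110} = 54.55 \text{ km/hr}$$

Note that $A.M = \frac{60 + 50}{2} = 55$ km/hr and $G.M = \sqrt{60 \times 50} \approx 54.77$ km/hr.

In general, $A.M \ge G.M \ge H.M$.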
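The three means of the two speeds, and the direct distance-over-time calculation, can be compared using Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later). This is a small illustrative check rather than part of the original solution.

```python
import statistics

speeds = [60, 50]  # km/hr on the outward and return trips (equal distances)

am = statistics.mean(speeds)
gm = statistics.geometric_mean(speeds)
hm = statistics.harmonic_mean(speeds)

# Direct calculation: total distance divided by total time
total_distance = 300 + 300
total_time = 300 / 60 + 300 / 50
direct_average = total_distance / total_time

print(round(am, 2), round(gm, 2), round(hm, 2), round(direct_average, 2))
```

The harmonic mean agrees with the direct calculation because both legs cover the same distance; with unequal distances a weighted harmonic mean would be needed.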


Note that
For simple frequency data the harmonic mean is calculated by using the following formula:

$$H.M = \text{Reciprocal of } \frac{\sum \frac{f_i}{x_i}}{n} = \frac{n}{\sum \frac{f_i}{x_i}}, \quad \text{where } n = \sum f_i \text{ is the total number of observations.}$$

Properties of harmonic mean
i. It is based on all observations in a distribution.
ii. It is used in situations where small weight is given to larger observations and larger
weight to smaller observations.
iii. It is difficult to calculate and understand.
iv. It is an appropriate measure of central tendency in situations where the data are ratios, speeds or
rates.
3.3.4 The Median
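The median of a set of items (numbers) arranged in order of magnitude (i.e. in an array form) is the middle
value or the arithmetic mean of the two middle values. We shall denote the median of $x_1, x_2, \ldots, x_n$ by $\tilde{x}$. For
ungrouped data the median is obtained by

$$\tilde{x} = \begin{cases} x_{\frac{n+1}{2}} & \text{if the number of items, } n, \text{ is odd} \\[6pt] \dfrac{1}{2}\left(x_{\frac{n}{2}} + x_{\frac{n}{2}+1}\right) & \text{if the number of items, } n, \text{ is even} \end{cases}$$

where the subscripts refer to positions in the ordered data.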
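Example 1. Find the median of the following data.
a) 3,8,4,7,7,5,6,8,7,4,6,8,9,7,6
Arrange the given data in either increasing or decreasing order:
3,4,4,5,6,6,6,7,7,7,7,8,8,8,9
Here n = 15 is odd, so the median is the 8th value: Median = 7

b) 3,4,4,5,6,6,6,7,7,7,7,8,8,8
Here n = 14 is even, so the median is the mean of the 7th and 8th values:

$$\text{Median} = \frac{6 + 7}{2} = 6.5$$
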

For grouped data the median, obtained by the interpolation method, is given by

$$\tilde{X} = L_{med} + \frac{W}{f_{med}}\left(\frac{n}{2} - F\right)$$

where
$L_{med}$ = lower class boundary of the median class,
$F$ = sum of frequencies of all classes lower than the median class (in other words, the cumulative
frequency preceding the median class),
$f_{med}$ = frequency of the median class, and
$W$ = class width.
The median class is the class with the smallest cumulative frequency greater than or equal to $\frac{n}{2}$. It can be
located by counting $\frac{n}{2}$ of the frequencies beginning from the lowest class.


Example 2. Find the median wage of the following distribution.

Wages (in Rs)    2000-3000   3000-4000   4000-5000   5000-6000   6000-7000
No. of workers   3           5           20          10          5

Solution

Wages (in Rs)   No. of workers   cf
2000-3000       3                3
3000-4000       5                8
4000-5000       20               28
5000-6000       10               38
6000-7000       5                43

Here N/2 = 43/2 = 21.5.
The smallest cumulative frequency greater than or equal to 21.5 is 28 and the corresponding class is 4000-5000, so the median class is 4000-5000.

$$\text{median} = 4000 + \frac{1000}{20}(21.5 - 8) = 4675$$

So the median wage is Rs 4,675.
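The interpolation formula for the grouped median can be sketched as a short function. The function name and argument layout are assumptions made for illustration, and the frequency of 3 for the first class is inferred from the cumulative frequencies used in the solution (8 below the median class, 43 in total).

```python
def grouped_median(lower_bounds, width, freqs):
    """Median of a grouped distribution by linear interpolation.

    lower_bounds: lower class boundary of each class, in increasing order
    width:        common class width W
    freqs:        class frequencies
    """
    n = sum(freqs)
    half = n / 2
    cumulative = 0  # F: cumulative frequency before the current class
    for lower, f in zip(lower_bounds, freqs):
        if cumulative + f >= half:
            # First class whose cumulative frequency reaches n/2 is the median class
            return lower + (width / f) * (half - cumulative)
        cumulative += f
    raise ValueError("frequencies must contain at least one positive value")


median_wage = grouped_median(
    lower_bounds=[2000, 3000, 4000, 5000, 6000],
    width=1000,
    freqs=[3, 5, 20, 10, 5],
)
print(median_wage)  # 4675.0
```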

Merits of median
- The median is a positional average and hence it is not influenced by extreme values.
- The median can be calculated even in the case of open-ended intervals.
- It gives the best result in a study of those phenomena which are incapable of direct quantitative
measurement. Example: intelligence.
Demerits of median
- It is not capable of further algebraic treatment.
- It is not a good representative of the data if the number of items (data) is small.
- The arrangement of items in order of magnitude is sometimes a very tedious process if the number of items
is very large.
3.3.5 The Mode

The mode or the modal value is the most frequently occurring score/observation in a series and is denoted by $\hat{x}$.
Note that the mode may not exist in the series or, even if it does exist, it may not be unique.

Example 1. a) Find the mode for the following exam results (10%) of 15 students:
3,8,6,5,8,7,8,6,7,4,7,5,7,9,
The mode is 7.
b) 4, 5, 7, 8, 9
Every value occurs once, so there is no mode.
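Finding the mode of ungrouped data amounts to counting occurrences. The sketch below is one possible implementation; it adopts the convention used in part (b), that a series in which all distinct values occur equally often has no mode, and it uses only the 14 scores actually printed in part (a).

```python
from collections import Counter


def modes(values):
    """Return the list of modal values; an empty list means there is no mode."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    # Convention: if all distinct values are equally frequent, no mode exists
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == top)


scores_a = [3, 8, 6, 5, 8, 7, 8, 6, 7, 4, 7, 5, 7, 9]  # scores as printed in part (a)
scores_b = [4, 5, 7, 8, 9]

print(modes(scores_a))  # [7]
print(modes(scores_b))  # []
```

Returning a list makes it possible to report more than one mode when the mode is not unique.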

For grouped data, the mode is found by the following formula:

$$\hat{x} = L_{mod} + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) W$$

where
$L_{mod}$ = lower class boundary of the modal class,
$\Delta_1$ = the difference between the frequency of the modal class and the next lower class,
$\Delta_2$ = the difference between the frequency of the modal class and the next higher class, and
$W$ = the class width.

The modal class is the class with the highest frequency in the distribution.

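Example 2.
Find the mode for the frequency distribution given below.

Class interval   3-6   6-9   9-12   12-15
Frequency        ...   ...   10     ...

The modal class is 9-12, with $L_{mod} = 9$, $W = 3$, $\Delta_1 = 2$ and $\Delta_2 = 7$, so

$$\text{mode} = L_{mod} + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times W = 9 + \frac{2}{2 + 7} \times 3 = 9 + \frac{2}{9} \times 3 = \frac{29}{3} \approx 9.67$$
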
Merits of mode
- Mode is not affected by extreme values.
- Mode can be calculated even in the case of open-ended intervals, and it is not necessary to know all
observations.
Demerits of mode
- Mode may not exist in the series and if it exists it may not be a unique value.
- It does not fulfill most of the requirements of a good measure of central tendency
- It may be unrepresentative in many cases.
3.4 Measures of Location
Quantiles
Quantiles are values which divide the data set, arranged in order of magnitude, into a certain number of equal parts. They
are averages of position (non-central tendency). Some of these quantiles are quartiles, deciles and
percentiles.

I. Quartiles: are values which divide the data set into four equal parts, denoted by $Q_1$, $Q_2$ and $Q_3$. The first
quartile is also called the lower quartile and the third quartile is the upper quartile. The second quartile is the
median.
For ungrouped data:
Let $Q_j$ be the $j$th quartile value for $j = 1, 2, 3$. Then

$$Q_j = \left[\frac{j}{4}(n + 1)\right]^{th} \text{ item}; \quad j = 1, 2, 3.$$

In dividing $j(n+1)$ by 4 there may be a remainder. Let $q$ be the quotient and $r$ be the
remainder of the division; then

$$Q_j = q^{th} \text{ value} + \frac{r}{4}\left[(q + 1)^{th} \text{ value} - q^{th} \text{ value}\right]$$


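Example: Find the first, second and third quartiles for the following data (exam results, 10%, of 15
students):
4,8,9,7,6,6,6,7,7,8,8,8,9,9,
After arranging the data in increasing order, the quartiles are located at the following positions:

$$Q_1 = \left[\frac{1}{4}(n + 1)\right]^{th} \text{ value} = \left[\frac{1}{4}(15 + 1)\right]^{th} \text{ value} = 4^{th} \text{ value} = 6$$

$$Q_2 = \left[\frac{2}{4}(n + 1)\right]^{th} \text{ value} = \left[\frac{1}{2}(15 + 1)\right]^{th} \text{ value} = 8^{th} \text{ value}$$

$$Q_3 = \left[\frac{3}{4}(n + 1)\right]^{th} \text{ value} = \left[\frac{3}{4}(15 + 1)\right]^{th} \text{ value} = 12^{th} \text{ value}$$
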
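The positional rule with quotient $q$ and remainder $r$ can be sketched as one function that works for quartiles, deciles and percentiles alike. The function name `quantile`, its `parts` parameter and the eight data values below are hypothetical, chosen so that the remainder is non-zero and the interpolation step is visible.

```python
def quantile(values, j, parts):
    """j-th quantile when the ordered data are split into `parts` equal parts.

    Uses the position j * (n + 1) / parts and interpolates with the
    quotient q and remainder r of that division.
    """
    data = sorted(values)
    n = len(data)
    q, r = divmod(j * (n + 1), parts)  # position = q + r / parts
    if q < 1:
        return data[0]      # position falls before the first item
    if q >= n:
        return data[-1]     # position falls at or beyond the last item
    lower, upper = data[q - 1], data[q]  # q-th and (q+1)-th values (1-based)
    return lower + (r / parts) * (upper - lower)


scores = [2, 4, 5, 7, 8, 10, 11, 13]  # hypothetical data, n = 8

q1 = quantile(scores, 1, 4)     # position 2.25
q2 = quantile(scores, 2, 4)     # position 4.5
q3 = quantile(scores, 3, 4)     # position 6.75
d5 = quantile(scores, 5, 10)    # fifth decile, same position as the median
print(q1, q2, q3, d5)
```

Note that statistical software often uses slightly different interpolation conventions, so results from libraries may differ from this rule in the last step.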

For grouped data
We can apply the following formula:

$$Q_j = L_{Q_j} + \left(\frac{\frac{j \cdot n}{4} - F_{Q_j}}{f_{Q_j}}\right) W; \quad j = 1, 2, 3.$$

where
$Q_j$ = the $j$th quartile which is to be worked out,
$L_{Q_j}$ = lower class boundary of the $j$th quartile class,
$F_{Q_j}$ = sum of frequencies of all classes lower than the $j$th quartile class,
$f_{Q_j}$ = frequency of the $j$th quartile class, and
$W$ = class width.
The $j$th quartile class is the class with the smallest cumulative frequency greater than or equal to $\frac{j \cdot n}{4}$. It can
be located by counting $\frac{j \cdot n}{4}$ of the frequencies beginning from the lowest class.


II. Deciles: are values dividing the data into ten equal parts, denoted by $D_1, D_2, \ldots, D_9$. The fifth decile is the
median.
For ungrouped data
Let $D_j$ be the $j$th decile value for $j = 1, 2, \ldots, 9$. Then

$$D_j = \left[\frac{j}{10}(n + 1)\right]^{th} \text{ item}; \quad j = 1, 2, \ldots, 9$$

In dividing $j(n+1)$ by 10 there may be a remainder. Let $q$ be the quotient and $r$ be the remainder
of the division; then

$$D_j = q^{th} \text{ value} + \frac{r}{10}\left[(q + 1)^{th} \text{ value} - q^{th} \text{ value}\right]$$

For grouped data
We can apply the following formula:

$$D_j = L_{D_j} + \left(\frac{\frac{j \cdot n}{10} - F_{D_j}}{f_{D_j}}\right) W; \quad j = 1, 2, \ldots, 9$$

The symbols are defined in the same way as in the case of quartiles.
The $j$th decile class is the class with the smallest cumulative frequency greater than or equal to $\frac{j \cdot n}{10}$. It can
be located by counting $\frac{j \cdot n}{10}$ of the frequencies beginning from the lowest class.


III. Percentiles: are values which divide the data into one hundred equal parts, denoted by $P_1, P_2, \ldots, P_{99}$. The
fiftieth percentile is the median.
For ungrouped data
Let $P_j$ be the $j$th percentile value for $j = 1, 2, 3, \ldots, 99$. Then

$$P_j = \left[\frac{j}{100}(n + 1)\right]^{th} \text{ item}; \quad j = 1, 2, 3, \ldots, 99$$

In dividing $j(n+1)$ by 100 there may be a remainder. Let $q$ be the quotient and $r$ be the remainder
of the division; then

$$P_j = q^{th} \text{ value} + \frac{r}{100}\left[(q + 1)^{th} \text{ value} - q^{th} \text{ value}\right]$$


For grouped data
We can use the following formula:

$$P_j = L_{P_j} + \left(\frac{\frac{j \cdot n}{100} - F_{P_j}}{f_{P_j}}\right) W; \quad j = 1, 2, 3, \ldots, 99$$

The symbols are defined in the same way as in the case of quartiles.
The $j$th percentile class is the class with the smallest cumulative frequency greater than or equal to $\frac{j \cdot n}{100}$. It
can be located by counting $\frac{j \cdot n}{100}$ of the frequencies beginning from the lowest class.


Interpretations

1. $Q_j$ is the value below which $(j \times 25)$ percent of the observations in the series are found (where
$j = 1, 2, 3$). For instance, $Q_3$ means the value below which 75 percent of observations in the given series
are found.
2. $D_j$ is the value below which $(j \times 10)$ percent of the observations in the series are found (where
$j = 1, 2, \ldots, 9$). For instance, $D_4$ is the value below which 40 percent of the values are found in the series.
3. $P_j$ is the value below which $j$ percent of the total observations are found (where
$j = 1, 2, 3, \ldots, 99$). For example, 73 percent of the observations in a given series are below $P_{73}$.

Example 2: Calculate
i) the 7th decile
ii) the 90th percentile
for the following distribution.

Monthly per capita exp. classes   No. of families   C.f
140-150                           17                17
150-160                           29                46
160-170                           42                88
170-180                           72                160
180-190                           84                244
190-200                           107               351
200-210                           49                400
210-220                           34                434
220-230                           31                465
230-240                           16                481
240-250                           12                493


Solution
i) For $D_7$, using the grouped-data formula with $N = 493$:

$$\frac{7N}{10} = \frac{7 \times 493}{10} = 345.1$$

The smallest cumulative frequency greater than or equal to 345.1 is 351, hence the class 190-200 is the 7th decile class. Then

$$D_7 = 190 + \frac{345.1 - 244}{107} \times 10 \approx 199.45$$

ii) For $P_{90}$:

$$\frac{90N}{100} = \frac{90 \times 493}{100} = 443.70$$

The smallest cumulative frequency greater than or equal to 443.70 is 465, hence the 90th percentile class is
220-230. Then

$$P_{90} = 220 + \frac{443.70 - 434}{31} \times 10 = 220 + 3.13 = 223.13$$
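The grouped-data formulas for quartiles, deciles and percentiles share one structure, so a single function can cover all three. The sketch below is an illustration with hypothetical names; it locates the position as $j \cdot n / \text{parts}$, exactly as in the grouped-data formulas above. Locating the position with $j(n+1)/\text{parts}$ instead shifts the results slightly (for example, to about 199.5 for the 7th decile of this table).

```python
def grouped_quantile(lower_bounds, width, freqs, j, parts):
    """j-th quantile of a grouped distribution split into `parts` equal parts.

    parts = 4 gives quartiles, 10 gives deciles, 100 gives percentiles.
    """
    n = sum(freqs)
    target = j * n / parts
    cumulative = 0  # cumulative frequency before the current class
    for lower, f in zip(lower_bounds, freqs):
        if cumulative + f >= target:
            # First class whose cumulative frequency reaches the target
            return lower + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("target position exceeds the total frequency")


lowers = list(range(140, 250, 10))  # 140, 150, ..., 240
families = [17, 29, 42, 72, 84, 107, 49, 34, 31, 16, 12]

d7 = grouped_quantile(lowers, 10, families, j=7, parts=10)
p90 = grouped_quantile(lowers, 10, families, j=90, parts=100)
print(round(d7, 2), round(p90, 2))
```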