Central Tendency

Brief lecture notes
Measures of Central Tendency (Location)
Central Tendency or measures of central location

In a representative sample, the values of a data series tend to cluster
around a certain point, usually at the center of the series. This tendency of
the values to cluster around the center of the series is called central
tendency, and its numerical measures are called the measures of central
location. A measure of central location is a single value that represents a
set of data; it pinpoints the center of the values.
Characteristics of Ideal measures of Location

It should be –
Rigidly defined
Readily comprehensible and easy to calculate
Based upon all the observations.
Suitable for further algebraic treatments
Affected as little as possible by sampling fluctuations.
Different Measures of Central Location

There are five different measures of central location:
Arithmetic mean or Mean
Geometric mean
Harmonic mean
Median
Mode
The means (arithmetic, geometric and harmonic) are often referred to simply as
the mean or average, for both ungrouped and grouped data.
For Ungrouped data or raw data
The arithmetic mean or mean of a series of observations is equal to the
sum of the observations divided by their number. So, for ungrouped data,
the population mean is

$$\text{Population mean} = \frac{\text{Sum of all the values in the population}}{\text{Number of values in the population}}$$

If there are N observations in the population data set, the mean is
calculated as:

$$\mu = \frac{X_1 + X_2 + \cdots + X_N}{N} = \frac{\sum_{i=1}^{N} X_i}{N}$$
Parameter: Any measure based on population data is called a parameter.
For raw, i.e. ungrouped, data, the mean for a sample is

$$\text{Sample mean} = \frac{\text{Sum of all the values in the sample}}{\text{Number of values in the sample}}$$

$$\bar{x} = \frac{\sum x}{n}$$
Statistic: Any measure based on sample data is called a statistic.
Table-1 gives the 2002 total payrolls of five Major League Baseball
(MLB) teams. Find the mean of the 2002 payrolls of these five MLB
teams.

Table-1
MLB Team                2002 Total Payroll (millions of dollars)
Anaheim Angels           62
Atlanta Braves           93
New York Yankees        126
St. Louis Cardinals      75
Tampa Bay Devil Rays     34

Solution:

$$\bar{x} = \frac{\sum x}{n} = \frac{62 + 93 + 126 + 75 + 34}{5} = \frac{390}{5} = \$78 \text{ million}$$
Thus, the mean 2002 payroll of these five MLB teams was $78 million.
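The arithmetic above can be checked with a short Python sketch. The variable names are illustrative choices, not part of the notes; the data are the five payrolls from Table-1.

```python
from statistics import mean

# 2002 total payrolls (millions of dollars) from Table-1.
payrolls = {
    "Anaheim Angels": 62,
    "Atlanta Braves": 93,
    "New York Yankees": 126,
    "St. Louis Cardinals": 75,
    "Tampa Bay Devil Rays": 34,
}

values = list(payrolls.values())
total = sum(values)          # sum of the observations
n = len(values)              # number of observations
sample_mean = total / n      # x-bar = (sum of x) / n

# The standard-library helper applies the same definition.
assert sample_mean == mean(values)

print(f"sum = {total}, n = {n}, mean = {sample_mean} million dollars")
```

The same two-step pattern (add the values, divide by the count) applies unchanged to a population mean; only the notation differs.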
Problem: The following are the ages of all eight employees of a small
company: 53 32 61 27 39 44 49 57
Find the mean (population) age of these employees.
Solution:

$$\mu = \frac{\sum X}{N} = \frac{362}{8} = 45.25 \text{ years}$$
Thus, the mean age of all eight employees of this company is 45.25 years,
or 45 years and 3 months.
Mean for Grouped data or Frequency distribution
When the data are arranged or given in the form of a frequency distribution,
i.e. there are K variate values such that a value $X_i$ (the mid-value of a
class) has a frequency $f_i$ ($i = 1, 2, \ldots, K$), the formula for the mean is

$$\text{Mean} = \frac{f_1 X_1 + f_2 X_2 + \cdots + f_K X_K}{f_1 + f_2 + \cdots + f_K} = \frac{\sum_{i=1}^{K} f_i X_i}{\sum_{i=1}^{K} f_i} = \frac{\sum_{i=1}^{K} f_i X_i}{N}, \qquad N = f_1 + f_2 + \cdots + f_K$$
Problem: Calculate the mean of the data on days to maturity for 40 short-term
investments.

Class Interval       30-39  40-49  50-59  60-69  70-79  80-89  90-99
No. of Investments     3      1      8     10      7      7      4

Solution:
Calculation of the mean from the data on days to maturity for 40 short-term
investments:

Class interval   Midpoint (x_i)   Frequency (f_i)   f_i x_i
30-39                34.5               3            103.5
40-49                44.5               1             44.5
50-59                54.5               8            436.0
60-69                64.5              10            645.0
70-79                74.5               7            521.5
80-89                84.5               7            591.5
90-99                94.5               4            378.0
Total                                  40           2720.0

$$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{n} = \frac{2720}{40} = 68.00$$
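The tabular calculation can also be sketched in Python. The function name `grouped_mean` is a hypothetical helper introduced here for illustration; it takes each class midpoint as the average of the class limits, as in the solution table.

```python
# Class limits and frequencies from the days-to-maturity table.
classes = [(30, 39), (40, 49), (50, 59), (60, 69), (70, 79), (80, 89), (90, 99)]
frequencies = [3, 1, 8, 10, 7, 7, 4]


def grouped_mean(classes, frequencies):
    """Mean of a frequency distribution: sum(f_i * x_i) / sum(f_i)."""
    # Midpoint x_i of each class, e.g. (30 + 39) / 2 = 34.5.
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    weighted_sum = sum(f * x for f, x in zip(frequencies, midpoints))
    return weighted_sum / sum(frequencies)


mean_days = grouped_mean(classes, frequencies)
print(mean_days)
```

Because every observation in a class is replaced by the class midpoint, the result is an approximation to the mean of the raw data, which the table does not provide.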
Outliers or extreme values

Values that are very small or very large relative to the majority of the
values in a data set are called outliers or extreme values.
Example
Table 3.2 lists the 2000 populations (in thousands) of the five Pacific
states.

Table 3.2
State         Population (thousands)
Washington      5,894
Oregon          3,421
Alaska            627
Hawaii          1,212
California     33,872   (an outlier)

Notice that the population of California is very large compared to the
populations of the other four states. Hence, it is an outlier.
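To see how strongly one outlier can pull the mean, the following Python sketch compares the mean of the five state populations with and without California. The helper name `arithmetic_mean` is an illustrative choice; the figures come from Table 3.2.

```python
# 2000 populations (in thousands) of the five Pacific states, Table 3.2.
populations = {
    "Washington": 5894,
    "Oregon": 3421,
    "Alaska": 627,
    "Hawaii": 1212,
    "California": 33872,  # the outlier
}


def arithmetic_mean(values):
    """Sum of the values divided by their number."""
    values = list(values)
    return sum(values) / len(values)


mean_all = arithmetic_mean(populations.values())
mean_without_outlier = arithmetic_mean(
    value for state, value in populations.items() if state != "California"
)

print(f"mean with California:    {mean_all:.1f} thousand")
print(f"mean without California: {mean_without_outlier:.1f} thousand")
```

With California included, the mean is larger than the population of each of the other four states, so it no longer describes a typical state in this group.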
Merits or advantage of mean

Rigidly defined
Readily comprehensible and easy to calculate
Based upon all the observations.
Suitable for further algebraic treatments
Affected as little as possible by sampling fluctuations.
Limitations or disadvantages:
The mean is excessively affected by extreme values.
It can be unrealistic, for example a fractional mean for count data.
It may lead to a false conclusion.
Despite these limitations, the mean is regarded as the ideal measure of
location. It is used in social, economic and business problems.
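Median for ungrouped data:
The median is the value of the middle term in a data set that has been
ranked in increasing order.
Let $X_1, X_2, \ldots, X_N$ be N ordered observations. Then the
$\frac{N+1}{2}$th observation is the median. When N is even, this position
falls halfway between two observations, and the median is the mean of those
two values.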
Example
The following data give the weight lost (in pounds) by a sample of five
members of a health club at the end of two months of membership:
10 5 19 8 3
Find the median.
Solution
First, we rank the given data in increasing order as follows:
3 5 8 10 19
There are five observations in the data set. Consequently, N = 5, an odd
number, and

$$\text{Position of the median} = \frac{N+1}{2} = \frac{5+1}{2} = 3$$

Therefore, the median is the value of the third term in the ranked data,
which is 8.
The median weight loss for this sample of five members of this health club
is 8 pounds.
Table-Me lists the total revenue for the 12 top-grossing North American
concert tours of all time. Find the median revenue for these data.

Table-Me
Tour                       Artist                   Total Revenue (millions of dollars)
Steel Wheels, 1989         The Rolling Stones        98.0
Magic Summer, 1990         New Kids on the Block     74.1
Voodoo Lounge, 1994        The Rolling Stones       121.2
The Division Bell, 1994    Pink Floyd               103.5
Hell Freezes Over, 1994    The Eagles                79.4
Bridges to Babylon, 1997   The Rolling Stones        89.3
Popmart, 1997              U2                        79.9
Twenty-Four Seven, 2000    Tina Turner               80.2
No Strings Attached, 2000  ‘N-Sync                   76.4
Elevation, 2001            U2                       109.7
Popodyssey, 2001           ‘N-Sync                   86.8
Black and Blue, 2001       The Backstreet Boys       82.1

Solution
First we rank the given data in increasing order, as follows:
74.1 76.4 79.4 79.9 80.2 82.1 86.8 89.3 98.0 103.5 109.7 121.2
There are 12 values in this data set. Hence, N = 12 and

$$\text{Position of the median} = \frac{N+1}{2} = \frac{12+1}{2} = \frac{13}{2} = 6.5$$

Therefore, the median is given by the mean of the sixth and the seventh
values in the ranked data, 82.1 and 86.8:

$$\text{Median} = \frac{82.1 + 86.8}{2} = 84.45$$
Thus the median revenue for the 12 top-grossing North American concert
tours of all time is $84.45 million.
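Both median examples follow the same rule, which can be sketched in Python as below. The function `median_of` is a hypothetical helper written for these notes; it ranks the data and then picks the middle term (odd N) or averages the two middle terms (even N).

```python
def median_of(values):
    """Median via the (N + 1) / 2 position rule on the ranked data."""
    ranked = sorted(values)
    n = len(ranked)
    middle = n // 2
    if n % 2 == 1:
        # Odd N: 1-based position (N + 1) / 2 is 0-based index N // 2.
        return ranked[middle]
    # Even N: the position falls between two terms, so average them.
    return (ranked[middle - 1] + ranked[middle]) / 2


# Weight lost (pounds) by five health club members.
weight_loss = [10, 5, 19, 8, 3]

# Total revenue (millions of dollars) of the 12 concert tours in Table-Me.
revenues = [98.0, 74.1, 121.2, 103.5, 79.4, 89.3,
            79.9, 80.2, 76.4, 109.7, 86.8, 82.1]

print(median_of(weight_loss))            # third ranked value
print(round(median_of(revenues), 2))     # mean of sixth and seventh values
```

Sorting inside the function means the caller can pass the data in any order, which matches the first step of each worked solution.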
For Grouped Data
For grouped data, the median is

$$\text{Median} = L + \frac{\frac{N}{2} - cf}{f_m} \times C$$

where
L = Lower limit of the median class
N = Total number of observations
cf = Cumulative frequency of the class just preceding the median class
f_m = Frequency of the median class
C = Class interval (width) of the median class
Example: Calculate the median from the following table.

Marks     No. of Students (f)
10-25        6
25-40       20
40-55       44
55-70       26
70-85        3
85-100       1

Solution:

Marks (x)   No. of Students (f)    cf
10-25          6                    6
25-40         20                   26
40-55         44                   70
55-70         26                   96
70-85          3                   99
85-100         1                  100

The median item is N/2 = 100/2 = 50, so the median lies in the class 40-55,
the first class whose cumulative frequency reaches 50.
Therefore, with L = 40, N/2 = 50, cf = 26, f_m = 44 and C = 15,

$$\text{Median} = L + \frac{\frac{N}{2} - cf}{f_m} \times C = 40 + \frac{50 - 26}{44} \times 15 = 40 + 8.18 = 48.18$$

Therefore, the median mark is 48.18.
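The steps of this solution (accumulate frequencies, locate the median class, interpolate) can be sketched in Python. The function `grouped_median` and its parameter names are illustrative assumptions; it also assumes all classes share one width, which holds for this table.

```python
def grouped_median(lower_limits, width, frequencies):
    """Median = L + ((N/2 - cf) / f_m) * C for equal-width classes."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("the distribution has no observations")
    half = n / 2
    cumulative = 0  # cf: cumulative frequency before the current class
    for lower, f in zip(lower_limits, frequencies):
        if cumulative + f >= half:
            # This is the median class: interpolate inside it.
            return lower + (half - cumulative) / f * width
        cumulative += f


# Marks table: classes 10-25, 25-40, ..., 85-100 (width 15).
lower_limits = [10, 25, 40, 55, 70, 85]
frequencies = [6, 20, 44, 26, 3, 1]

median_marks = grouped_median(lower_limits, 15, frequencies)
print(round(median_marks, 2))
```

The interpolation treats the 44 students in the median class as spread evenly between 40 and 55, which is the assumption behind the formula itself.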
Mode
The mode is the value that occurs with the highest frequency in a data set.
The following data give the speeds (in miles per hour) of eight cars that
were stopped on I-95 for speeding violations.
77 69 74 81 71 68 74 73
Find the mode.
In this data set, 74 occurs twice and each of the remaining values occurs
only once. Because 74 occurs with the highest frequency, it is the mode.
Therefore,
Mode = 74 miles per hour
A data set may have no mode, one mode, or many modes, whereas it will have
only one mean and only one median.
The data set with only one mode is called unimodal.
The data set with two modes is called bimodal.
The data set with more than two modes is called multimodal.
Example
Last year's incomes of five randomly selected families were $36,150,
$95,750, $54,985, $77,490, and $23,740.
Find the mode.
Because each value in this data set occurs only once, this data set
contains no mode.
Example
The prices of the same brand of television set at eight stores are found
to be $495, $486, $503, $495, $470, $505, $470 and $499.
Find the mode.
In this data set, each of the two values $495 and $470 occurs twice and
each of the remaining values occurs only once.
Therefore, this data set has two modes: $495 and $470.
One advantage of the mode is that it can be calculated for both kinds
of data, quantitative and qualitative, whereas the mean and median
can be calculated only for quantitative data.
Example
The statuses of five students who are members of the student senate at
a college are senior, sophomore, senior, junior, and senior.
Find the mode.
Because senior occurs more frequently than the other categories, it is
the mode for this data set.
We cannot calculate the mean and median for this data set.
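The four mode examples above can be reproduced with one small Python function. The name `modes` is a hypothetical helper; it follows the convention used in these notes that a data set in which every value occurs only once has no mode, and it expects a non-empty data set.

```python
from collections import Counter


def modes(values):
    """Return the list of modes; an empty list means 'no mode'."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        # Every value occurs only once, so the data set has no mode.
        return []
    return [value for value, count in counts.items() if count == highest]


speeds = [77, 69, 74, 81, 71, 68, 74, 73]
incomes = [36150, 95750, 54985, 77490, 23740]
prices = [495, 486, 503, 495, 470, 505, 470, 499]
statuses = ["senior", "sophomore", "senior", "junior", "senior"]

print(modes(speeds))     # unimodal
print(modes(incomes))    # no mode
print(modes(prices))     # bimodal
print(modes(statuses))   # works for qualitative data too
```

Counting occurrences needs only equality between values, not arithmetic, which is why the same function handles the qualitative `statuses` list.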
For grouped Data
For a grouped frequency distribution, the mode is given by:

$$M_0 = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times C$$

where
L = the lower limit of the modal class (the modal class is the class for
which the frequency is maximum)
$\Delta_1$ = the difference between the frequency of the modal class and that
of the pre-modal class
$\Delta_2$ = the difference between the frequency of the modal class and that
of the post-modal class
C = the length of the modal class
Example
Calculate the mode from the following table.

Marks     No. of Students (f)
10-25        6
25-40       20
40-55       44
55-70       26
70-85        3
85-100       1

Solution
Here the modal class is 40-55 because its frequency, 44, is the maximum.
Therefore, with L = 40, $\Delta_1$ = 44 - 20 = 24, $\Delta_2$ = 44 - 26 = 18
and C = 55 - 40 = 15,

$$M_0 = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times C = 40 + \frac{24}{24 + 18} \times 15 = 40 + \frac{24}{42} \times 15 \approx 48.57$$
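As a check on the grouped-mode formula, here is a minimal Python sketch applied to the marks table, where the modal class 40-55 has frequency 44, its neighbours have frequencies 20 and 26, and the class length is 55 - 40 = 15. The function `grouped_mode` is a hypothetical helper; treating a missing neighbouring class as frequency 0 is an assumption made here, not something stated in the notes.

```python
def grouped_mode(lower_limits, width, frequencies):
    """Mode = L + (d1 / (d1 + d2)) * C for equal-width classes."""
    m = frequencies.index(max(frequencies))  # index of the modal class
    f_modal = frequencies[m]
    # Assumption: a missing neighbouring class counts as frequency 0.
    f_pre = frequencies[m - 1] if m > 0 else 0
    f_post = frequencies[m + 1] if m < len(frequencies) - 1 else 0
    delta1 = f_modal - f_pre    # modal minus pre-modal frequency
    delta2 = f_modal - f_post   # modal minus post-modal frequency
    return lower_limits[m] + delta1 / (delta1 + delta2) * width


# Marks table: classes 10-25, 25-40, ..., 85-100 (width 15).
lower_limits = [10, 25, 40, 55, 70, 85]
frequencies = [6, 20, 44, 26, 3, 1]

mode_marks = grouped_mode(lower_limits, 15, frequencies)
print(round(mode_marks, 2))
```

Here delta1 = 44 - 20 = 24 and delta2 = 44 - 26 = 18, so the mode is 40 + (24 / 42) × 15. The sketch does not handle the case where the modal class and both neighbours have equal frequencies, since the denominator would then be zero.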
 The relationships among mean, median and mode.
[Figure: Mean, median, and mode for a symmetric histogram and frequency
curve.]
For a symmetric distribution with a single peak, the mean, median and mode
have the same value. For a right-skewed distribution, the mean is the largest
and the mode is the smallest; for a left-skewed distribution, the mode is the
largest and the mean is the smallest.
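These orderings can be illustrated with a short Python sketch. The three small data sets below are made-up values chosen only to show the pattern; they do not come from the notes.

```python
from statistics import mean, median, mode

# Hypothetical illustrative data sets (not from the notes).
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]     # long tail to the right
left_skewed = [1, 6, 10, 11, 12, 12, 13, 13, 13, 14]  # long tail to the left

for name, data in [("symmetric", symmetric),
                   ("right skewed", right_skewed),
                   ("left skewed", left_skewed)]:
    print(f"{name:13s} mean={mean(data):5.2f} "
          f"median={median(data):5.2f} mode={mode(data)}")
```

In the right-skewed set the few large values pull the mean above the median, while the mode stays at the peak; the left-skewed set is its mirror image, so the order reverses.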
Assignment on Central Tendency
1. Calculate the mean, median and mode for the following frequency
distribution and comment on the results.
Monthly Income No. of clerks
300-325 5
325-350 17
350-375 80
375-400 227
400-425 326
425-450 248
450-475 88
475-500 9
2. Calculate the mean, median and mode for the data given below:
Daily Earnings (Tk.) No. of Persons
50-53 3
53-56 8
56-59 14
59-62 30
62-65 36
65-68 28
68-71 16
71-74 10
74-77 5
3. Calculate the mean of the data on days to maturity for 40 short-term
investments.
Class Interval 30-39 40-49 50-59 60-69 70-79 80-89 90-99
No. of Investments 3 1 8 10 7 7 4
4. Calculate the median and mode from the following table.
Marks No. of Students
10-25 6
25-40 20
40-55 44
55-70 26
70-85 3
85-100 1