
Central Tendency & Variability

A S 12.4.3

Did you hear about the statistician who put her head in the oven and her feet in the refrigerator? She said, "On average, I feel just fine."

Which average?
All three averages (mode, median and mean) are useful for summarizing, for example, the distribution of household incomes.

In 1998, the income common to the greatest number of households (mode) was R25 000. Half the households earned less than R38 885 (median). The mean income was R50 600.

Reporting only one measure of central tendency might be misleading and perhaps reflect a bias.

The table shows the heights of 50 randomly chosen Grade 12 school girls. Complete the table and calculate the mean height of the girls.

height (cm)   midpoint (x)   frequency (f)   (f)(x)
150- <155     152,5          4               610,0
155- <160     157,5          7               1102,5
160- <165     162,5          18              2925,0
165- <170     167,5          11              1842,5
170- <175     172,5          6               1035,0
175- <180     177,5          4               710,0
Total                        50              8225,0

Mean height = 8225 / 50 = 164,5 cm
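The grouped-mean calculation can be checked with a few lines of Python. This is a minimal sketch: the function name `grouped_mean` is illustrative, and the midpoints and frequencies are copied from the height table.

```python
# Class midpoints (cm) and frequencies from the height table.
midpoints = [152.5, 157.5, 162.5, 167.5, 172.5, 177.5]
frequencies = [4, 7, 18, 11, 6, 4]


def grouped_mean(midpoints, frequencies):
    """Estimate the mean of grouped data as sum(f * x) / sum(f)."""
    total_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    return total_fx / sum(frequencies)


mean_height = grouped_mean(midpoints, frequencies)
print(mean_height)  # 164.5
```

Because every height in a class is replaced by the class midpoint, the result is an estimate of the mean, not the exact mean of the 50 raw heights.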

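height (cm)   midpoint   frequency (f)
150- <155     152,5      4
155- <160     157,5      7
160- <165     162,5      18
165- <170     167,5      11
170- <175     172,5      6
175- <180     177,5      4

Find the mode: the modal class is 160- <165 cm, since it has the highest frequency (18).

Calculate the median: with 50 girls, the median is the mean of the heights of the 25th and 26th girls. 11 girls are shorter than 160 cm and 21 are 165 cm or taller, so the 25th and 26th girls both fall in the 160- <165 class, whose midpoint is 162,5. The median is therefore estimated as 162,5 cm.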
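The modal class and the estimated median can be found the same way in code. The sketch below assumes the median is estimated by the midpoint of the class containing the middle value(s), as in the working above. The helper names are illustrative.

```python
# Class limits (cm) and frequencies from the height table.
classes = [(150, 155), (155, 160), (160, 165), (165, 170), (170, 175), (175, 180)]
frequencies = [4, 7, 18, 11, 6, 4]


def modal_class(classes, frequencies):
    """Return the class with the highest frequency."""
    return classes[frequencies.index(max(frequencies))]


def class_of_position(classes, frequencies, position):
    """Return the class holding the value at a 1-based position in the ordered data."""
    running = 0
    for cls, f in zip(classes, frequencies):
        running += f
        if position <= running:
            return cls
    raise ValueError("position exceeds the number of observations")


def grouped_median(classes, frequencies):
    """Estimate the median as the mean of the midpoints of the middle value(s)."""
    n = sum(frequencies)
    # For even n these are the two middle positions; for odd n they coincide.
    positions = ((n + 1) // 2, n // 2 + 1)
    mids = [sum(class_of_position(classes, frequencies, p)) / 2 for p in positions]
    return sum(mids) / 2


print(modal_class(classes, frequencies))     # (160, 165)
print(grouped_median(classes, frequencies))  # 162.5
```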

Measures of Variability
A measure of variability is a single summary figure that describes the spread of data within a distribution.
Range: the difference between the smallest and largest observations.
Percentiles: values below which p% of the observations fall.
Interquartile range (IQR): the range of the middle half of the scores, Q3 - Q1.

The results of a survey of the travelling time (in minutes) of 200 workers are as follows. Complete the cumulative frequency table.

Time (min)     Freq (f)   Cum freq
0 < x ≤ 10     28         28
10 < x ≤ 20    37         65
20 < x ≤ 30    55         120
30 < x ≤ 40    44         164
40 < x ≤ 50    22         186

Plot these numbers and draw a smooth cumulative frequency curve.

[Graph: cumulative frequency (0 to 200) against time in minutes (0 to 60), with Q1, Q2 and Q3 marked on the time axis]

Use the graph to estimate:
the median: Q2 = 27 (read off at a cumulative frequency of 100)
the interquartile range: Q3 = 37 (read off at 150); Q1 = 16 (read off at 50); IQR = 37 - 16 = 21
the no. of workers with ...: no. = 200 - 192 = 8 workers
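The cumulative frequencies and the quartile readings can be cross-checked numerically. The sketch below is an approximation: it joins the plotted points with straight lines rather than a smooth curve, so its answers differ slightly from the graph readings. The helper name `time_at` is illustrative, and only the five classes listed in the table are used, since all three quartile positions fall below 186.

```python
from itertools import accumulate

# Upper class boundaries (min) and frequencies from the travelling-time table.
upper_bounds = [10, 20, 30, 40, 50]
frequencies = [28, 37, 55, 44, 22]
total_workers = 200  # stated in the survey

cum_freq = list(accumulate(frequencies))  # [28, 65, 120, 164, 186]


def time_at(cum_target, upper_bounds, cum_freq, start=0):
    """Linearly interpolate the time at which the cumulative frequency reaches cum_target."""
    prev_bound, prev_cum = start, 0
    for bound, cum in zip(upper_bounds, cum_freq):
        if cum_target <= cum:
            fraction = (cum_target - prev_cum) / (cum - prev_cum)
            return prev_bound + fraction * (bound - prev_bound)
        prev_bound, prev_cum = bound, cum
    raise ValueError("target lies beyond the tabulated classes")


q1 = time_at(total_workers / 4, upper_bounds, cum_freq)      # position 50
q2 = time_at(total_workers / 2, upper_bounds, cum_freq)      # position 100
q3 = time_at(3 * total_workers / 4, upper_bounds, cum_freq)  # position 150

print(round(q1, 1), round(q2, 1), round(q3, 1), round(q3 - q1, 1))
```

The interpolated values are roughly 15,9; 26,4 and 36,8 minutes, giving an IQR of about 20,9. These agree closely with the readings of 16, 27 and 37 taken from the smooth curve.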

Five-number summary
The summary consists of the lowest data value, first quartile (Q1), median (Q2), third quartile (Q3) and highest data value.

Unathi sells the following number of computers in 12 months: 34, 47, 1, 15, 57, 24, 20, 11, 19, 50, 28, 37

Arrange the data in ascending order.
1, 11, 15, 19, 20, 24, 28, 34, 37, 47, 50, 57

Give a five-number summary of the sales.
Minimum = 1
Q1 = (15 + 19)/2 = 17
Median = Q2 = (24 + 28)/2 = 26
Q3 = (37 + 47)/2 = 42
Maximum = 57

5-number summary = 1; 17; 26; 42; 57
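The same summary can be produced in Python. This sketch assumes the quartile method used above: Q1 and Q3 are the medians of the lower and upper halves of the ordered data, with the middle value left out of both halves when the count is odd. Other quartile conventions exist and can give slightly different values. The function name is illustrative.

```python
from statistics import median


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves method."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]
    # Skip the middle value when n is odd so both halves have equal size.
    upper = values[half:] if n % 2 == 0 else values[half + 1:]
    return (values[0], median(lower), median(values), median(upper), values[-1])


sales = [34, 47, 1, 15, 57, 24, 20, 11, 19, 50, 28, 37]
print(five_number_summary(sales))  # (1, 17.0, 26.0, 42.0, 57)
```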

Barry also sells computers during a 12 month period. Below is a 5-number summary for each person.

          min   Q1   Q2   Q3   max
Unathi    1     17   26   42   57
Barry     6     15   32   46   62

Which person would you most likely want to appoint for your company? Barry.
Explain. Barry's highest and lowest sales are higher than Unathi's corresponding sales, and Barry's median sales figure is also higher than Unathi's.

Percentiles
A percentile is a score below which a certain percentage of values fall. There are 100 percentiles in a sample.

e.g. If your test score is in the 95th percentile, it means that if 1000 students took the test, at least 950 students did worse than you and at most 49 students did better than you. It does not mean that you received 95% for the test.

Oscar's height is at the 90th percentile and his weight is at the 60th percentile for his age. Describe Oscar's physical build in general terms.
He is taller than 90% of the people but only weighs more than 60% of the people: possibly tall and thin.

In 2004 the snow depth at Tiffendell was measured (in mm) for 25 days and recorded.
242, 228, 217, 209, 253, 239, 266, 242, 251, 240, 223, 219, 246, 260, 258, 225, 234, 230, 249, 245, 254, 243, 235, 231, 257.

Complete the frequency table, cumulative frequency table and cum % table.

Depth (mm)   freq   cum freq   cum %
200-210      1      1          4
210-220      2      3          12
220-230      3      6          24
230-240      5      11         44
240-250      7      18         72
250-260      5      23         92
260-270      2      25         100
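The three columns can be rebuilt from the raw readings. This is a minimal sketch that assumes each class includes its lower boundary and excludes its upper one (for example, 240 mm is counted in 240-250). That convention reproduces the frequencies in the table.

```python
from itertools import accumulate

# Snow depth readings (mm) for the 25 days in 2004.
depths = [242, 228, 217, 209, 253, 239, 266, 242, 251, 240, 223, 219, 246,
          260, 258, 225, 234, 230, 249, 245, 254, 243, 235, 231, 257]

lower_bounds = list(range(200, 270, 10))  # 200, 210, ..., 260

# Count the readings with lower <= depth < lower + 10 for each class.
freq = [sum(1 for d in depths if lo <= d < lo + 10) for lo in lower_bounds]
cum_freq = list(accumulate(freq))
cum_pct = [100 * c / len(depths) for c in cum_freq]

for lo, f, c, p in zip(lower_bounds, freq, cum_freq, cum_pct):
    print(f"{lo}-{lo + 10}: freq={f}, cum freq={c}, cum %={p:.0f}")
```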

Plot the graph of snow depth against the cumulative frequency and cumulative % using two different vertical axes.

[Graph: cumulative frequency (left axis, 0 to 25) and cumulative % (right axis, 0 to 100, giving the percentiles) against depth (mm), 200 to 270]

Estimate the 80th percentile. 252mm

For how many days was the depth at least 250mm? 25 - 18 = 7 days
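Both graph readings can be approximated from the cumulative frequencies of the snow-depth data (1, 3, 6, 11, 18, 23, 25 at the upper class boundaries). The sketch below joins the points with straight lines, which is an assumption: a hand-drawn smooth curve gives 252mm for the 80th percentile, while straight-line interpolation gives 254mm. The function names are illustrative.

```python
# Class boundaries (mm) and the number of days with depth below each boundary.
boundaries = [200, 210, 220, 230, 240, 250, 260, 270]
cum_freq = [0, 1, 3, 6, 11, 18, 23, 25]


def percentile_depth(p, boundaries, cum_freq):
    """Linearly interpolate the depth below which p% of the days fall."""
    target = p * cum_freq[-1] / 100
    for i in range(1, len(boundaries)):
        if target <= cum_freq[i]:
            span = cum_freq[i] - cum_freq[i - 1]
            fraction = (target - cum_freq[i - 1]) / span
            return boundaries[i - 1] + fraction * (boundaries[i] - boundaries[i - 1])
    raise ValueError("percentile must be between 0 and 100")


def days_at_least(depth, boundaries, cum_freq):
    """Count the days with depth at or above a class boundary."""
    return cum_freq[-1] - cum_freq[boundaries.index(depth)]


print(percentile_depth(80, boundaries, cum_freq))
print(days_at_least(250, boundaries, cum_freq))  # 7
```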

A year later (2005) the depth is shown by the broken line. Explain which year had possibly better ...
2004 - depth greater on more days. E.g. 44% below 240mm compared to 80% below 240mm in 2005.

Standard Deviation
Standard deviation is useful when comparing the spread of two or more data sets that have approximately the same mean. This technique is best used with symmetric distributions with no outliers. The smaller the standard deviation, the narrower the spread of measurements around the mean, as the data set has few very high or very low values.
e.g. If the mean of a data set is 5 and the S.D. is 2, then the data typically lie within one standard deviation of the mean, i.e. between 3 and 7.

Super Crisps come in 25g bags. There are two machines (A & B) producing the chips. A quality control engineer weighs a sample of 10 bags from each machine.

A   25,6  24,8  25,7  25,3  25,5  25    24,9  25,7  25,5  25,6
B   25,3  25,4  24,9  25,3  25,3  25,3  25,4  25,4  25,4  25,3

Calculate the mean of each machine.
For A: mean = 253,6 / 10 = 25,36g
For B: mean = 253 / 10 = 25,3g
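The means, and the spread of each sample, can be checked with a short calculation. This is a minimal sketch assuming the population formulas: the variance is the sum of squared deviations from the mean divided by n = 10, and the standard deviation is its square root. The function name `spread` is illustrative.

```python
from math import sqrt

# Sample bag masses (g) for each machine.
machine_a = [25.6, 24.8, 25.7, 25.3, 25.5, 25.0, 24.9, 25.7, 25.5, 25.6]
machine_b = [25.3, 25.4, 24.9, 25.3, 25.3, 25.3, 25.4, 25.4, 25.4, 25.3]


def spread(sample):
    """Return (mean, sum of squared deviations, population variance, population S.D.)."""
    n = len(sample)
    mean = sum(sample) / n
    sum_sq = sum((x - mean) ** 2 for x in sample)
    variance = sum_sq / n  # assumption: divide by n, not n - 1
    return mean, sum_sq, variance, sqrt(variance)


for name, sample in (("A", machine_a), ("B", machine_b)):
    mean, sum_sq, variance, sd = spread(sample)
    print(f"{name}: mean={mean:.2f} g, sum of squares={sum_sq:.3f}, "
          f"variance={variance:.4f}, S.D.={sd:.2f}")
```

Under these assumptions the sums of squared deviations are about 1,044 for A and 0,2 for B. Dividing by 10 gives variances of about 0,1044 and 0,02, and standard deviations of about 0,32g and 0,14g.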

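Below is the variance and standard deviation for each machine.

     Mean     Sum of squared deviations   Variance   S.D.
A    25,36g   1,044                       0,1044     0,32
B    25,3g    0,2                         0,02       0,14

The variance is the sum of squared deviations from the mean divided by the number of bags (10), and the standard deviation is its square root.

Super Crisps will be taken to court if it is found their bags are less than 25g. Which machine gives the best chance of avoiding this fate? Machine B.
Explain. A: masses of chips typically lie from 25,04g to 25,68g (within one S.D. of the mean). B: masses of chips typically lie from 25,16g to 25,44g. B has a narrower spread than A. Its smallest value is very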
