Data Presentation and Descriptive Statistics

A manufacturer is investigating the operating life of laptop computer batteries. The following
data are available.
Life (min.) Life (min.) Life (min.) Life (min.)
130 145 126 146
164 130 132 152
145 129 133 155
140 127 139 137
131 126 145 148
125 132 126 126
126 135 131 129
147 136 129 136
156 146 130 146
132 142 132 132

Using the first two digits of each value as the stem and the last digit as the leaf, we may develop the following plot:
Stem | Leaves                          | Freq.
12   | 5 6 9 7 6 6 6 9 6 9             | 10
13   | 0 1 2 0 2 5 6 2 3 9 1 0 2 7 6 2 | 16
14   | 5 0 7 5 6 2 5 6 8 6             | 10
15   | 6 2 5                           | 3
16   | 4                               | 1
Stem-and-leaf plot
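
The construction can be sketched in a few lines of Python. This is only an illustration, not part of the original analysis: the function names `stem_and_leaf` and `render` are hypothetical, and the data are assumed to be read down each column of the table, which is the order that reproduces the leaves as listed.

```python
from collections import defaultdict

# Battery 1 lives in minutes, read down each column of the data table
# (assumed reading order; it reproduces the unordered leaves).
battery1 = [
    130, 164, 145, 140, 131, 125, 126, 147, 156, 132,
    145, 130, 129, 127, 126, 132, 135, 136, 146, 142,
    126, 132, 133, 139, 145, 126, 131, 129, 130, 132,
    146, 152, 155, 137, 148, 126, 129, 136, 146, 132,
]


def stem_and_leaf(values, ordered=False):
    """Map each stem (value // 10) to its list of leaves (value % 10)."""
    plot = defaultdict(list)
    for v in values:
        plot[v // 10].append(v % 10)
    if ordered:
        for leaves in plot.values():
            leaves.sort()
    # Return the stems in increasing order.
    return dict(sorted(plot.items()))


def render(plot):
    """Format the plot as 'stem | leaves | frequency' lines."""
    lines = []
    for stem, leaves in plot.items():
        lines.append(f"{stem} | {' '.join(map(str, leaves))} | {len(leaves)}")
    return "\n".join(lines)


print(render(stem_and_leaf(battery1)))
```

Passing `ordered=True` sorts the leaves within each stem, which makes order statistics such as the median easier to read off.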

The plot shows that most of the data are clustered around 130, with only four data
points exceeding 150. One may conclude that the center of the data is somewhere
in the 130s. Variation is harder to judge: at this stage, whether the variability is
high or low can only be determined on a comparative basis. If another data set is
available (perhaps for another brand), a back-to-back stem-and-leaf plot could be
used to compare the variability of the two sets visually.

By ordering the leaves, we get the following plot:

Stem | Leaves                          | Freq.
12   | 5 6 6 6 6 6 7 9 9 9             | 10
13   | 0 0 0 1 1 2 2 2 2 2 3 5 6 6 7 9 | 16
14   | 0 2 5 5 5 6 6 6 7 8             | 10
15   | 2 5 6                           | 3
16   | 4                               | 1
Ordered stem-and-leaf plot

From the plot above, we may determine many measures of dispersion and central
tendency:

Minimum = 125, Maximum = 164, Range = 164 – 125 = 39.

Mode = 126 and 132 (each occurs 5 times; the data are bimodal).

$$\text{Median}(\tilde{x}) = \frac{x_{[20]} + x_{[21]}}{2} = \frac{132 + 133}{2} = 132.5.$$
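
As a cross-check, the measures read off the ordered plot can be recomputed from the raw values. The sketch below uses only Python's standard `statistics` module; the variable names are illustrative.

```python
import statistics

# Battery 1 lives in minutes (order is irrelevant for these measures).
battery1 = [
    130, 164, 145, 140, 131, 125, 126, 147, 156, 132,
    145, 130, 129, 127, 126, 132, 135, 136, 146, 142,
    126, 132, 133, 139, 145, 126, 131, 129, 130, 132,
    146, 152, 155, 137, 148, 126, 129, 136, 146, 132,
]

ordered = sorted(battery1)
n = len(ordered)

minimum, maximum = ordered[0], ordered[-1]
data_range = maximum - minimum

# multimode returns every value that attains the highest frequency.
modes = sorted(statistics.multimode(battery1))

# n is even, so the median is the mean of the 20th and 21st ordered values
# (indices 19 and 20 with zero-based indexing).
median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print(minimum, maximum, data_range, modes, median)
```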
Other measures require some calculations:
$$\text{Average}(\bar{x}) = \frac{\sum_{i=1}^{40} x_i}{40} = \frac{130 + 164 + \dots + 146 + 132}{40} = 136.85.$$

These results confirm our initial conclusion that the center is in the 130s.
$$\text{Variance}(S^2) = \frac{\sum_{i=1}^{40} (x_i - \bar{x})^2}{39} = \frac{(130 - 136.85)^2 + \dots + (132 - 136.85)^2}{39} = 95.87.$$

$$s = \sqrt{S^2} = 9.79.$$

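
The average, sample variance, and standard deviation can be verified numerically. This is a minimal sketch that applies the formulas directly, with the divisor n - 1 = 39 for the variance, and compares the result with the standard library's `statistics.variance`; the variable names are illustrative.

```python
import math
import statistics

# Battery 1 lives in minutes.
battery1 = [
    130, 164, 145, 140, 131, 125, 126, 147, 156, 132,
    145, 130, 129, 127, 126, 132, 135, 136, 146, 142,
    126, 132, 133, 139, 145, 126, 131, 129, 130, 132,
    146, 152, 155, 137, 148, 126, 129, 136, 146, 132,
]

n = len(battery1)
mean = sum(battery1) / n

# Sum of squared deviations from the mean, divided by n - 1 (sample variance).
sum_sq_dev = sum((x - mean) ** 2 for x in battery1)
variance = sum_sq_dev / (n - 1)
std_dev = math.sqrt(variance)

print(f"mean = {mean:.2f}, variance = {variance:.2f}, s = {std_dev:.2f}")

# The library routine uses the same n - 1 divisor.
assert math.isclose(variance, statistics.variance(battery1))
```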
Now, let us assume that another data set of 40 points is available for another brand of batteries
(Battery 2).

Life (min.) Life (min.) Life (min.) Life (min.)
134 130 140 151
143 134 136 144
150 135 160 141
143 140 138 141
148 146 140 146
151 138 151 139
151 128 146 147
152 142 144 134
142 146 142 136
122 134 145 147

The measures of center and dispersion for Battery 2 are:

Minimum = 122, Maximum = 160, Range = 160 – 122 = 38.


Mode = 134, 146, and 151 (each occurs 4 times; the data are multimodal).
$$\text{Median}(\tilde{x}) = \frac{x_{[20]} + x_{[21]}}{2} = \frac{142 + 142}{2} = 142.$$
$$\text{Average}(\bar{x}) = \frac{\sum_{i=1}^{40} x_i}{40} = \frac{5677}{40} \approx 141.93 \approx 142.$$

The data are approximately symmetric (average ≈ median).

$$\text{Variance}(S^2) = \frac{\sum_{i=1}^{40} (x_i - \bar{x})^2}{39} = 55.2.$$

$$s = \sqrt{S^2} = 7.43.$$
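
The same measures can be recomputed for Battery 2 directly from the 40 values as listed in the table. The helper `summarize` below is a hypothetical name introduced for this sketch; it uses only the standard library.

```python
import math
import statistics

# Battery 2 lives in minutes, read row by row from the table.
battery2 = [
    134, 130, 140, 151,
    143, 134, 136, 144,
    150, 135, 160, 141,
    143, 140, 138, 141,
    148, 146, 140, 146,
    151, 138, 151, 139,
    151, 128, 146, 147,
    152, 142, 144, 134,
    142, 146, 142, 136,
    122, 134, 145, 147,
]


def summarize(values):
    """Return the descriptive measures used in the text as a dictionary."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance: squared deviations divided by n - 1.
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return {
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "modes": sorted(statistics.multimode(values)),
        "median": statistics.median(values),
        "mean": mean,
        "variance": variance,
        "std_dev": math.sqrt(variance),
    }


summary = summarize(battery2)
for name, value in summary.items():
    print(f"{name}: {value}")
```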

These results show numerically that Battery 2 has a higher average life with slightly less
variation. An easy way to graphically compare the two sets is to develop a back-to-back stem-
and-leaf plot.

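Freq. | Battery 2 leaves                        | Stem | Battery 1 leaves                | Freq.
2     |                                     8 2 | 12   | 5 6 6 6 6 6 7 9 9 9             | 10
11    |                   9 8 8 6 6 5 4 4 4 4 0 | 13   | 0 0 0 1 1 2 2 2 2 2 3 5 6 6 7 9 | 16
20    | 8 7 7 6 6 6 6 5 4 4 3 3 2 2 2 1 1 0 0 0 | 14   | 0 2 5 5 5 6 6 6 7 8             | 10
6     |                             2 1 1 1 1 0 | 15   | 2 5 6                           | 3
1     |                                       0 | 16   | 4                               | 1
Back-to-Back Stem-and-Leaf Plot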
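
A back-to-back plot shares one column of stems, with one data set's leaves growing to the left and the other's to the right. The sketch below shows one way to build it; the function name `back_to_back` and the output layout are assumptions made for illustration.

```python
# Battery lives in minutes; order does not matter because leaves are sorted.
battery1 = [
    130, 164, 145, 140, 131, 125, 126, 147, 156, 132,
    145, 130, 129, 127, 126, 132, 135, 136, 146, 142,
    126, 132, 133, 139, 145, 126, 131, 129, 130, 132,
    146, 152, 155, 137, 148, 126, 129, 136, 146, 132,
]
battery2 = [
    134, 130, 140, 151, 143, 134, 136, 144, 150, 135,
    160, 141, 143, 140, 138, 141, 148, 146, 140, 146,
    151, 138, 151, 139, 151, 128, 146, 147, 152, 142,
    144, 134, 142, 146, 142, 136, 122, 134, 145, 147,
]


def back_to_back(left, right):
    """Return rows of (left freq, left leaves, stem, right leaves, right freq)."""
    combined = left + right
    stems = range(min(combined) // 10, max(combined) // 10 + 1)
    rows = []
    for stem in stems:
        # Left leaves are written in descending order so that the smallest
        # leaf sits next to the stem; right leaves ascend away from it.
        left_leaves = sorted((v % 10 for v in left if v // 10 == stem), reverse=True)
        right_leaves = sorted(v % 10 for v in right if v // 10 == stem)
        rows.append((
            len(left_leaves), "".join(map(str, left_leaves)),
            stem,
            "".join(map(str, right_leaves)), len(right_leaves),
        ))
    return rows


rows = back_to_back(battery2, battery1)
width = max(len(row[1]) for row in rows)
for left_freq, left_leaves, stem, right_leaves, right_freq in rows:
    print(f"{left_freq:>2}  {left_leaves:>{width}} | {stem} | {right_leaves}  {right_freq}")
```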

The plot above shows that the largest group of Battery 2 values is in the 140s (20 of 40 points),
whereas for Battery 1 it is in the 130s (16 of 40 points). Also, the spread (variability) of Battery 2
is less than that of Battery 1. Based on these results, we may conclude that Battery 2 is the better
brand (higher average and lower variability). The validity of this conclusion, however, depends on
how the data were collected and on whether the sample size n is sufficient. These issues are
typically discussed as part of Inferential Statistics and Design of Experiments.

A better graphical comparison tool is the box (box-and-whisker) plot. A plot for both data sets is
shown below.

Box Plot

The plot above supports our previous conclusion as the interquartile range of Battery 2 is shorter
than that of Battery 1 (less variability), and is shifted to the right (higher center).
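
The quantities a box plot displays (minimum, quartiles, median, maximum) can be computed without drawing it. The sketch below uses `statistics.quantiles` with its default "exclusive" method; quartile conventions differ between textbooks and software, so the values need not match those behind the original figure exactly. The helper name `five_number_summary` is hypothetical.

```python
import statistics

# Battery lives in minutes.
battery1 = [
    130, 164, 145, 140, 131, 125, 126, 147, 156, 132,
    145, 130, 129, 127, 126, 132, 135, 136, 146, 142,
    126, 132, 133, 139, 145, 126, 131, 129, 130, 132,
    146, 152, 155, 137, 148, 126, 129, 136, 146, 132,
]
battery2 = [
    134, 130, 140, 151, 143, 134, 136, 144, 150, 135,
    160, 141, 143, 140, 138, 141, 148, 146, 140, 146,
    151, 138, 151, 139, 151, 128, 146, 147, 152, 142,
    144, 134, 142, 146, 142, 136, 122, 134, 145, 147,
]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the default quartile method."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, q2, q3, max(values)


for name, data in (("Battery 1", battery1), ("Battery 2", battery2)):
    low, q1, med, q3, high = five_number_summary(data)
    print(f"{name}: min={low} Q1={q1} median={med} Q3={q3} max={high} IQR={q3 - q1}")
```

Under this convention the interquartile range of Battery 2 comes out smaller than that of Battery 1, and its median is larger, which agrees with the reading of the box plot.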
