
Statistical literacy guide 1

How to adjust for inflation


Last updated: February 2009
Author: Gavin Thompson

This note uses worked examples to show how to express sums of money taking into account
the effect of inflation. It also provides background on how three popular measures of inflation
(the GDP deflator, the Consumer Price Index, and the Retail Price Index) are calculated.

1.1 Inflation and the relationship between real and nominal amounts
Inflation is a measure of the general change in the price of goods. If the level of inflation is
positive then prices are rising, and if it is negative then prices are falling. Inflation is therefore
a step removed from price levels and it is crucial to distinguish between changes in the level
of inflation and changes in prices. If inflation falls, but remains positive, then this means that
prices are still rising, just at a slower rate. 2 Prices are falling if and only if inflation is
negative. Falling prices/negative inflation is known as deflation. Inflation thus measures the
rate of change in prices, but tells us nothing about absolute price levels.

To illustrate, the charts opposite show the same underlying data about prices, but the first
gives the level of inflation (percentage changes in prices) while the second gives an absolute
price index. This shows at point:

A – prices are rising and inflation is positive
B – the level of inflation has increased, prices are rising faster
C – inflation has fallen, but is still positive, so prices are rising at a slower rate
D – prices are falling and inflation is negative
E – inflation has increased, but is still negative (there is still deflation), prices are falling at a
slower rate (the level of deflation has fallen)

[Charts not reproduced: the first plots the inflation rate either side of zero; the second plots
the corresponding price index from a base of 100, with points A to E marked on both.]

The corollary of rising prices is a fall in the value of money, and expressing currency in real
terms simply takes account of this fact. £100 in 1959 is nominally the same as £100 in 2009,
but in real terms the £100 in 1959 is worth more because of the inflation over this period. Of
course, if inflation is zero, then nominal and real amounts are the same.

1.2 Using price indices to calculate inflation rates and express figures in real terms
We can use price indices to adjust for inflation and present financial data in real terms
(constant prices) or calculate real rather than nominal changes. The guide on index numbers
gives an introduction to indices and briefly explains what they mean.

1 All statistical literacy guides are available on the Library Intranet pages:
2 Conversely, a low, negative (less than zero) inflation rate implies prices are falling more slowly, and a highly
negative rate implies they are falling more quickly. Negative inflation is often referred to as deflation.

Consecutive years

The example below uses the HM Treasury GDP deflator index. The table below shows this
index and the corresponding year-on-year inflation rate.

HMT GDP Deflator figures

Year       Index      Inflation rate
1999-00     81.976
2000-01     83.051    1.31
2001-02     84.903    2.23
2002-03     87.640    3.22
2003-04     90.138    2.85
2004-05     92.589    2.72
2005-06     94.485    2.05
2006-07     97.030    2.69
2007-08    100.000    3.06

The inflation rate is the percentage change in the index from one year to the next. For
example, inflation between 1999-00 and 2000-01 was

[(83.051 / 81.976) − 1] = 1.31% 3

Since prices in 2000-01 were 1.31% higher than in 1999-00, to convert £100 from 1999-00 to
2000-01 prices (make the inflation adjustment) we multiply by 1 + [inflation rate], which in this
case is:

£100 * (1 + 0.0131) = £101.31

Essentially, this calculation tells us that in 2000-01, we would need £101.31 to buy the same
real value of goods that £100 would have bought in 1999-00. We do not always need to
explicitly calculate the inflation rate. A shortened version of this calculation divides the £100
by the 1999-00 index and multiplies by the 2000-01 index:

£100 * (83.051 / 81.976) = £101.31
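The consecutive-year arithmetic above can be sketched in a few lines of Python (an illustration only; the guide's own workings are done by hand or in a spreadsheet):

```python
# Year-on-year inflation from the HMT GDP deflator index.
index_1999_00 = 81.976
index_2000_01 = 83.051

# Inflation rate: the percentage change in the index from one year to the next.
inflation = index_2000_01 / index_1999_00 - 1
print(f"Inflation 1999-00 to 2000-01: {inflation:.2%}")   # 1.31%

# Express £100 of 1999-00 money in 2000-01 prices.
adjusted = 100 * (1 + inflation)
print(f"£100 in 2000-01 prices: £{adjusted:.2f}")         # £101.31
```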

Clearly, the inflation rate depends on which goods one buys; some may have fallen and
others risen in price during the year. This is why the distinction between different types of
index is important, since they measure the price changes of different sets of goods and/or
services. This is dealt with in Section 2.

Non-consecutive years
In 2001-02, inflation was 2.23%, so to express our £100 from 1999-00 in 2001-02 prices, we
take the £101.31 from 2000-01 and perform the same calculation; namely

£101.31 * (1 + 0.0223) = £103.57

What about expressing £100 in 2007-08 prices? Clearly, applying our current method will
involve six more tiresome calculations. This is where the index is useful. If inflation between
1999-00 and 2000-01 was [(83.051 / 81.976) − 1], then total inflation between 1999-00 and
2007-08 is

[(100 / 81.976) − 1] = 22.0%

and more generally, inflation between years y and x is

[(index number for year x / index number for year y) − 1], bearing in mind that x is further
ahead in time than y.

3 N.B. 1.31% = 0.0131, 50% = 0.5, 2000% = 20, etc.

With a total 22.0% inflation between 1999-00 and 2007-08, we can express our £100 in
2007-08 prices by multiplying by [1 + the inflation rate] as usual:

£100 * 1.22 = £122

Again we can shorten this calculation if we do not need to explicitly calculate the inflation
rate. In general to express a value from year y in year x prices:

Year y value * (index number for year x / index number for year y) = year y value in year x prices

Note that an index number on its own does not signify anything; it is only meaningful in
relation to another index number.
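The general rule can be wrapped in a small helper function (the function name is invented here for illustration):

```python
def to_year_x_prices(value, index_y, index_x):
    """Express a year-y value in year-x prices using a price index."""
    return value * index_x / index_y

# £100 from 1999-00 (index 81.976) in 2007-08 prices (index 100.000):
print(round(to_year_x_prices(100, 81.976, 100.000), 2))  # 121.99, i.e. the £122 in the text
# The same helper reproduces the consecutive-year example:
print(round(to_year_x_prices(100, 81.976, 83.051), 2))   # 101.31
```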

1.3 Constructing a real time series and calculating real changes


Often we want to express a series of actual (nominal) values in real terms. This involves
revaluing every annual figure into a chosen year’s prices (a base year), effectively repeating
the stages above on a series of nominal values and using the same base year (year x above)
for all. Whilst this does mean making as many calculations as there are years to be revalued,
this is a very simple task in a spreadsheet. Once done, changes can be calculated in
percentage or absolute terms. The choice of base year does not affect percentage change
calculations, but it will affect the absolute change figure, so it is important to specify the base
year. 4
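A minimal sketch of this process follows. The nominal spending figures below are hypothetical, made up purely for illustration; only the deflator values come from the table above:

```python
# Revalue a nominal series into constant (2007-08) prices.
deflator = {"1999-00": 81.976, "2000-01": 83.051, "2007-08": 100.000}
nominal = {"1999-00": 100.0, "2000-01": 104.0, "2007-08": 125.0}  # hypothetical figures

base = "2007-08"
# Each year's value is multiplied by (base-year index / own-year index).
real = {yr: nominal[yr] * deflator[base] / deflator[yr] for yr in nominal}
for yr, v in real.items():
    print(f"{yr}: £{v:.2f} in {base} prices")
```

Once the series is in a common year's prices, real percentage or absolute changes can be read straight off it.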

1.4 Internal purchasing power


Expressing changes in a currency’s purchasing power (the quantity of resources a set
amount can buy) is essentially the opposite process to expressing an amount in real terms.
To express a figure in real terms, we multiply by [1 + the inflation rate]; to express the
purchasing power of an amount, we divide by [1 + the inflation rate].

Using the previous example, if the purchasing power of £1.00 is 100 pence in 1999-00, in
2007-08 its purchasing power is 1 / 1.22 = £0.82 (i.e. purchasing power fell 18%).
Once again, the price index used has important implications for the interpretation of the
result. Like changes in prices, changes in individuals’ purchasing power are idiosyncratic,
and figures derived from the GDP deflator merely reflect economy-wide average changes.
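The division can be checked directly:

```python
# Purchasing power: divide by (1 + total inflation) rather than multiplying.
total_inflation = 100.000 / 81.976 - 1   # 22.0% between 1999-00 and 2007-08
power = 1 / (1 + total_inflation)
print(f"£1.00 of 1999-00 money buys £{power:.2f} worth at 2007-08 prices")  # £0.82
```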

2 Different price indices


2.1 Overview
A price index is a series of numbers used to show general movement in the price of a single
item, or a set of goods 5, over time. Thus, insofar as every good has a price that changes
over time, there are as many inflation ‘rates’ as there are different groupings of goods and
services. When the media talk about ‘personal inflation rates’, they are referring to changes
in the prices of things that a particular individual buys.

4 For instance we can say ‘a 5% real increase’ without a base year, but ‘a £5 million real increase’ needs a base
year specified, since if prices are rising, a later base year will give us a higher figure and vice versa. It is good
practice to use the latest year’s prices as the base year.
5 Hereon, the expression ‘goods’ encompasses both goods and services.

In general, any price index must consist of a set of prices for goods, and a set of
corresponding weights assigned to each good in the index. 6 For consumer indices, these
weights should reflect goods’ importance in the household budget; a doubling in the price of
chewing gum should not, for instance, affect the index as much as a doubling in energy bills.

2.2 The GDP deflator


The GDP deflator measures the change in price of all domestically produced goods and
services. It is derived by dividing an index of GDP measured in current prices by a constant
prices (chain volume) index of GDP. The GDP deflator is different from other inflation
measures in that it does not use a subset of goods; by the definition of GDP, all domestically
produced goods are included. In addition, there is no explicit mechanism to assign
importance (weights) to the goods in the index; the weights of the deflator are implicitly
dictated by the relative value of each good to economic production.

These unique features of the GDP deflator make its interpretation slightly less intuitive.
Whilst consumer/producer inflation indices reflect average change in the cost of goods
typically bought by consumers/producers, the GDP deflator is not representative of any
particular individual’s spending patterns. From the previous example, a possible
interpretation could be as follows: suppose in 1999-00 we spent the £100 on a tiny and
representative fraction of every good produced in the economy; then the GDP deflator tells
us that we would need 1.31% more money (£101.31) to buy that same bundle in 2000-01.

The GDP deflator is normally used to adjust for inflation in measures of national income and
public expenditure, where the focus is wider than consumer items alone. The Treasury
indices give financial years only, but the deflator can also be calculated for calendar years or
quarters.
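The derivation can be illustrated with made-up figures (the £ billion values below are hypothetical, not actual UK data):

```python
# The deflator is the ratio of current-price GDP to constant-price (chain
# volume) GDP, expressed as an index.
nominal_gdp = 1450.0   # £bn, current prices (hypothetical)
real_gdp    = 1330.0   # £bn, constant prices (hypothetical)

deflator = 100 * nominal_gdp / real_gdp
print(f"Implied GDP deflator: {deflator:.1f}")   # 109.0
```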

2.3 The Consumer Price Index (CPI) and the Retail Price Index (RPI) 7
The ONS publishes two measures of consumer price inflation: the CPI and the RPI 8 . Each is
a composite measure of the price change of around 650 goods and services on which people
typically spend their money. The most intuitive way of thinking about the CPI/RPI is to
imagine a shopping basket containing these goods and services. As the prices of the items in
the basket change over time, so does the total cost of the basket; CPI and RPI measure the
changing cost of this basket.

A ‘perfect’ consumer price index would be calculated with reference to all consumer goods
and services, and the prices measured in every outlet that supplies them. Clearly, this isn’t
practicable. The CPI/RPI use a representative sample of goods and the price data collected
for each good is a sample of prices: ‘currently, around 120,000 separate price quotations are
used… collected in around 150 areas throughout the UK’ (ONS, 2008).

RPI and CPI data can both be downloaded from the ONS Consumer Price Indices release.
The RPI is commonly used to adjust for inflation faced by consumers. It is also used as a
basis to uprate many state benefits and as the normal interest rate for student loans.

6 Clearly, when charting just one price, there is no need for weighting, since the good assumes complete
importance in the index.
7 Differences between the CPI and the RPI are not considered in this note; broadly they arise from differences
in formula and coverage. Some further background is given in the standard note on the Retail Prices Index.
8 Another commonly-used index, the RPIX, simply excludes mortgage interest payments from the RPI.

2.4 Selection of items and application of weights in the CPI/RPI
Some items in the CPI/RPI are sufficiently important within household budgets that they merit
their place in the basket per se: examples include petrol, electricity supply and telephone
charges. However, most goods are selected on the basis that changes in their price reflect
price changes for a wider range of goods. For instance, the CPI contains nine items which
together act as ‘bellwethers’ for price change in the ‘tools and equipment for the house and
garden’ class.

The 2008 CPI classes and the weights assigned to them are shown in the table below,
alongside the proportion of representative items assigned to each class. The number of
items assigned to a class depends on its weight in the index, and the variability of prices
within the class; for instance, tobacco has only five items whilst food has over a hundred.

CPI allocations, 2008

Class                                                  Weight in    Percentage of total
                                                       CPI index    representative items
01 Food and Non-Alcoholic Beverages                       10.9%     22%
02 Alcoholic Beverages and Tobacco                         4.2%      4%
03 Clothing and Footwear                                   3.6%     11%
04 Housing, Water, Electricity, Gas and other fuels       11.5%      5%
05 Household Furnishings, Equipment and maintenance        6.7%     11%
06 Health                                                  2.2%      3%
07 Transport                                              15.2%      6%
08 Communications                                          2.3%      1%
09 Recreation and Culture                                 15.2%     17%
10 Education                                               1.9%      1%
11 Restaurants and Hotels                                 13.7%      8%
12 Miscellaneous Goods and Services                        9.9%     11%

Each year, the weights and contents of the basket are reviewed, and alterations are made to
reflect changing patterns of consumer spending. Spending changes may arise from
substitution in the face of short-term price fluctuation or changing tastes (e.g. butter to
margarine), or from the development of new goods. For example, in 2008 35mm camera
films were replaced by portable digital storage media. In total, eight items were dropped from
the CPI/RPI in 2008, including CD singles, lager ‘stubbies’ and TV repair services.

3 Further calculations using monthly CPI/RPI data

The table below shows monthly RPI data for 2007 and 2008, and is used as a reference
point for the following examples.

Monthly RPI index, Jan 2007 to Dec 2008

Year, month    Index      Year, month    Index
2007 01        201.6      2008 01        209.8
2007 02        203.1      2008 02        211.4
2007 03        204.4      2008 03        212.1
2007 04        205.4      2008 04        214.0
2007 05        206.2      2008 05        215.1
2007 06        207.3      2008 06        216.8
2007 07        206.1      2008 07        216.5
2007 08        207.3      2008 08        217.2
2007 09        208.0      2008 09        218.4
2007 10        208.9      2008 10        217.7
2007 11        209.7      2008 11        216.0
2007 12        210.9      2008 12        212.9

3.1 Calculating inflation between specific months

This is done in precisely the same way as for annual data. Inflation between Dec 2007 and
Dec 2008 is simply

[(212.9 / 210.9) − 1] = 0.95%

3.2 Calculating average annual inflation

This might be done in order to convert the RPI series into an annualised format (e.g. for
direct comparison with the GDP deflator). The principle is once again the same, but rather
than take specific index numbers twelve months apart, we use the arithmetic mean (average)
of the twelve index numbers in each year.

Let I07 equal the average of the twelve monthly index numbers in 2007, and I08 the average
of the monthly index numbers in 2008. Then average annual inflation between 2007 and
2008 = [(I08 / I07) − 1] = 3.99%
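Both monthly calculations can be reproduced from the table's figures:

```python
# Monthly RPI figures for 2007 and 2008, as published in the table above.
rpi_2007 = [201.6, 203.1, 204.4, 205.4, 206.2, 207.3,
            206.1, 207.3, 208.0, 208.9, 209.7, 210.9]
rpi_2008 = [209.8, 211.4, 212.1, 214.0, 215.1, 216.8,
            216.5, 217.2, 218.4, 217.7, 216.0, 212.9]

# 3.1: inflation between two specific months (Dec 2007 to Dec 2008).
dec_to_dec = rpi_2008[-1] / rpi_2007[-1] - 1
print(f"Dec-Dec inflation: {dec_to_dec:.2%}")             # 0.95%

# 3.2: average annual inflation, using each year's mean index.
i07 = sum(rpi_2007) / 12
i08 = sum(rpi_2008) / 12
print(f"Average annual inflation: {i08 / i07 - 1:.2%}")   # 3.99%
```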
3.3 Calculating average annual inflation over periods other than a year

This can be done using the following formula:

[(Ix / Iy) ^ (12 / n)] − 1

where n is the number of months in the period in question, I is the index number, and x is
further ahead in time than y.

So, the average annualised rate of RPI inflation for the second half of 2008 is:

[(212.9 / 216.5) ^ (12 / 6)] − 1 = −3.30%
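Recomputing the worked example with this formula (n = 6 for a six-month period, so the exponent is 2):

```python
# Annualising inflation over a period of n months: (Ix / Iy) ** (12 / n) - 1.
i_y, i_x, n = 216.5, 212.9, 6   # July 2008 and Dec 2008 RPI index numbers

annualised = (i_x / i_y) ** (12 / n) - 1
print(f"Annualised RPI inflation, H2 2008: {annualised:.2%}")   # -3.30%
```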

In calculating average annual inflation over short periods, the ONS offers the following
caveat: 9

It should be noted that this may produce misleading results for just one or two months’
change in the index. One reason is that the month-to-month change includes a
seasonal component. Another is that some prices change only infrequently, perhaps
only once a year. Hence a comparison between different years’ annual average indices,
or at least between the same month in different years, is to be preferred.

Other statistical literacy guides in this series:


- What is a billion? and other units
- How to understand and calculate percentages
- Index numbers
- Rounding and significant places
- Measures of average and spread
- How to read charts
- How to spot spin and inappropriate use of statistics
- A basic outline of samples and sampling
- Confidence intervals and statistical significance
- A basic outline of regression analysis
- Uncertainty and risk
- How to adjust for inflation

9 ONS Consumer Price Indices Technical Manual – 2007 edition

Statistical literacy guide
Measures of average and spread
Last updated: February 2007
Author: Richard Cracknell

Am I typical?
A common way of summarising figures is to present an average. Suppose, for
example, we wanted to look at incomes in the UK: the most obvious summary
measure to use would be average income. Another useful indicator is one which
shows the spread or variation in individual incomes. Two countries might have
similar average incomes, but the distribution around their average might be very
different and it could be useful to have a measure which quantifies this difference.

There are three often-used measures of average:

• Mean – what in everyday language we would think of as the average of a set
of figures.
• Median – the ‘middle’ value of a dataset.
• Mode – the most common value.

Mean
This is calculated by adding up all the figures and dividing by the number of pieces of
data. So if the hourly rate of pay for 5 employees was as follows:

£5.50, £6.00, £6.45, £7.00, £8.65

The average hourly rate of pay per employee is

(5.50 + 6.00 + 6.45 + 7.00 + 8.65) / 5 = 33.60 / 5 = £6.72

It is important to note that this measure can be affected by unusually high or low
values in the dataset, and the mean may result in a figure that is not necessarily
typical. For example, in the above data, if the individual earning £8.65 per hour had
instead earned £30, the mean earnings would have been £10.99 per hour – which
would not have been typical of the group. The usefulness of the mean is
often as a base for further calculation – estimating the cost or effect of a change, for
example. If we wanted to calculate how much it would cost to give all employees a
10% hourly pay increase, then this could be calculated from mean earnings
(multiplied back up by the number of employees).

Median
If we are concerned with describing a set of data by giving an average or typical
value, then it is sometimes preferable to use the median rather than the mean. The
median is the value such that exactly half the data items exceed it and half are below
it.

The conventional way of calculating the median is to arrange the figures in order and
take the middle value. If there is no middle value because there is an even number
of figures, then, conventionally, the median is taken to be mid-way between the two
middle points. In the earnings example the middle value is £6.45 and this is the
median for that data:

£5.50, £6.00, £6.45, £7.00, £8.65

This is one of a series of statistical literacy guides put together by the Social & General Statistics section
of the Library. The rest of the series are available via the Library intranet pages.

The median is less affected by values at the extremes than the mean. It can
therefore be a better guide to typical values.

Mode
The mode is the value that occurs most frequently. Statistical textbooks often treat it
as not particularly useful! But in real life we often use the mode without realising we
are using a measure of average. The ‘top 10’, ‘most popular’ or ‘2nd favourite’ are
simply looking at the most common, or 2nd most common, values, ie. modal
measures.
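All three measures can be computed with Python's standard statistics module, using the hourly pay figures above (the mode example uses a separate made-up list, since no value repeats in the pay data):

```python
import statistics

pay = [5.50, 6.00, 6.45, 7.00, 8.65]
print(round(statistics.mean(pay), 2))    # 6.72
print(statistics.median(pay))            # 6.45

# Mode needs at least one repeated value; a hypothetical dataset:
print(statistics.mode([3, 5, 5, 7, 9]))  # 5
```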

Grouped data
Sometimes we do not have exact values; instead the data have already been
grouped into bands – 1 to 10, 11 to 20, 21 to 30, etc. While it is not possible to
calculate the mean exactly from grouped data, an estimate can be made by
assigning the mid-point of each band to the observations in that group. This rests on
the assumption that the actual values are spread evenly within each band.
Sometimes these classes include open-ended groups – over 50, less than 5, etc. In
these cases you have to make some intelligent guess at an appropriate value.
Where you have done this, you can assess how sensitive your estimate is to the
assumed value for open classes by re-calculating the average using an alternative
assumption (using a spreadsheet to do the calculations also makes it easy to
investigate this).

It is also possible to estimate the median for grouped data, by looking for the class
above and below which 50% fall. Sometimes it is necessary to estimate where the
50% boundary is within a class.
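A sketch of the mid-point method, using hypothetical bands and frequencies:

```python
# Estimate the mean from grouped data by assigning each band's mid-point
# to the observations in that band. Bands and counts are made up.
bands = {(1, 10): 4, (11, 20): 7, (21, 30): 2}   # (low, high): frequency

total = sum(((lo + hi) / 2) * freq for (lo, hi), freq in bands.items())
count = sum(bands.values())
print(f"Estimated mean: {total / count:.1f}")    # 14.0
```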

Other averages
There are a number of other measures of average, some of which are briefly
described below:

• Geometric mean - the nth root of the product of n data values
• Harmonic mean - the reciprocal of the arithmetic mean of the reciprocals of
the data values
• Quadratic mean or root mean square (RMS) - the square root of the
arithmetic mean of the squares of the data values
• Generalized mean - generalizing the above, the nth root of the arithmetic
mean of the nth powers of the data values
• Weighted mean - an arithmetic mean that incorporates weighting to certain
data elements
• Truncated mean - the arithmetic mean of data values after a certain number
or proportion of the highest and lowest data values have been discarded
• Interquartile mean - a special case of the truncated mean
• Midrange - the arithmetic mean of the highest and lowest values of the data
or distribution.
• Winsorized mean - similar to the truncated mean, but, rather than deleting
the extreme values, they are set equal to the largest and smallest values that
remain
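A few of these can be illustrated with the standard statistics module (Python 3.8 or later for geometric_mean), on a small made-up dataset:

```python
import statistics

data = [2, 8]
print(round(statistics.geometric_mean(data), 2))  # 4.0  (square root of 2*8)
print(round(statistics.harmonic_mean(data), 2))   # 3.2  (2 / (1/2 + 1/8))

# Midrange: arithmetic mean of the highest and lowest values.
print((min(data) + max(data)) / 2)                # 5.0
```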

Weighted average/mean
An average calculated as the arithmetic mean assumes equal importance of the
items for which the average is being calculated. Sometimes this is not appropriate
and you have to allow for differences in size or importance. A simple example would
be if you were looking at incomes of pensioners. If the average income of female
pensioners were £150 per week and the average for male pensioners £200 – it would
be wrong to say that the average for all pensioners was £175 [(150+200)/2]. There
are around twice as many women in this age group as men and this needs to be
taken into account in calculating the overall average. If we give twice as much
weight to the value for women as for men, the overall average comes to £167. The
calculation of this is set out below:

          £pw    Weight    Weight x value
Women     150    2         300
Men       200    1         200
Total            3         500

(Total, weight x value) / (Total weights) = 500 / 3 = £167
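The same weighted calculation in code:

```python
# The pensioner example as a weighted mean: weights reflect the rough
# 2:1 ratio of women to men in this age group.
values  = [150, 200]   # £pw: women, men
weights = [2, 1]

weighted = sum(v * w for v, w in zip(values, weights)) / sum(weights)
print(f"Weighted average: £{weighted:.0f} per week")   # £167
```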

Measures of variation / spread

Range and quantiles


The simplest measure of spread is the range. This is the difference between the
largest and smallest values.

If data are arranged in order we can give more information about the spread by
finding values that lie at various intermediate points. These points are known
generically as quantiles. The values that divide the observations into four equal
sized groups, for example, are called the quartiles. Similarly, it is possible to look at
values for 10 equal-sized groups, deciles, or 5 groups, quintiles, or 100 groups,
percentiles, for example. (In practice it is unlikely that you would want all 100, but
sometimes the boundary for the top or bottom 5%, or some other value, is of
particular interest.)

One commonly used measure is the inter-quartile range. This is the difference
between the boundary of the top and bottom quartile. As such it is the range that
encompasses 50% of the values in a dataset.
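Quartiles and the inter-quartile range can be computed with the statistics module (Python 3.8+), on a small illustrative dataset:

```python
import statistics

data = [2, 7, 14, 22, 30]
# 'inclusive' treats the data as a whole population rather than a sample.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
print(q1, q2, q3)        # 7.0 14.0 22.0
print("IQR:", q3 - q1)   # 15.0
```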

Mean deviation
For each value in a dataset it is possible to calculate the difference between it and
the average (usually the mean). These differences will be positive and negative, and
they can themselves be averaged (again usually using the arithmetic mean). For
some sets of data – for example, forecasting errors – we might want our errors over
time to cancel each other out, and a mean deviation of around zero indicates that this
is the case.

Variance and standard deviation
The variance or standard deviation (the standard deviation is the square root of the
variance) is the most commonly used measure of spread or volatility.

The standard deviation is the root mean square (RMS) deviation of the values from
their arithmetic mean, ie. the square root of the mean of the squared differences
between each value and the mean. This is the most common measure of how widely
spread the values in a data set are. If the data points are all close to the mean, then
the standard deviation is close to zero. If many data points are far from the mean,
then the standard deviation is far from zero. If all the data values are equal, then the
standard deviation is zero.

There are various formulas and ways of calculating the standard deviation – these
can be found in most statistics textbooks or online 1 . Basically the standard deviation
is a measure of the distance of each of the observations from the mean,
irrespective of whether the difference is positive or negative (hence the squaring
and taking the square root).

The standard deviation measures the spread of the data about the mean value. It is
useful in comparing sets of data which may have the same mean but a different
range. For example, the mean of the following two sets is the same: 15, 15, 15, 14, 16
and 2, 7, 14, 22, 30. However, the second is clearly more spread out and would have
a higher standard deviation. If a set has a low standard deviation, the values are not
spread out too much. Where two sets of data have different means, it is possible to
compare their spread by looking at the standard deviation as a percentage of the
mean.

Where the data is “normally distributed”, the standard deviation takes on added
importance and this underpins a lot of statistical work where samples of a population
are used to estimate values for the population as a whole (for further details see
Statistical significance/confidence intervals in this series).
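The two example sets can be compared directly (pstdev treats the data as the whole population):

```python
import statistics

a = [15, 15, 15, 14, 16]
b = [2, 7, 14, 22, 30]

# Both sets have the same mean...
print(statistics.mean(a), statistics.mean(b))   # 15 15
# ...but very different spread.
print(round(statistics.pstdev(a), 2))           # 0.63
print(round(statistics.pstdev(b), 2))           # 10.08
```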

Excel functions to calculate averages and spread


While it is possible to calculate these from first principles, there are a number of
statistical functions in Excel which are useful shortcut ways of calculating averages
and spread. Excel includes a “wizard” which can be used to insert these functions
into a cell of a spreadsheet. Useful functions include:

AVERAGE Returns the average of its arguments – example =Average(A1..A4)

COUNT Counts how many numbers are in the list of arguments

LARGE Returns the k-th largest value in a data set, where you determine k.

MAX Returns the maximum value in a list of arguments

MEDIAN Returns the median of the given numbers

MIN Returns the minimum value in a list of arguments

MODE Returns the most common value in a data set

1 For example http://www.beyondtechnology.com/tips016.shtml

PERCENTILE Returns the k-th percentile of values in a range

PERCENTRANK Returns the percentage rank of a value in a data set

QUARTILE Returns the quartile of a data set

RANK Returns the rank of a number in a list of numbers

SMALL Returns the k-th smallest value in a data set

STDEV Estimates standard deviation based on a sample

STDEVP Calculates standard deviation based on the entire population


Chart format guide
This guide sets out some principles and conventions for statistical charts. It is a version of that used in the House of Commons Library. The
first part sets out some general principles to follow for any chart and then covers the default formatting used by the Social & General Statistics
Section of the Library. Charts are by their nature more subjective than tables. A chart works on visual and analytical levels and is open to
greater interpretation. There are far more options to choose from covering types of chart, colour, size, dimensions, labelling, scales etc.
Therefore the Library has a limited number of definitive hard and fast ‘rules’ covering charts. The appendix to the guide goes beyond the
standard formatting of charts used in the Library. It is intended to be thought-provoking rather than prescriptive and makes some suggestions
on further aspects of chart layout, such as when to use certain chart types, highlights the weaknesses of other chart types and looks at
combining multiple charts.

This guide draws on the work of a number of different authors, particularly:

- Edward Tufte, The Visual Display of Quantitative Information (1987); Envisioning Information (1990); Visual Explanations:
Images and Quantities, Evidence and Narrative (1997); and Beautiful Evidence (2006)
- Jacques Bertin, Semiology of Graphics: Diagrams, Networks, Maps (1983)
- William S. Cleveland, The Elements of Graphing Data (1985)

Much of this guide focuses on the different elements and capabilities of Excel (2003) charts, but it does not include step-by-step instructions or any
guidance on advanced charts (those which extend Excel’s basic capabilities).
General principles
The general principles that should be applied to the choice and construction of any chart are:

Accuracy – the patterns in the chart accurately reflect the underlying data

Economy – the chart includes only those elements which display the data and those necessary to understand it

Clarity – the patterns/values the chart depicts are as easy as possible for the reader to interpret

[Before and after example charts, not reproduced: ‘Proportion of pupils in large classes (>30), England 1978-2008’,
showing primary and secondary schools.]
One might also add the aims of self-sufficiency (the chart needs as little text explanation as possible) and high data
density (the data elements of the chart take up as much of the space as possible and include the maximum amount of
information per unit of area), but both are related to economy and clarity.

A default Excel chart violates many of these principles. Some basic changes will make very clear improvements.
Before and after examples are illustrated opposite. These steps will improve a chart, but not always produce an ideal
chart, which can require extending the general principles with some additional thought and effort. The steps listed in
this guide are necessary but not always sufficient to produce the best chart possible. Some other suggestions are
given in the appendix to this guide.

[Before and after example charts, not reproduced: ‘Subsidies and other payments made to farmers in the UK’,
£ billion 2007 prices, 1973-2006, showing decoupled and other payments and payments linked to production.
Source: Agriculture in the UK 2006, Defra.]
First steps – what to leave out
The principle of economy means that each mark on the chart should have some meaning, none of the elements are superfluous and 'data ink' (that which shows the pattern in the underlying data) is maximised compared to the other chart elements. This leads to the following rules for the exclusion of specific elements of a default Excel chart (also illustrated opposite):

[Charts opposite: step-by-step impact of removing Excel defaults]
- Chart area – no border around the chart
- Plot area – no border and no pattern (colouring in)
- Legend – no border
- Gridlines – none
- Colour elements of data series – no borders and no markers

These remove the worst of Excel's default formatting and concentrate the reader's mind (and eye) on the data. The next section looks in more detail at different elements.
Next steps – what to include

- Font – Arial throughout.
- Title – as with tables this needs to be descriptive, including the what, where and when. It may also include units. Font needs to be larger than all other text elements and in bold.
- Source – needed if not given in an associated table. Font smaller than all other text elements and in italics. Placed at the bottom of the chart.
- Y-axis units – to minimise the space taken up by labels, include as few digits as possible and add the magnitude to the title (ie. 1, 2, 3, etc. instead of 1,000,000, 2,000,000, 3,000,000 etc. – see opposite). Rounding and decimal places must be consistent. Text size between title and source sizes, same as legend and x-axis labels.
- X-axis labels – in most cases these will be categories or dates. These must be horizontally aligned. This may mean that Excel does not include every date (preferable to abbreviating dates to 00/01, 01/02 etc.) or that it wraps long category titles. Consider switching from a column to a bar chart if the latter is a problem (see opposite). Use country abbreviations for column charts. Text size smaller than title, bigger than source, same as legend and y-axis labels.
- Axis titles – if you must include them they should also be horizontally aligned (this means including them at the top of the y-axis). It is frequently better to put the units in the title or subtitle. This keeps all the description in one place and again allows you to maximise the data plot area (see opposite).
- Axis 'tick marks' – ensure they are set to outside (the default). If you are using a line chart then checking the 'value (Y) axis crosses between categories' option on the scale tab will make better use of the plot area.
- Legend – direct labelling of lines is preferable. If you include a legend add it to an empty part of the plot area. The defaults of top/side/bottom all squeeze the area devoted to showing the data (see below opposite).
- Line-type – do not select 'smoothed'.
- Data labels – avoid wherever possible. They go against the principle of economy and your chart should not need them.

[Charts opposite: y-axis labels moved to the title and the plot area increased; the chart changed from column to bar to make categories easier to read; steps from a default axis label, to an improved version, then the label in the subtitle (preferred); the legend moved inside the plot area, which is then expanded]
Colour
The default colour used by the Library is 'House of Commons green' (the green option on the default range of colours). Where more than one colour is needed use shades of this green. There are eight shades of this green used by the Library (shown below). Choose contrasting shades as far as possible.

Use colours, not patterns, for different categories in a line chart. Dotted lines can be used where they add meaning (ie. for projections or breaks in series). Do not use fill effects for varying chart colours as the pattern options look poor when printed or converted to a PDF document.

[Charts opposite: ordering the categories improves the display of the distribution and gives a ranked list on the x-axis]
Ordering categories
Most category data (types of things such as areas, groups of people, industries, modes of transport and crimes) can be reorganised or sorted by value in charts to help the reader better understand the data (opposite). The result is a better idea of the distribution of the data while the information on individual values is retained. The category axis then effectively ranks the different categories by value. The default ordering should be from highest to lowest.
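The re-ordering itself can be automated before the data ever reaches the chart. A minimal sketch in Python, with invented category names and values:

```python
# Sort category data from highest to lowest value before charting,
# so the category axis doubles as a ranked list.
data = {"A": 3.4, "B": 1.8, "C": 4.2, "D": 3.7, "E": 3.9, "F": 5.1}

ranked = sorted(data.items(), key=lambda item: item[1], reverse=True)

for category, value in ranked:
    print(category, value)
# F comes first (largest value), B last (smallest)
```

Sorting the source range in the spreadsheet itself achieves the same thing; the point is that the ordering is done to the data, not to the chart.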
Occasionally category charts are not ordered and the 'standard' order is given. There are a few instances where categories cannot be re-ordered (strictly sequential categories such as social class or age groups), and even categories that are normally given in a set order (ethnicity, industrial classification, religion, sources of emission etc.) can be sorted when included in a chart.

Ordering a very large number of categories, such as local authorities, means you lose the category identity. There is not space to include them in all but the largest of charts. You are left with a chart of what is effectively a very similar shaped curve for most distributions that are approximately normal. This gives the reader very little information apart from the max/min and the fact that most values are close to the median. A frequency distribution chart (opposite) better identifies outliers and at least gives the reader some information they can use, such as the number of values above, below or within a specific value/range.

[Charts opposite: the first chart imparts little information other than max, min and rough average; the frequency distribution alternative is more informative]
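Building the frequency distribution is a simple binning exercise. A sketch, assuming whole-number percentage values and 5-point bands (the figures are invented):

```python
# Count how many values fall into each 5-percentage-point band,
# giving the shape of the distribution and highlighting outliers.
from collections import Counter

values = [52, 48, 55, 61, 49, 53, 47, 58, 83, 51]  # per cent
width = 5

# Map each value to the lower edge of its band, then tally.
counts = Counter((v // width) * width for v in values)

for lower in sorted(counts):
    print(f"{lower}%-{lower + width}%: {counts[lower]}")
```

The tally per band is what gets charted; the single value of 83 immediately stands out as an outlier in a way it would not in a 300-bar ranked chart.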
Axis values

Missing dates
Where a time series has missing values for certain dates, or the gaps between data points vary, the chart must display this. If it just places the dates with data next to each other it violates the accuracy principle (see opposite).

[Charts opposite: SGS enquiries (thousands) – Excel's default 'concertina-ing' of dates versus the correct shape of the data]
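To get the spacing right, the series can be expanded to a complete run of years before charting, with gaps (None) where there is no observation. A sketch with invented figures:

```python
# Expand a sparse time series to one entry per year so that the
# horizontal distance on the chart reflects real elapsed time.
observed = {1986: 2.1, 1992: 4.5, 1996: 6.0, 2000: 8.2, 2003: 10.9}

years = range(min(observed), max(observed) + 1)
series = [(year, observed.get(year)) for year in years]  # None = no data

# 1986 and 1992 are now six positions apart on the axis, not adjacent.
```

In Excel the equivalent is to include every year in the source range, leaving blank cells for the missing observations, rather than listing only the years with data.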
Shortened y-axis
If the data you add to a chart does not vary a great deal Excel will be 'helpful' and cut the y-axis, ie. it will not start at zero. This means that underlying patterns in the data are magnified. In other words the accuracy principle is violated. A helpful way of thinking about this is to quantify the degree of magnification. That shown opposite (by Excel default) is 10. In other words the variation in the chart produced by Excel (on the left) is 10 times greater than the real variation (on the right). This degree of magnification has been dubbed the 'lie factor'. 1

This issue is looked at in more depth in the appendix, which gives some suggested alternatives to cutting the y-axis and sets out what should be done when there is no alternative.

[Charts opposite: the same data plotted first as Excel would do by default, then the uncut version]

1 See Edward Tufte, The Visual Display of Quantitative Information (1987), p. 57
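The lie factor itself is easy to compute: it is the relative change as drawn (measured from the cut baseline) divided by the relative change in the data. A sketch reproducing the ten-fold magnification described above (the figures are illustrative):

```python
# Tufte-style lie factor for a cut y-axis: the visual size of a change,
# measured from where the axis starts, over the true proportionate change.
def lie_factor(v1, v2, axis_start=0):
    shown = (v2 - axis_start) / (v1 - axis_start) - 1
    actual = (v2 - v1) / v1
    return shown / actual

# A rise from 100 to 110 on an axis cut at 90 is drawn as a doubling:
print(round(lie_factor(100, 110, axis_start=90), 2))  # 10.0
print(round(lie_factor(100, 110, axis_start=0), 2))   # 1.0 (full axis)
```

A lie factor of 1 means the chart is drawn truthfully; the deeper the cut relative to the data, the larger the factor.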
Appendix
Below is a collection of alternatives to 'standard' charts with an illustration of how they can build on and improve the earlier basic 'rules'. All these follow from the principles outlined at the start of the main guide. The sections cover shortening the y-axis, size, dimensions, the choice of chart type and expanded or composite charts. As mentioned earlier, the examples and suggestions given here are meant to be thought provoking. The aims are to get authors to think more thoroughly about how data should be displayed in a chart and to offer some general guidance. The recommendations are:

Shortening the y-axis
Cutting the y-axis violates the accuracy principle. Social Trends has a very large number of charts, but the latest edition has only one with a shortened y-axis. There are a number of alternative approaches. If none of these help and you can still justify shortening the axis then at the very least you should indicate the axis is shortened with a zig-zag.

Chart size
Many of the charts we produce are much bigger than they have to be. An appropriate use of smaller charts means one can better integrate charts with text. It also gives you greater flexibility about the placement of charts.

Dimensions
The default aspect should be landscape. There are some other times where you might want to change the aspect ratio to help the reader better understand the data.

Chart type
Simple charts are more effective. Pie charts are less clear than bar/column charts. Stacked charts are often less clear than a sequence of simple line/bar/column charts. The pie chart falls down on the principles of self-sufficiency and economy. It has other limitations and therefore should only be used in exceptional circumstances. Line charts should be the default option for time series charts.

Dual axes charts
Any dual axis chart is potentially misleading, even one with combined chart types (ie. lines and bars). These charts should therefore not be used.

Expanded or composite charts
Expanded or composite charts are multi-chart displays with the intention that the reader makes direct comparisons between the (related) data. Because each single chart has a single series there are no problems with the clarity of stacked series and no jumble of multiple crossing lines or confusing side-by-side columns. They should be considered as alternatives to stacked or side-by-side charts for both category and time series data.
Shortening the y-axis
The section in the main guide gives a general introduction to the issue. The main point is that cutting the y-axis violates the accuracy principle.

We cannot always assume that people use charts to look back at precise values on the y-axis, or even look at the values at all. Charts illustrate patterns and the reader's initial focus will be on the shape shown – the relative values. Take the chart opposite (lie factor 4); what are the actual proportionate changes? With a full axis the shape of the data elements gives an accurate illustration of the change over the period. To do this for the chart opposite you need to look back at the y-axis to see the change in numbers and do a relatively complex sum in your head to come up with the actual change over the period ((270-320)/320), which is not a sum that every reader can or will do. More fundamentally, charts are not an effective way to get reference information across.

[Chart opposite: the overall trend is clearly down, but by what per cent?]
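The sum the reader is being asked to do is a relative change. As a sketch:

```python
# The proportionate change over the period in the chart opposite:
# a fall from 320 to 270.
start, end = 320, 270

change = (end - start) / start
print(f"{change:.1%}")  # -15.6%
```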
The same principle applies to index charts, which simply change the scale on the y-axis. While they only give relative values on the y-axis, the reader still needs to i) read across to values on the y-axis and ii) understand what an index chart means.

It is extremely common to see trend charts with a shortened y-axis in the press. A complete y-axis can, like so many facts, get in the way of a good story. Some charts, such as trends in the stock market, are rarely if ever shown with a full axis down to zero. Official statistics use a shortened y-axis much less frequently. Social Trends has a very large number of charts, but the latest edition has only one with a shortened y-axis.

There are a number of alternative approaches. These include:

- Adding more historical data. This will frequently include a greater range of figures and put relatively small recent changes in context.
- Adding a small version of the chart as inset detail that cuts the y-axis (opposite).

[Chart opposite: here both an accurate picture of the underlying data and some detail on annual variations are given]
- Ditching the chart. If you have a flat series which shows nothing of any interest you should ask yourself whether you need to include a chart to show this at all.
- Leaving it alone. Keep the chart as it is if the small changes actually represent an important result in themselves. Try to separate the substantively/politically important results from those that might make a chart look more interesting or nice. Similarly, a chart that shows large variations may have no political relevance or be of little interest to readers.
- Charting the percentage change. This is potentially as misleading as a shortened axis. It may however help in a few cases where the percentage change in an indicator is more important than absolute values. The obvious example is inflation. Problems arise with longer-term trends as over time denominators change and a 5% increase now may be very different from a 5% change 10 or 20 years ago. Such a chart is less data dense as it strips out absolute values and gives a poor indication of changes in absolute levels over time. Journalists are not the only ones who confuse a reduction in a percentage increase with a reduction in the underlying absolute value.

If none of these help and you can still justify shortening the axis (ie. the change is important in actual/substantive/political terms) then at the very least you should indicate the axis is shortened with a zig-zag. This will help the reader and make our charts more truthful. If there is no time (this can be a fairly long process) then it should be mentioned in the accompanying text at the very least ("note the shortened axis which over-emphasises small changes"). The example opposite sets out an appropriate use and display of a shortened value axis.

[Chart opposite: Monthly mean atmospheric carbon dioxide at Mauna Loa Observatory, Hawaii (parts per million), 1958-2008. The axis is cut as the observed increase in concentrations is deemed important; the point where it is cut is the pre-existing baseline (estimated pre-industrial concentrations) and concentrations have never approached zero. The cut in the y-axis is indicated with a zig-zag line at the base. Source: US National Oceanographic & Atmospheric Administration – Global Monitoring Division, www.cmdl.noaa.gov/index.php]
Dimensions and scale
Excel produces a chart object in a standard size and dimensions. While these may be adjusted to fit some additional chart elements, there is scope to do much more.

Scale
Many of the charts we produce are much bigger than they have to be. The pattern in the data stays the same however small the chart, and the human eye is capable of accurately perceiving very small differences (such as in letters, fonts and maps). As an illustration, the charts opposite show the same data in an increasingly small area. Each subsequent chart takes up half the space of its predecessor. The first chart already has a relatively high data density. What exactly is lost with the reduction in scale? The major loss is in text size, which soon becomes unreadable. The key aspects of the trend (the macro or global reading) are still clear in the smallest chart. The ability to read across to exact dates/values (the micro or elementary level of reading) starts to fall off from a 75% reduction and greater. There is clearly some loss in trend detail, but this only really hits at a reduction in size of 94% (second to smallest).

This all raises the question: what exact message are you trying to get across? Charts are not good ways to impart reference information. Sometimes you do want to say something about the detail, but what resolution do you need to impart even this information? This is not an argument to reduce all charts to a microscopic level. But an appropriate use of smaller charts allows for a much better integration of text and charts, gives you greater flexibility about the placement of charts, reduces the likelihood of large areas of white space and means you are much better able to compare multiple charts as you can fit more on a single page/screen. 2

If the text in your chart gets in the way of reducing the data area to the size you want, reduce the number of labels on each axis and consider changing the title. Title text can be reduced or even removed if additional details are given in accompanying text or section headings.

[Charts opposite: the same data (€ values, Apr 2005 to Oct 2008) repeated at successively smaller sizes, each half the area of its predecessor]

2 For a very thorough outline and discussion about the use of even smaller charts see the Sparklines: theory and practice topic on www.edwardtufte.com
If you look at the charts on the financial pages of any non-tabloid you will find plenty of charts which are much smaller than ours and generally include more data (hence much higher data density). We may not aspire to repeat much of their work on statistics, but these are good examples of charts integrated with text and with a high data density.

[Charts opposite: a trend in sunspot numbers, 1701-2001. The decline from each peak is generally more gradual than the increase, but this is only apparent in the second chart, which takes up around a quarter of the area. Source: SIDC-team, World Data Center for the Sunspot Index, Royal Observatory of Belgium, Monthly Report on the International Sunspot Number, online catalogue of the sunspot index: http://www.sidc.be/sunspot-data/]

Dimensions
The default aspect should be landscape. It makes writing horizontal text more straightforward and the human eye is well used to perceiving small deviations from a line moving left to right (vertical variations from the horizon). Much of the time the default aspect ratio given by Excel to chart objects (1.8:1) is fine. The actual aspect of the plot area will be somewhat different depending on the title, axis labels etc. Some of the time a slight adjustment to the height/breadth is made to allow for a slightly longer title or to ensure all category labels fit in.

There are some other times where you might want to change the aspect ratio for more technical reasons:

- Variations in the gradient of very spiky trends are very hard to judge. A change in the aspect ratio to increase the breadth:height ratio helps, as people can better judge variations in slopes where they average 45° (see opposite). 3
- More than one chart with different date or value ranges. Where similar data is presented in a similar format on the same page the user will make comparisons whether that is your intention or not. You should try as much as possible to align the values/dates to ensure that a set distance means the same value or date range in each chart. If all dimensions are the same yet data ranges are different the reader may make incorrect inferences. In an ideal world the alignment should be exact (do this where the point is to make direct comparisons), but an approximate method is an improvement for indirect comparisons (opposite).
- Scatter plots. Square dimensions make sense where the angle of any line of best fit is important.

[Charts opposite: Proportion of pupils in large classes (>30), England, 1978 onwards, and Average size of UK maintained schools, 1950 onwards. The second chart is extended to make the dates broadly align, helping the reader identify common patterns or diverging trends]

3 This chart updates that first produced by William S. Cleveland in Visualizing Data (1993) as an illustration of the method of choosing aspect ratios where slopes bank to 45°.
- Expanded or composite charts. Where more than one chart is included in a display the focus should be on the aspect ratio of the individual parts; the final composite ratio will frequently be portrait (see opposite).
- Bar charts with lots of (sub)categories. There is no sense in keeping a landscape layout if it makes the chart unreadable (see below).

[Chart below: Change in Democrat % share of the vote 2004-08, percentage points, by group – a bar chart with many sub-categories shown in portrait layout]

[Chart opposite: Breakdown of religion by classified NS-SEC, England & Wales 2001 (% of each religious group), shown as a portrait composite of small charts by class group]
Chart types
Again there are no hard and fast rules that cover every instance. Some types of charts lend themselves towards certain types of data. The principles of economy, self-sufficiency and clarity should be followed when deciding between different types of chart. The first two should be self-evident in most cases, but clarity needs some more thought.

Clarity
Research has shown that people's ability to accurately perceive values, or differences in values, varies among the different ways that charts display values. They are, in order of effectiveness: 4

1. Position along a common scale
2. Position along a non-aligned scale
3. Length
4. Angle/slope
5. Area
6. Volume
7. Colour

[Charts opposite illustrate categories 1, 2 and 3: (1) identification/comparison of the value of any category within any one of the charts; (2) comparison of the value of one category in one chart with any other in a different chart; (3) identification/comparison of the second or third stacked series in the stacked chart]

Much of this should come as little surprise. It is easier to understand values expressed in one dimension than in two. Likewise it is easier to understand two dimensions than three. Using colour as the sole indicator of value is poor (though there is little choice for maps). Some of the other rankings are slightly more nuanced: it is easier to judge the value from a point on a scale (1) than to compare points on two identical scales on different charts (2). It is easier to judge values from a common baseline (1 or 2) than to compare different elements of a stacked chart (3), where only the first series has a common baseline (opposite).

4 William S. Cleveland and Robert McGill, Graphical Perception: The Visual Decoding of Quantitative Information on Graphical Displays of Data, Journal of the Royal Statistical Society, Series A (General), Vol. 150, No. 3 (1987), pp. 192-229.
While the conclusions drawn and the resulting charts suggested by the authors of this research are not universally accepted, there are some elements of their findings which can be used when deciding between 'standard' charts. The first step is to match the ranked list of elements (as on the previous page) to chart types. Some use more than one element, but their primary one is:

1. Line chart, simple bar/column chart, dot plot, scatter plot
2. Comparison of more than one of the above with a common scale
3. Stacked column/bar, area chart (1-3 opposite)
4. Pie chart
5. Pie chart, histogram, pictogram, bubble charts
6. Pictogram, 3-D chart (falsely)
7. None apart from maps

[Charts opposite: in a bar chart you are primarily judging position along a scale (effectively the position of the dot imposed here); length and area have secondary roles. In a stacked bar, for all but the first series you are judging length (dotted line) when making comparisons]

What does this tell us about broad chart types? First, simple charts are better. Second, pie charts are less clear than bar/column charts. Third, stacked charts (3) are less clear than a sequence of simple line/bar/column charts (see below opposite). Stacked charts become less clear with more stacked series. If only two series are included then the difference in clarity between this and other options is much smaller.

Clarity is not our only consideration; those of accuracy, economy and self-sufficiency still apply. Here self-sufficiency can be defined both as a lack of need for labels, values etc. to be printed on data series and as familiarity with the type of chart (it needs no explanation). Pie charts are very familiar to everyone. However, their lack of clarity is illustrated by the fact that nearly all have values and/or percentages listed next to each slice. People find it hard to accurately judge the absolute and relative size of each slice, so anything but the simplest pie needs labels, and where labels are not needed the two or three pieces of data the chart contains are probably better in the text. Therefore, while very common, the pie chart falls down on the principles of self-sufficiency and economy. It has other limitations (more on this in the next section) and therefore should only be used in exceptional circumstances.

[Charts opposite: Breakdown of religion by classified NS-SEC, England & Wales 2001 (% of each religious group) – a 'standard' stacked bar, then expanded and re-ordered to help make the underlying patterns clearer]

Chart types – pluses and minuses
The table below summarises the main advantages and disadvantages of the most common chart types.
Column
Positive: simple, clear, recognisable; works for categories and time series.
Negative: trends less clear for very long time series; small space for long category names; inflexible.

Bar
Positive: simple, clear, recognisable; works for categories including those with long names; good for a very large number of categories.
Negative: not appropriate for time series; less recognisable than a column chart.

Line
Positive: simple, clear, recognisable; works for time series and index charts.
Negative: data markers can be clunky; not appropriate for category charts; interpolation of gaps; stacked charts not clear.

Area
Positive: recognisable; works for time series and stacked charts.
Negative: not for category charts; less flexible and much more data ink than a line chart.

Pie
Positive: simple, recognisable.
Negative: difficulty in perceiving values, especially with more than a few slices; needs labels; inflexible; cannot be combined with other types; no time series charts; never looks right in Excel.

Scatter
Positive: the only choice for comparing two variables; correct interpretation of date values.
Negative: limited other uses.

This gives us a basic starting point only. The next few sections outline some more areas in greater detail.

[Charts below: Typical retail price of premium unleaded petrol and diesel, pence per litre, Jan 1994 to Jan 2008, shown as both a column chart and a line chart – which is clearer?]
Time series
Time series are normally illustrated with a column or line chart (an area chart for stacked lines). The main differences are the greater amount of data ink used for a column chart and the interpolation of a line chart. A line chart is really a series of dot markers linked by lines. The line itself gives a sense of flow and continuity and emphasises the shape of the trend. Lines are much better when more than one series is being illustrated, especially when there are large numbers of time periods (see the alternatives opposite). They are the only option for index charts.

There may be times where the underlying data lacks continuity and simply joining the dots gives the wrong impression. A column trend chart may be preferable in such cases, and where the chart looks at positive and negative variations with no long-term trend. However these are the exception and line charts should be the default option for time series charts.

An area chart should be preferable to a stacked bar as it conveys the sense of continuity. An expanded or composite chart is clearer than a stacked area chart and is mentioned in more detail below. When there are only a few time periods to illustrate, a line can look lost and meaningless while a column chart looks better (opposite). This is due to the greater amount of data ink in the column chart. Where so few pieces of data are included a chart may not be the best option.

[Charts opposite: the column chart seems to justify itself more, but the underlying data are the same]

Dual axis and combined charts

These are charts with two or more related series where one set of
values has a different unit of measurement or is much smaller than the
other. In such cases it needs to be plotted on a secondary value axis if
it is to be included in the same chart. To help emphasise this, the chart
type of one series is sometimes changed (say from a line to a column),
hence a combined chart – see opposite. A dual axis chart that has
combined chart types may be slightly harder to misunderstand,
but it is still potentially misleading.

[Charts opposite: national rail public spending (£ billion, left hand
scale) and passenger numbers (billion, right hand scale), 1986-87 to
2006-07, plotted twice with different dual-axis scales. In one, growth in
spending seems to far outstrip that in passenger numbers; in the other,
the growth in spending appears to barely close the gap with passenger
numbers.]

With a single value axis and more than one series, crossing points and
gaps are an essential part of the information in a chart – e.g. voting
intentions. Crossing points and gaps on a dual axis chart do not have
the same fundamental meaning as on a chart with a single axis and are
a function of the scales. The examples opposite plot the same data with
different scales and could potentially be interpreted in different ways.


This effect would be more severe were category axes to be cut. Adding
connected series to a chart will encourage the reader to think about
cause and effect. This may well be the idea, but dual axis charts can
give even a knowledgeable reader the wrong impression. These
charts should therefore not be used.

There are alternatives. An index chart of all the series will at least not
be affected by choice of scale and is a very good illustration of relative
changes. But choice of index year does affect gaps and crossing points
of series and the data has been stripped of its absolute values. Two or
more separate charts placed next to each other will avoid these
problems. These can be reduced in size to keep the same data density
as the original dual axis chart and are more flexible as they can
accommodate different date ranges more easily (see opposite).

[Charts opposite: national rail passengers (billion) and national rail
public spending (£ billion), 1986-87 to 2006-07, shown as two separate
charts. The alternative uses up a similar amount of space but is less
likely to be misunderstood.]

Composite or expanded charts

These have been mentioned in the sections above on aspect ratio and
clarity and dual-axis charts. Fundamentally they are a multi-chart
display with the intention that the reader makes direct comparison
between the (related) data. Each individual chart has two elements
(value and either category, time or another value) for an individual
category. Different charts have the same two elements for different
categories, which on a single chart could either be stacked or next to
each other.

Because each single chart has a single series there are no
problems with clarity of stacked series and no jumble of multiple
crossing lines or confusing side-by-side columns (below and
opposite and the religion chart on p.9). Normally the individual charts
will stack ‘upwards’ as the shared time or categories are along the x-
axis. They could stack upwards and left to right if two additional
categories are added.

[Charts opposite: a rather muddled column chart with two categories is
changed to an expanded chart, with one panel each for Countries A, B
and C, which gives a much better idea of the pattern by category within
each country and still allows comparison between countries.]

[Charts: index prices of selected fuel components of the RPI (coal and
smokeless fuel, electricity, gas, heating oils) relative to the all items
RPI, January 1987=100, shown first as a single four-line chart and then
as a composite chart with one panel per fuel. Separating out the
individual categories in this example helps the reader identify trends
that are simply not apparent in the original. Source: ONS series DOBW,
DOBY, DOBX, DOBZ and CHAW]
These composite charts all share two elements but are varied by a third
and sometimes fourth. Some of their advantages also apply to multiple
chart displays that do not all share two elements, but have different
value axes. The example at the bottom of page 15 showed an
improvement on a dual-axis chart. Here the charts could not have the
same value axis; similar vertical movements in a trend meant different
things. But their underlying data were broadly connected and they
shared dates. The result is not a true composite chart, but a related
multiple chart display that follows the principles outlined at the start and
is aimed at helping the reader better understand the evidence. The
example from page 8 (repeated opposite) only shares some of the
dates, but when placed on the same page the varying date ranges
would normally mean non-aligned dates for the related data. Here they
are broadly aligned (the second one is lengthened) and some related
trends are clear. The alternative would be to lose 20 years of data from
the second chart, which would mean ignoring evidence and reduced
data density.

[Charts opposite: proportion of pupils in large classes (>30) in England,
1978 to 2006, for primary and secondary schools; and average size of
UK maintained schools, 1950 to 2005, primary and secondary.]

Producing true composite charts in Excel is not straightforward, but it
becomes easier with practice. An alternative is the ‘broadly aligned’
approach where chart sizes are changed manually to align the scaling
of value and category axes.
Expanded or composite charts should be considered as alternatives to
stacked or side-by-side charts for both category and time series data.
There will still be occasions where stacked or side-by-side charts are
appropriate (generally with small numbers of series). The point of
highlighting an alternative approach is to illustrate their weaknesses
and to make people think about the message they are trying to portray
with a chart and the best way to achieve this.

Further online reading

Perceptual Edge
Juice Analytics
OECD factblog on data visualisation
Department of Mathematics and Statistics at York University (US)
Gallery of Data Visualization -The Best and Worst of Statistical
Graphics
Process trends Data Analysis and Visualization with Excel, R and
Google Tools
Presenting data by the Local Government Data Unit –Wales

Paul Bolton
January 2009

Statistical literacy guide
Confidence intervals and statistical significance
Last updated: June 2009
Author: Ross Young & Paul Bolton

This guide outlines the related concepts of confidence intervals and statistical significance
and gives examples of the standard way that both are calculated or tested.

In statistics it is important to measure how confident we can be in the results of a survey or


experiment. One way of measuring the degree of confidence in statistical results is to review
the confidence interval reported by researchers. Confidence intervals describe the range
within which a result for the whole population would occur for a specified proportion of times
a survey or test was repeated among a sample of the population. Confidence intervals are a
standard way of expressing the statistical accuracy of a survey-based estimate. If an
estimate has a high error level, the corresponding confidence interval will be wide, and we can
have less confidence that the survey results describe the situation among the whole
population.

It is common when quoting confidence intervals to refer to the 95% confidence interval
around a survey estimate or test result, although other confidence intervals can be reported
(e.g. 99%, 90%, even 80%). Where a 95% confidence interval is reported then we can be
reasonably confident that the range includes the ‘true’ value for the population as a whole.
Formally we would expect it to contain the ‘true’ value 95% of the time.

The calculation of a confidence interval is based on the characteristics of a normal


distribution. The chart below shows what a normal distribution of results is expected to look
like. The mean value of all cases sampled in the survey is shown with the symbol μ and
each standard deviation is shown with the symbol σ. Just over 95% of the distribution lies
within 2 standard deviations of the average (mean). Thus if it can be shown, or is assumed,
that the statistic is distributed normally then the confidence interval is the mean ±1.96
multiplied by the standard deviation.

So how do we work out what the standard deviation is? Standard deviation measures the
spread of the results and, technically, is calculated as the square root of the variance.

This is one of a series of statistical literacy guides put together by the Social & General Statistics section of the
Library. The rest of the series are available via the Library intranet pages.
Variance (σ2) is calculated as the average squared deviation of each result from the average
(mean) value of all results from our survey. The example below shows how we can work out
the standard deviation.

A survey of 10 people asked respondents their current salary. These were the results:

£16,500   £19,300   £25,400   £23,200   £35,100
£46,000   £29,000   £38,200   £65,700   £49,200

The mean (average) salary is £34,760, calculated by adding together all the salaries and
dividing the total by the number of cases. We calculate the variance (σ2) by squaring the
difference between each individual’s reported salary and the mean salary, totalling these
squares, and dividing the total by the number of cases in our survey.

σ2 = [(16,500 − 34,760)2 + (19,300 − 34,760)2 + …] ÷ 10 (i.e. number of cases)

That is, £2,130,944,000 divided by 10, giving a variance of £213,094,400.

The standard deviation is calculated as the square root of the variance, hence
√£213,094,400, or approximately £14,598.
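As a check, the same calculation can be sketched in a few lines of Python (not part of the original guide; the figures match the worked example above):

```python
import math

# Salaries from the worked example above
salaries = [16_500, 19_300, 25_400, 23_200, 35_100,
            46_000, 29_000, 38_200, 65_700, 49_200]

n = len(salaries)
mean = sum(salaries) / n                               # £34,760

# Variance: the average squared deviation from the mean
variance = sum((s - mean) ** 2 for s in salaries) / n  # £213,094,400

# Standard deviation: the square root of the variance
std_dev = math.sqrt(variance)                          # approx £14,598

print(f"mean £{mean:,.0f}, variance £{variance:,.0f}, sd £{std_dev:,.0f}")
```

This divides by the number of cases (a population variance, as in the guide's formula); dividing by n − 1 instead would give the sample variance often used when generalising from a sample.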

Where survey or test results are based on a larger sample size or results are less variable
then the confidence interval will be smaller, other things equal.

In everyday language we often use the term "significant" to mean important and this
normally involves some judgement relevant to the field of study (here substantive
significance). However, in statistical terminology "significant" means probably true and
probably not due to a chance occurrence (statistical significance). A finding may be
statistically significant without being important or substantively significant. For instance in
very large samples it is common to discover many statistically significant findings where the
sizes of the effects are so small that they are meaningless or trivial.

Significance testing is normally used to establish whether a set of statistical results are
likely to have occurred by chance. This may be to test whether the difference between two
averages (mortality for pill 1 v mortality for pill 2) is ‘real’ or whether there is a relationship
between two or more variables. In the latter case these relationships are often expressed in
terms of the dependent variable and one or more independent variables. The dependent
variable is the variable against which certain effects are measured. The independent
variables are those that are being tested to see what extent, if any, they have on the
dependent variable. For example, we may wish to test how media coverage of political
parties and electors’ past voting records determine an individual’s likelihood to vote for any
one political party. In this case, media coverage and past voting record are the independent

variables, and their effect would be measured in terms of the dependent variable, in this
case reported voting intention at the next general election.

The key to most significance testing is to establish the extent to which the null hypothesis
is believed to be true. The null hypothesis refers to any hypothesis to be nullified and
normally presumes chance results only –no difference in averages or no correlation between
variables. For example, if we undertook a study of the effects of consuming alcohol on the
ability to drive a car by asking a sample of people to perform basic driving skills while under
the influence of large quantities of alcohol, the null hypothesis would be that consuming
alcohol has no effect on an individual’s ability to drive.

In statistics, a result is said to be statistically significant if it is unlikely to have occurred by


chance. In such cases, the null hypothesis cannot be shown to be true. The most common
significance level used to show that a finding is good enough to be believed is 0.05 or 5%.
This means that there is a 5% chance of the observed data (or more extreme data) occurring
given that the null hypothesis is true. Where findings meet this criterion it is normally inferred
that the null hypothesis is false. While the 5% level is the standard level used across most
social sciences, the 1% level (p<0.01) is also fairly common.

When a null hypothesis is rejected, but it is actually true, a so-called type I error has
occurred. In most cases this means that there is no correlation between the variables, but
the test indicates that there is. Much null hypothesis testing is aimed at reducing the
possibility of a type I error by testing against a lower significance level (a smaller critical p
value). The aim of this is to reduce the possibility of falsely claiming some connection; a ‘false
positive’ finding.

A type II error occurs when a false null hypothesis is accepted/not rejected. In most cases
this will mean that results are not down to chance alone, there is a correlation between the
variables, but the test did not detect this and gives a ‘false negative’ finding. The power of a
test is one minus the type II error rate and is the probability of correctly rejecting a false null
hypothesis (a ‘true positive’ finding). A higher power raises the chances that a test will be
conclusive. It is not common for the type II error rate or power to be calculated in
significance testing. Convention, in many areas of social science especially, is that type II
errors are preferable to type I errors. There is a trade-off between type I and II errors as the
former can be reduced by setting a very low significance level (p<0.01 or p<0.001) but this
increases the likelihood that a false null hypothesis will not be rejected.

To revisit our example, a survey of 10 people asked respondents their current salary but also
their age, in order to investigate whether age (independent variable) has an effect on salary
(dependent variable). These were the results:

Age Salary
18 £16,500
25 £19,300
31 £25,400
33 £23,200
39 £35,100
41 £46,000
49 £29,000
52 £38,200
58 £65,700
63 £49,200

Here our null hypothesis is that age has no effect on salary (and the significance level is 5%).
Using a simple linear regression 1 (of the type Income = β multiplied by age plus α) we get a β
value of £880 – income increased on average by £880 for each additional year of age. The p
value for this statistic was 0.0025. Thus the result is statistically significant at the 5% level
and we reject the null hypothesis of no connection between age and salary. While the actual
value of β is not always reported, it helps the author start to establish importance or
substantive significance if they report it. Just as important is the confidence interval of the
estimate. The 95% confidence interval of β in this example is £410–£1,360; we expect that the
true value would fall in this range 95% of the time. This tells the reader both about the
size of the effect and illustrates the level of uncertainty of the estimate.
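The regression in this example can be reproduced with the standard least-squares formulas. A minimal Python sketch follows (not from the guide; the t critical value of 2.306 for 8 degrees of freedom is taken from standard tables):

```python
import math

# Age and salary data from the table above
ages = [18, 25, 31, 33, 39, 41, 49, 52, 58, 63]
salaries = [16_500, 19_300, 25_400, 23_200, 35_100,
            46_000, 29_000, 38_200, 65_700, 49_200]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(salaries) / n

# Least-squares slope: beta = S_xy / S_xx
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, salaries))
s_xx = sum((x - mean_x) ** 2 for x in ages)
beta = s_xy / s_xx                 # approx £883 per year of age
alpha = mean_y - beta * mean_x     # intercept

# Standard error of beta, from the residual variance with n-2 df
rss = sum((y - (alpha + beta * x)) ** 2 for x, y in zip(ages, salaries))
se_beta = math.sqrt(rss / (n - 2) / s_xx)

# 95% confidence interval: beta +/- t * se, with t(0.975, 8 df) = 2.306
lower, upper = beta - 2.306 * se_beta, beta + 2.306 * se_beta

print(f"beta £{beta:,.0f}, 95% CI £{lower:,.0f} to £{upper:,.0f}")
```

Rounding β and the interval endpoints to the nearest £10 gives the £880 and £410–£1,360 quoted in the text.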

There is a precise link between significance levels and confidence intervals. If the 95%
confidence interval includes the value assumed for the null hypothesis (here zero) then
p≥0.05 and the null hypothesis is not rejected at the 5% level. Similarly if the 99%
confidence interval included zero then the hypothesis would not be rejected at the 1% level.

The type of null hypothesis testing outlined in this note is that which most readers are likely
to find in the social sciences and medicine. The ritualised nature of some of this significance
testing, misinterpretation of results, non-reporting of the size of coefficients, focus on random
error at the expense of other sources of error, absence of alternative hypotheses and
ignorance of alternative types of significance testing have been criticised by some authors. 2
The most important criticism is the equivalence that some authors are said to see between
statistical significance and substantive significance – a sole focus on the presence of an
effect, not on its size or importance. Statistical significance is not sufficient for
substantive significance in the field in question. It may also not be necessary in certain
circumstances.

1 See the Basic outline of regression analysis guide for more background
2 See for instance Gerd Gigerenzer, Mindless Statistics, The Journal of Socio-Economics 33 (2004)
587–606. http://www.mpib-berlin.mpg.de/en/institut/dok/full/gg/mindless/mindless.pdf

Other statistical literacy guides in this series:
- What is a billion? and other units
- How to understand and calculate percentages
- Index numbers
- Rounding and significant places
- Measures of average and spread
- How to read charts
- How to spot spin and inappropriate use of statistics
- A basic outline of samples and sampling
- Confidence intervals and statistical significance
- A basic outline of regression analysis
- Uncertainty and risk
- How to adjust for inflation

Statistical literacy guide 1
How to spot spin and inappropriate use of statistics
Last updated: July 2010
Author: Paul Bolton

Statistics can be misused, spun or used inappropriately in many different ways. This is
not always done consciously or intentionally and the resulting facts or analysis are not
necessarily wrong. They may, however, present a partial or overly simplistic picture:
The fact is that, despite its mathematical base, statistics is as much an art as it is
a science. A great many manipulations and even distortions are possible within
the bounds of propriety.
(How to lie with statistics, Darrell Huff)

Detailed below are some common ways in which statistics are used inappropriately or
spun and some tips to help spot this. The tips are given in detail at the end of this note,
but the three essential questions to ask yourself when looking at statistics are:

Compared to what? Since when? Says who?

This guide deals mainly with how statistics are used, rather than how they were originally put together.
The casual reader may not have time to investigate every methodological aspect of data
collection, survey methods etc., but with the right approach they will be able to better
understand how data have been used or interpreted. The same approach may also help
the reader spot statistics that are misleading in themselves.

This guide is a brief introduction only. Some of the other guides in this series look at
related areas in more detail. There are a number of books that go into far more detail on
the subject such as How to lie with statistics by Darrell Huff, Damned lies and statistics
and Stat-Spotting. A Field Guide to Identifying Dubious Data, both by Joel Best and The
Tiger That Isn't: Seeing Through a World of Numbers by Michael Blastland and Andrew
Dilnot. The following websites contain material that readers may also find useful
• Channel 4 FactCheck
• Straight Statistics
• NHS Choices –behind the headlines
• UK Statistics Authority
• STATS – US research organisation with a mission to ‘improve the quality of scientific
and statistical information in public discourse and to act as a resource for journalists
and policy makers on scientific issues and controversies’

Common ways in which statistics are used inappropriately or spun

Lack of context. Context is vital in interpreting any statistic. If you are given a single
statistic on its own without any background or context then it will be impossible to say
anything about what it means other than the most trivial. Such a contextual vacuum can
be used by an author to help put their spin on a statistic in the ways set out in this guide.
Some important areas of context are detailed below:

This is one of a series of statistical literacy guides put together by the Social & General Statistics section of
the Library. The series is available on the Library Intranet or at: http://www.parliament.uk/topics/Statistics-
policyArchive.htm#SN
• Historical –how has the statistic varied in the past? Is the latest figure a departure
from the previous trend? Has the statistic tended to vary erratically over time? The
quantity in question may be at a record high or low level, but how long has data
been collected for?
• Geographical –is the statistic the same as or different from that seen in other
(comparable) areas?
• Population if the statistic is an absolute value –this is important even if the absolute
value is very large or very small. How does the value compare to the overall
population in question? What is the appropriate population/denominator to use to
calculate a rate or percentage? The actual choice depends on what you want to use
the rate/percentage to say. For instance, a road casualty rate based on the total
distance travelled on the roads (casualties per 1,000 passenger km) is more
meaningful than one based on the population of an area (casualties per 1,000
population). The rate is meant to look at the risk of travelling on the roads in different
areas or over time and the total distance travelled is a more accurate measure of
exposure to this risk than the population of an area.
• Absolute value if the statistic is a rate/percentage –what does this percentage
(change) mean in things we can actually observe such as people, money, crimes,
operations etc.? For instance, the statement “cases of the disease increased by
200% in a year” sounds dramatic, but this could be an increase in observed cases
from one in the first year to three in the second.
• Related statistics –does this statistic or statistics give us the complete picture or can
the subject be measured in a different way? (See also the section below on
selectivity) Are there related areas that also need to be considered?
• Definitions/assumptions –what are the assumptions made by the author in drawing
their conclusions or making their own calculations? Are there any important
definitions or limitations of this statistic?
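The road casualty point above can be illustrated with a small Python sketch. All figures here are invented for the illustration (neither the areas nor the numbers come from the guide); the point is only that the choice of denominator can reverse a comparison:

```python
# Hypothetical figures, invented for illustration only:
# Area A is small but carries heavy through traffic;
# Area B is populous with less distance travelled per head.
areas = {
    "Area A": {"casualties": 80,  "population": 100_000,
               "passenger_km": 2_000_000_000},
    "Area B": {"casualties": 300, "population": 1_000_000,
               "passenger_km": 3_000_000_000},
}

rates = {}
for name, d in areas.items():
    # Casualties per 1,000 resident population
    per_1000_pop = d["casualties"] / d["population"] * 1_000
    # Casualties per billion passenger km (a measure of exposure to risk)
    per_bn_km = d["casualties"] / d["passenger_km"] * 1_000_000_000
    rates[name] = (per_1000_pop, per_bn_km)
    print(f"{name}: {per_1000_pop:.2f} per 1,000 population, "
          f"{per_bn_km:.0f} per billion passenger km")
```

On these invented figures Area A looks more dangerous per head of population while Area B looks more dangerous per distance travelled, which is why the exposure-based rate is the more meaningful measure of risk on the roads.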

A related area is spurious comparisons that do not compare like for like. These are
easier to pass off if there is minimal context to the data. Examples include making
comparisons at different times of the year when there is a seasonal pattern, using
different time periods, comparing data for geographical areas of very different sizes, or
comparing where the statistic has a different meaning or definition. Comparisons over a very long
time period may look at a broadly similar statistic, but if many other factors have
changed a direct comparison is also likely to be spurious. For instance, comparing the
number of deaths from cancer now with those 100 years ago –a period when the
population has increased greatly, life expectancy has risen and deaths from some other
causes, especially infectious diseases, have fallen. These changes should be
acknowledged and a more relevant statistic chosen.

Selection/omission. Selecting only the statistics that make your point is one of the most
straightforward and effective ways in which statistics are spun. The author could be
selective in the indicators or rates they choose, their source of data, the time period used
for comparison or the countries, population groups, regions, businesses etc. used as
comparators. The general principle applied by authors who want to spin by selection is
that the argument/conclusion comes first and data is cherry picked to support and
‘explain’ this. Such an approach is entirely opposite to the ‘scientific method’ where
observations, data collections and analysis are used to explore the issue and come
before the hypothesis which is then tested and either validated or rejected.

Improvements in statistical analysis software and access to raw data (e.g. from
Government surveys) make the process of ‘data mining’ much easier. This is where a
researcher subjects the data to a very large number of different analyses using different
statistical tests, sub-groups of the data, outcome measures etc. Taken in isolation this

can produce useful ‘hidden’ findings from the data. But, put alongside selective reporting
of results, it increases the likelihood of one or more ‘positive’ findings that meet a
preconceived aim, while other results can be ignored.

The omission of some evidence can be accidental, particularly ‘negative’ cases – studies
with no clear findings, people who tried a diet which did not work, planes which did not
crash, ventures which did not succeed etc. – as the ‘positive’ cases are much more
attention grabbing – studies with clear results, people who lost weight on the latest diet,
successful enterprises etc. Ignoring what Nassim Nicholas Taleb calls ‘silent evidence’ 2
and concentrating on the anecdotal can lead people to see causes and patterns
where, if the full range of evidence was viewed, there are none.

It is highly unlikely that every piece of evidence on a particular subject can be included in
a single piece of work whether it be academic research or journalism. All authors will
have to select to some degree. The problem arises when selection results in a different
account from one based on a balanced choice of evidence.

Charts and other graphics. Inappropriate or inaccurate presentation is looked at in


detail in the charts guide. Charts can be used to hide or obscure trends in underlying
data while purporting to help the reader visualise patterns in complex information. A
common method is where chart axes are ‘adjusted’ in one form or another to magnify the
actual change or to change the time profile of a trend. Many charts in the print and visual
media are put together primarily from a graphic design perspective. They concentrate on
producing an attractive picture and simple message (something that will help their
product sell) which can be at the expense of statistical integrity. These aims are
compatible if there is input and consideration on both sides and there are examples of
good practice in the media. 3

Sample surveys are a productive source of spin and inappropriate use of statistics.
Samples that are very small, unrepresentative or biased, leading questions and selective
use by the commissioning organisation are some of the ways that this comes about. The
samples and sampling guide gives more background.

Confusion or misuse of statistical terms. Certain statistical terms or concepts have a


specific meaning that is different from that in common usage. A statistically significant
relationship between variables means that the observation is highly unlikely to have
been the result of chance (the likelihood it was due to chance will also be specified). In
common usage significant can mean important, major, large etc. If the two are mixed up,
by author or reader, then the wrong impression will be given or the meaning will be
ambiguous. If an author wants to apply spin, they may use the statistical term to give an
air of scientific impartiality to their own value judgement. Equally a researcher may
automatically assume that a statistically significant finding has important implications for
the relevant field, but this will not always be the case. The guide on statistical
significance gives further background.

A (statistically significant) correlation between two variables is a test of association. An


association does not necessarily mean causation, less still a particular direction of cause
and effect. The guide on Regression gives some advice on the factors to consider when
deciding whether an association is causal.

2
Nassim Nicholas Taleb, The Black Swan. The impact of the highly improbable. (2007)
3
See for instance some of the interactive and ‘static’ data graphics used by The New York Times. The
finance sections of most papers tend to contain fewer misleading or confusing charts or tables than the rest
of the paper.
Uncertainty is an important concept that can be lost, forgotten or ignored by authors.
Say, for instance, research implies that 60-80% of children who were brought up in one
particular social class will remain in the same class throughout their life. It is misleading
to quote either end of this range, even phrases such as “up to 80%” or “as few as 60%”
do not give the whole picture and could be the author’s selective use of statistics.
Quoting the whole range not only makes the statement more accurate it also
acknowledges the uncertainty of the estimate and gives a measure of its scale. As
statistician John W Tukey said:
"Be approximately right rather than exactly wrong."

Much social science especially deals with relatively small differences, large degrees of
uncertainty and nuanced conclusions. These are largely the result of complex human
behaviour, motivations and interactions which do not naturally lead to simple definitive
conclusions or rules. Despite this there is a large body of evidence which suggests that
people have a natural tendency to look for simple answers, see patterns or causes
where none exist and underestimate the importance of pure chance. The
uncertainty and risk guide looks at this more fully.

Ambiguous definitions are another area where language can impact on the
interpretation of statistical facts. Ambiguous or incorrect definitions can be used to make
or change a particular point. For instance migration statistics have terms for different
groups of migrants that use a precise definition, but the same terms are more
ambiguous in common usage. This can be used to alter the meaning of a statistic. For
instance, “200,000 economic migrants came to the UK from Eastern Europe last year”
has very different meaning to “200,000 workers came to the UK from Eastern Europe
last year”. Similarly mixing up terms such as asylum seeker with refugee, migrant,
economic migrant or illegal immigrant changes the meaning of the statistic. George
Orwell, writing just after the end of the Second World War, said of the misuse of the term
‘democracy’: 4
Words of this kind are often used in a consciously dishonest way. That is, the person who
uses them has his own private definition, but allows his hearer to think he means
something quite different.

Averages. The values of the mean and median will be noticeably different where the
distribution is uneven (such as for incomes, wealth or a number of statistics relating to
age). The term average generally refers to the mean. Ideally the type of average should
be specified when the mean and median are known or thought to be different. This
potential ambiguity can be used by authors to select the average that better makes their
case. The measures of average and spread guide gives more detail.

An author may also use an ambiguous, subjective or personal definition of average to
mean ‘typical’ – such as the typical family, country or school. Such subjectivity gives
them scope to be selective and makes the definition of ‘average’ much less clear.

Rounding can help the reader better understand the quantities involved by not getting
lost in unnecessary detail or over-precise numbers. However, using rounded numbers to
make further calculations can lead to incorrect answers, especially if all or most of the
rounding is in the same direction. Calculations should be made on unrounded data and
only the results rounded. Rounding can also be used intentionally to make a number
seem smaller or larger than it is. Both problems become greater with higher degrees of
rounding, or rounding to fewer ‘significant places’. For instance, 0.5 rounded to the
nearest whole number becomes 1; a 100% increase.

4
George Orwell, Politics and the English Language, Horizon April 1946.
This is one of a series of statistical literacy guides put together by the Social & General Statistics section of
the Library. The series is available on the Library Intranet or at: http://www.parliament.uk/topics/Statistics-
policyArchive.htm#SN
Similarly, through rounding it is possible to make it look like two plus two equals five:

2.3 + 2.3 = 4.6

But when each element is rounded to the nearest whole number the components
become 2 and 2, while the total (4.6) rounds to 5:

2 + 2 = 5

The guide on rounding and significant places gives more background.
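The arithmetic can be checked in a couple of lines of Python (illustrative only; Python's built-in `round` rounds to the nearest whole number):

```python
# Illustrative sketch: rounding the parts vs rounding the total.
a, b = 2.3, 2.3

parts_rounded = round(a) + round(b)   # each part rounds to 2, so 2 + 2 = 4
total_rounded = round(a + b)          # the true total, 4.6, rounds to 5

print(parts_rounded)  # 4
print(total_rounded)  # 5
```

The discrepancy is exactly the "two plus two equals five" trick: the displayed (rounded) components no longer sum to the rounded total.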

Reading too much into the data. Authors may draw conclusions that are not fully
supported by the data. Their implication is that their conclusion naturally follows from the
statistical evidence. Again, this could be due to spinning or a misunderstanding of the
limits or meaning of the statistics and the logical steps involved in their argument. For example:
“Places at UK medical schools have been cut by 10% so in a few years time there will be
a shortage of NHS doctors”. This statement uses the statistic to present the conclusion
as a fact. If one thinks logically about the statement it implies that the only factor that
affects the supply of NHS doctors is UK medical school places and that demand for
doctors will remain constant. In fact the supply of NHS doctors is affected by many more
factors including foreign-trained doctors coming to the UK, the age profile and retirement
rates of current doctors, the number leaving the profession (pre-retirement) or returning
to it, drop-out rates at medical school, medical students training in the UK and working
overseas, propensity to work in the private sector, changes in working hours, and so on.

It may be that the conclusion has some merit. But if other contributory factors are
ignored, the reader may not be aware of the imbalance between assumption and fact,
and if assumptions about those factors are made but not mentioned by the author then
the reader cannot make their own judgement about them. This situation, like many other
real-world situations, is complex. A (possibly unwitting) desire to view such factors
in simple black and white terms, with straightforward causes and effects, can lead to
inappropriate use of statistics.

A related inappropriate use of a statistic is to link data on the extent of a phenomenon to
a dramatic or shocking anecdote. While the anecdote is related to the phenomenon, it is
selected as an example of the most shocking or dramatic, and the most extreme cases are
normally the rarest. By linking the extent data to the extreme example the author could
be attempting to get the reader to imagine that all the cases of this phenomenon are as
extreme as the example given. In reality a large majority of the cases will be less
extreme and dramatic than the example given.

Many events and their related statistics vary around an average (mean), where the most
likely events are close to the average and their occurrence becomes less and less likely the
further they are from it. Their distribution is said to be ‘normal’ or bell-shaped, or to
approximate to these. When an observation or result is particularly high compared to the
mean we expect the next one to be lower (closer to the mean) and vice versa. This is
known as regression to the mean. It does not always happen, but the most likely
follow-up to an extreme outcome is a less extreme one simply because most outcomes
are less extreme. Ignoring this effect means reading too much into the data by viewing
random variation as real change. Observations about regression to the mean have in
part been adopted into common terms such as ‘beginner’s luck’, the ‘difficult second
album’ or the ‘sophomore slump’ 5 . The particular problem with not recognising this

5
An alternative statistical interpretation for all three is that we remember the extreme results, which are much
more ‘available’ – the band with the chart-topping debut album or the rookie footballer who sets a new
scoring record – and ignore the large majority of more average results – the moderate-selling debut albums,
phenomenon is that observed changes are viewed as being real and linked to a
particular intervention where one has occurred.

Some observers have criticised analysis of the impact of speed cameras 6 and anecdotal
evidence of the success of alternative therapies 7 because they do not take regression to
the mean into account. On its own this does not completely invalidate any findings; it just
means that like should be compared with like – accident black spots with and without
speed cameras, people with similar illnesses who do and do not receive some sort
of alternative therapy. Regression to the mean can be difficult to observe and
disentangle from real changes to the mean. The box below illustrates an example where
there has been no aggregate change in the mean, but variations show a strong
indication of regression to the mean.

Key stage 2 test results and regression to the mean

Primary school attainment tables measure, among other things, the proportion of pupils
reaching the expected level in English, maths and science. These proportions are added
for each school and multiplied by 100 to give an aggregate score (out of 300). Nationally
there was little change in the aggregate score between 2007 and 2009.

[Charts: average change in aggregate score in 2008, by 2007 performance band;
average change in aggregate score in 2009, by 2007 performance band]

The charts opposite break schools down by 2007 aggregate score into twenty equal
sized groups. The first chart looks at how the average performance changed for each
group in 2008. Clearly the poorest performing schools in 2007 increased their aggregate
score by most. This effect was smaller for each of the next ten groups. In all of the top
ten prior-performing groups average performance fell, and it fell most for the best
performing groups of schools. This strongly suggests regression to the mean: results
changed most where earlier results were most extreme. The next chart underlines this.
It looks at the same groups of schools (2007 performance bands) and compares
average change in results from 2008 to 2009. There is no clear pattern and average
changes are small. There is still regression to the mean at an individual school level, but
this is cancelled out for groups based on 2007 performance because these groups have
already returned towards the mean.

Analysis of change in 2009 by 2008 performance bands shows a very similar pattern to that of the first chart.
Regression to the mean occurs after extreme results and so is identifiable when results are organised that
way. There are real differences between the performance of schools, but there is also substantial year to
year variation in results which displays a clear element of random variation.
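The pattern in the box can be reproduced with a purely hypothetical simulation (made-up figures, not the actual Key Stage 2 data): each "school" has a fixed underlying score, and each year's observed result adds independent random noise. Grouping by first-year results then shows the extreme groups moving back towards the middle, even though nothing about the underlying scores has changed.

```python
# Hypothetical simulation (made-up figures, not the actual Key Stage 2
# data): each "school" has a fixed true score; each year's observed
# result adds independent random year-to-year noise.
import random

random.seed(42)
N = 10_000
true_scores = [random.gauss(250, 20) for _ in range(N)]

def observe(scores):
    # one year's observed results = true score + random noise
    return [s + random.gauss(0, 15) for s in scores]

year1 = observe(true_scores)
year2 = observe(true_scores)

# Rank schools by their year-1 result and compare the average change
# for the most extreme 5% at each end.
ranked = sorted(range(N), key=lambda i: year1[i])
bottom, top = ranked[:N // 20], ranked[-(N // 20):]

def avg_change(group):
    return sum(year2[i] - year1[i] for i in group) / len(group)

print(f"bottom 5% in year 1 changed by {avg_change(bottom):+.1f} on average")
print(f"top 5% in year 1 changed by {avg_change(top):+.1f} on average")
```

The bottom group's average change is clearly positive and the top group's clearly negative: regression to the mean produced by nothing more than random variation.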

Percentages and index numbers are a step away from their underlying numbers and
this can be used or forgotten by authors when interpreting them. Percentage changes in

bands who cannot get a deal or first-year players who only make the reserves. When the next ‘observation’
is made for the successful debutants – sales of the next album or performances in the following year – it is
much more likely that they will perform less well, or closer to average. So success is labelled beginner’s luck
and the second album is viewed as much more difficult.
6
Where a speed camera is placed at an accident black spot its location has effectively been selected as
more extreme. As we cannot directly observe the true long term mean for each potential site, one
interpretation is that cameras are located in the ‘unluckiest’ places and we would expect fewer accidents in
the next period anyway, as through regression to the mean they are likely to be less unlucky next time.
7
If someone visits a practitioner of alternative medicine when they are feeling unwell (an extreme event) and
feels better soon after (a more likely and less extreme event), the improvement may be credited to the treatment.
percentages or index values and the effect of compounding will not be understood by all
readers or authors. This can result in confusing or incorrect commentary by the author,
or be used by them to spin a point of view. In medicine, the relative risk reduction looks at
changes in the rate of mortality or morbidity from a particular treatment. This figure may
be quoted precisely because it is larger than the absolute risk reduction, which looks
at the change in the context of the entire population and is more meaningful. The
guide on uncertainty and risk gives more background.
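The gap between the two measures can be seen with made-up figures – a treatment assumed, purely for illustration, to cut mortality from 2 in 1,000 to 1 in 1,000:

```python
# Illustrative (made-up) figures: relative vs absolute risk reduction
# for a treatment that cuts mortality from 2 in 1,000 to 1 in 1,000.
deaths_without, deaths_with, population = 2, 1, 1000

risk_without = deaths_without / population
risk_with = deaths_with / population

rrr = (risk_without - risk_with) / risk_without   # relative risk reduction
arr = risk_without - risk_with                    # absolute risk reduction

print(f"relative risk reduction: {rrr:.0%}")   # "halves the risk" - sounds big
print(f"absolute risk reduction: {arr:.3%}")   # one death per 1,000 people
```

A headline of "risk halved" (50%) and one of "0.1 percentage points fewer deaths" describe exactly the same treatment.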

The choice of base year for percentage change figures will affect the size of the
numbers involved. For instance, “this year’s imports are 100% greater than last year”
and “last year’s imports were 50% less than those for this year” mean exactly the same
thing. The same process can be used in index numbers to affect the change in index
values. The guides on index numbers and percentages give more background.
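The imports example can be set out in a few lines of Python (illustrative figures):

```python
# The same pair of figures expressed two ways, depending on the base.
last_year, this_year = 100, 200   # illustrative import values

rise = (this_year - last_year) * 100 / last_year   # base: last year
fall = (last_year - this_year) * 100 / this_year   # base: this year

print(f"This year's imports are {rise:.0f}% greater than last year's")
print(f"Last year's imports were {-fall:.0f}% less than this year's")
```

Both statements are true of the same data; only the base differs, and the author can pick whichever number (100% or 50%) suits the argument.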

Statistics on money are particularly prone to spinning or misuse. Underlying changes
over time can be used or ignored selectively to give the author a more persuasive figure.
Any comparison of amounts in different time periods should be converted to a common
price base, or real prices. Without this, any difference will be the result of underlying
inflation as well as the ‘real’ difference. The guide How to adjust for inflation gives some
practical advice on compiling a real terms price series and highlights some of the
potential misunderstandings around falling levels of inflation and falling prices. Confusion
between the two was common in late 2008 and early 2009, when inflation was falling but
still positive and there was a general expectation that inflation would soon turn negative
and hence prices would actually fall. It is important to remember that prices are falling if
and only if inflation is negative.

Converting a financial time series to real prices may not give a complete picture on its
own. If it is funding for a specific service where the underlying demand is expected to
change, then a rate based on an indicator of that demand will give a more complete
picture. For instance: “Funding is set to increase by 40% in real terms by 2010; this
means expenditure can rise from £100 per head to £140”.

Where the money covers a period of more than one year then a lack of clarity about the
actual values for individual years and the start and end point used to calculate headline
changes can leave room for ambiguity or spinning. For instance the phrase “there will be
a cash increase of £6 billion over the period 2005 to 2008”, could mean the 2008 level
will be £6 billion more than in 2005. It could also mean that the sum of annual increases
compared to 2005 is £6 billion (£1 billion in year 1, £2 billion in year 2 and £3 billion in
year 3). In this latter case the 2008 figure is £3 billion less.
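The two readings can be set out with illustrative figures (a £100 billion baseline for 2005 is assumed purely for the arithmetic):

```python
# Two readings of "a cash increase of £6 billion over 2005 to 2008",
# with an illustrative 2005 baseline of £100 billion.
baseline = 100

# Reading 1: the 2008 level is £6bn above the 2005 level.
level_2008_reading1 = baseline + 6

# Reading 2: the annual increases over the 2005 level sum to £6bn
# (+1bn, +2bn, +3bn in successive years), so 2008 is only £3bn above.
increases = [1, 2, 3]
assert sum(increases) == 6
level_2008_reading2 = baseline + increases[-1]

print(level_2008_reading1 - level_2008_reading2)  # 3: reading 2 ends £3bn lower
```

Both are defensible readings of the same headline sentence, which is exactly why the individual-year figures should be spelled out.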

Tips on how to spot spinning or misuse of statistics

The list below gives a variety of ways to look at statistics to help identify spin or
inappropriate use, or cases where a doubt remains that can only be answered by looking
at the underlying data.

General questions to ask yourself


• What product or point of view is the author trying to ‘sell’?
• Are any statistics or background obviously missing?
• Do the author’s conclusions logically follow from the statistics?
• Are comparisons made like-for-like?
• If there is any doubt about the original source of the statistics – who created them,
and how, why and when were they created?

If a really simplified version is needed then try:

Compared to what? Since when? Says who?

More specific points to look out for


• Statistics without any context, background or comparisons
• Totals without rates or without any comparators
• Percentages without any absolute values
• A case that is made without any consideration of contrary or inconclusive evidence
• An overly simplistic view about cause and effect
• Very large or very small numbers where the author assumes importance or lack of it
solely on this basis
• Records or hyperbole without any further context
• The term ‘significant’ – assume it is the author’s interpretation of what constitutes
large/important unless it says statistically significant
• Ambiguous phrases such as ‘could be’, ‘as high as’, ‘at least’, ‘includes’, ‘much
more’ etc.
• Unspecified averages (mean or median) where you expect them to be different.
• Use of average for ‘typical’, the definition of which is known only to the author.
• Lack of details of surveys (sample size, source, actual questions asked etc.)
• Cut-down, uneven or missing chart axes
• Percentage changes in percentages, rates or index numbers
• Statistics on money that compare amounts in different time periods without using
real prices
• Statistics on money that do not spell out the time periods in question
• Over precision –intended to lend an air of authority
• Statistics that seem wildly unlikely or results that look too good to be true
• Data on things people may want kept secret – the number of illegal immigrants,
drug use, sexual relationships, extreme views etc.
• Where has the data/analysis been published? For anything even vaguely scientific,
was it or the primary research published in a reputable peer-reviewed journal? This
does not make the work infallible, just less likely to be spun or contain inappropriate
use of data.
• Unsourced statistics

Statistical benchmarks
The author Joel Best 8 has suggested using statistical benchmarks to give the reader
some mental context when looking at other statistics. This can help to identify statistics
that seem wildly unlikely and those that appear to be questionable and where some
further investigation may highlight their actual limitations. Some of the latest (rounded)
benchmarks for the UK are given below:

Population: 61 million
Of whom
School age (5-15) 7.8 million
Working age 38 million
Pensioners 11.8 million
Minority ethnic group 4.6 million
(2001)

Live births: 775,000 per year


Deaths: 570,000 per year
Of which
Heart disease/stroke 190,000
Cancer 160,000
Respiratory illness 80,000
Accidents/suicide/ 20,000
homicide

GDP: £1,400 billion

Other statistical literacy guides in this series:


- What is a billion? and other units
- How to understand and calculate percentages
- Index numbers
- Rounding and significant places
- Measures of average and spread
- How to read charts
- How to spot spin and inappropriate use of statistics
- A basic outline of samples and sampling
- Confidence intervals and statistical significance
- A basic outline of regression analysis
- Uncertainty and risk
- How to adjust for inflation

8
Joel Best, Stat-Spotting. A Field Guide to Identifying Dubious Data (2008)
Statistical literacy guide
Index numbers
Last updated: September 2007
Author: Paul Bolton

What are index numbers?


Index numbers are a way of presenting proportionate changes in a common format.
Their most common use is looking at changes over time, but they are not restricted to
this and can be used to compare differences between local areas, groups of the
population etc.

Indices can be constructed from ‘simple’ data series that can be directly measured,
but are most useful for measuring changes in quantities that cannot be directly
measured, normally because they are composites of several measures. For
instance, the Retail Prices Index (RPI) is the weighted average of proportionate
changes in the prices of a wide range of goods and services. The FTSE 100 index is
the weighted average of changes in share prices of the 100 largest companies listed
on the London Stock Exchange. Neither of these measures could be expressed
directly in a meaningful way.

Index numbers have no units. Because they are a measure of change they are not
actual values themselves and a single index number on its own is meaningless. With
two or more they allow proportionate comparisons, but nothing more.

How are they calculated?


The methods used to construct composite indices such as the RPI are complex and
involve decisions about coverage, how and when the base year is changed
(re-basing), the relative weighting of each constituent measure, the frequency of
changes to weightings and how these are averaged. In such cases the base value
may also reflect the relative weights of its component parts. However, it is helpful to
look at the construction of an index from a ‘simple’ data series as it helps to better
understand all index numbers.

Indices that look at changes over time need a base year; those that do not normally
use an average figure for their base. If the raw data is a simple series of values then
the base year/figure is normally set to 100 and every value in the series multiplied by
the same number (100 divided by the base value) to get the index numbers. Hence a
series 50, 60, 80 becomes 100, 120 (60 x 100/50), 160 (80 x 100/50) if the first figure
is set as the base value.
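The rebasing arithmetic can be sketched as a small helper (an illustrative sketch, not an official method; the function name is our own):

```python
def to_index(series):
    """Rebase a series so its first value equals 100."""
    base = series[0]
    # every value is multiplied by the same number: 100 / base
    return [value * 100 / base for value in series]

print(to_index([50, 60, 80]))  # [100.0, 120.0, 160.0]
```

The resulting numbers have no units; they only carry the proportionate changes relative to the base value.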

What do they mean?


100 is normally chosen as the base year/figure as it makes for the simplest
comparisons. Just take 100 away from the figure you are interested in and you are
left with the percentage change. The figures below could be a time series or a
non-time series index. Value B is the base value and equals 100.
A B C
77 100 123

If this were a time series then it is very clear that, compared to B, the value was 23%
less in period A (77-100 = -23) and 23% more in period C (123-100 = 23). Similarly, if
this series looked at geographical variations with B as the national average then it is
also clear that region A had a value 23% below average and region C 23% above.

This is one of a series of statistical literacy guides put together by the Social & General Statistics section
of the Library. The rest of the series are available via the Library intranet pages.
The choice of base year is very important as it affects those comparisons that can
most easily be made and hence the usefulness of any particular index.

The next example shows how index numbers can be used to compare series. Here
1990 is set at the base year and values are given for three series A, B and C. They
might be presented in index form because their actual values are not easily
comparable (eg GDP for a range of large and small economies) or their units are
different (eg carbon dioxide emissions, energy use and GDP).
1990 1995 2000
A 100 105 110
B 100 102 104
C 100 110 120
Here the relative trends are clear – C increased by the greatest proportionate amount,
followed by A then B. Again we cannot say anything about their absolute values.

Why are they useful?


Index numbers are essential for measuring changes in quantities that cannot be
directly measured. The examples above also illustrate that producing indices for
‘simple’ data can be useful because of their simplicity, their focus on change and their
ability to illustrate variations in series that would otherwise not be readily
comparable.

The charts below illustrate this: they look at trends in passenger transport by car,
rail and air. The first chart tells us that transport by car was by far the most popular,
but because it is so far ahead it is difficult to discern trends in rail or air travel. The
second chart presents the same data in index form and shows very clearly the
relative change of each mode over this period.

[Charts: billion passenger kilometres by car, rail and air, 1970 to 2003; the same
series as indices (1970 = 100)]

Because index numbers are based on changes they can often be more up to date
than series with monetary values. For instance, the average earnings index is updated
monthly while the Annual Survey of Hours and Earnings is, as its name suggests, only
produced once a year. Similarly, official statistics on energy prices and bills in
monetary units are updated once a year, while monthly index values for energy
prices are published as part of the RPI.

Potential problems

Problems can arise with interpreting index values if they are treated as actual
numbers. The guide on percentages contains more details. One problem specific
to index numbers is the choice of base year. This can affect how useful the series is
and/or how it is interpreted. The reasons why 100 is most often used as a base value
have been explained already, so it makes sense to make the most frequently used
comparator the base figure. The chart opposite is based on the same data as the
earlier chart, but with the final year set to 100.

[Chart: passenger kilometres by car, rail and air as indices (2004 = 100)]

While it illustrates fundamentally the same numerical data, the patterns look
different. It will generally be less useful because the base year is set at the end of the
series: backward-facing comparisons (1990 was x% less than 2004) are most easily
made, while it is more intuitive to think of changes from an earlier to a later point
(chronologically).

Other statistical literacy guides in this series:


- What is a billion? and other units
- How to understand and calculate percentages
- Index numbers
- Rounding and significant places
- Measures of average and spread
- How to read charts
- How to spot spin and inappropriate use of statistics
- A basic outline of samples and sampling
- Confidence intervals and statistical significance
- A basic outline of regression analysis
- Uncertainty and risk
- How to adjust for inflation

Statistical literacy guide
How to understand and calculate percentages
Last updated: January 2010
Authors: Lorna Booth (x4313) / Paul Bolton (x5919)

What are percentages?


Percentages are a way of expressing what one number is as a proportion of another
– for example 200 is 20% of 1,000.

Why are they useful?


Percentages are useful because they allow us to compare groups of different sizes.
For example if we want to know how smoking varies between countries, we use
percentages – we could compare Belgium, where 20% of all adults smoke, with
Greece, where 40% of all adults smoke. This is far more useful than a comparison
between the total number of people in Belgium and Greece who smoke.

What’s the idea behind them?


Percentages are essentially a way of writing a fraction with 100 on the bottom. For
example:
– 20% is the same as 20/100
– 30% is the same as 30/100
– 110% is the same as 110/100

Using percentages in calculations – some examples


• The basics – what is 40% of 50?
Example 1: To calculate what 40% of 50 is, first write 40% as a fraction – 40/100 –
and then multiply this by 50:

40% of 50 = (40/100) x 50 = 0.4 x 50 = 20

Example 2: To calculate what 5% of 1,500 is, write 5% as a fraction – 5/100 – and


then multiply this by 1,500:

5% of 1,500 = (5/100) x 1,500 = 0.05 x 1,500 = 75

In general, to calculate what a% of b is, first write a% as a fraction – a/100 – and


then multiply by b:

a% of b = (a/100) x b

• Increases and decreases – what is a 40% increase on 30?


Example 1: To calculate what a 40% increase is on 30, we first use the method
shown above to calculate the size of the increase – this is 40% of 30 = (40 / 100) x
30 = 0.4 x 30 = 12. As we are trying to work out what the final number is after this
increase, we then add the size of the increase to the original number, 30 + 12 = 42,
to find the answer.

Example 2: To calculate what a 20% decrease is from 200, we first calculate 20% of
200 = (20 / 100) x 200 = 0.2 x 200 = 40. As we are trying to work out what the final
number is after this decrease, we then subtract this from the original number, 200 –
40 = 160, to find the answer.

In general, to calculate what a c% increase on d is, we first calculate c% of


d = (c / 100) x d. We then add this to our original number, to give d + (c / 100) x d.

If we wanted to calculate what a c% decrease from d is, we would again calculate


c% of d = (c / 100) x d. We then subtract this from our original number, to give
d - (c / 100) x d.
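The general formulas above can be written as small Python functions (an illustrative sketch; the function names are our own):

```python
def percent_of(a, b):
    """a% of b, i.e. (a/100) x b."""
    return a * b / 100

def increase(d, c):
    """d increased by c%: d + (c/100) x d."""
    return d + percent_of(c, d)

def decrease(d, c):
    """d decreased by c%: d - (c/100) x d."""
    return d - percent_of(c, d)

print(percent_of(40, 50))   # 20.0
print(increase(30, 40))     # 42.0
print(decrease(200, 20))    # 160.0
```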

How do I work out a percentage?


• The basics – what is 5 as a percentage of 20?
Example 1: To calculate what 5 is as a percentage of 20, we divide 5 by 20, and then
multiply by 100, to give (5 / 20) x 100 = 25. So 5 is 25% of 20. 1

Example 2: To calculate what 3 is as a percentage of 9, we divide 3 by 9, and then


multiply by 100, to give (3 / 9) x 100 = 33.3. So 3 is 33.3% of 9.

In general, to calculate what e is as a percentage of f, we first divide e by f and


then multiply by 100 to give (e / f) x 100. So e is ( (e / f) x 100 ) % of f.

• Percentage increases and decreases – what is the percentage increase


from 10 to 15?

To calculate the percentage increase from 10 to 15, we first work out the difference
between the two figures, 15 - 10 = 5. We then work out what this difference, 5, is as
a percentage of the figure we started with (in this case 10):

(5 / 10) x 100 = 0.5 x 100 = 50

This gives us the answer – there is a 50% increase from 10 to 15.

To calculate the percentage decrease from 50 to 40, we first work out the difference
between the two figures, 50 - 40 = 10. We then work out what this difference, 10, is
as a percentage of the figure we started with (in this case 50):

(10 / 50) x 100 = 0.2 x 100 = 20

This gives us the answer – there is a 20% decrease from 50 to 40.

In general, to work out the percentage increase from g to h, we first work out the
difference between the two figures, h - g. We then work out what this difference,
h - g, is as a percentage of the original figure (in this case g):
( ( h - g ) / g ) x 100 %

1
We can check our calculation by working out what 25% of 20 is:
25% of 20 = (25/100) x 20 = 0.25 x 20 = 5

This is one of a series of statistical literacy guides put together by the Social & General Statistics section
of the Library. The rest of the series are available via the Library intranet pages.
To work out the percentage decrease from g to h, we first work out the difference
between the two figures, g - h. We then work out what this difference is as a
percentage of the original figure (in this case g):
( ( g - h ) / g ) x 100 %
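Similarly, the "what percentage" formulas can be sketched in Python (illustrative helper functions, named by us):

```python
def as_percent_of(e, f):
    """e as a percentage of f: (e/f) x 100."""
    return e * 100 / f

def percent_change(g, h):
    """Percentage change from g to h (negative means a decrease)."""
    return (h - g) * 100 / g

print(as_percent_of(5, 20))    # 25.0
print(percent_change(10, 15))  # 50.0
print(percent_change(50, 40))  # -20.0
```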

What is the % button on a calculator?


Calculators have a shortcut “%” key. Use this, for example:
- to work out 40% of 50, by pressing 50 x 40 “%” to get 20
- to work out 5% of 1,500, by pressing 1500 x 5 “%” to get 75.

What’s the % button on a spreadsheet?


A spreadsheet “%” button allows you to format fractions as percentages, multiplying by
100 and adding a “%” sign to the result. For example, if you had selected a cell containing
the number 0.25 and pressed the % button, it would then appear as 25%.

What are the potential problems with percentages?


If percentages are treated as actual numbers, results can be misleading. When you
work with percentages you multiply. Therefore you cannot simply add or subtract
percentage changes. The difference between 3% and 2% is not 1%. In fact 3% is
50% greater, 2 but percentage changes in percentages can be confusing and take us
away from the underlying data. To avoid the confusion we say 3% is one percentage
point greater than 2%.

Similarly when two or more percentage changes follow each other they cannot
be summed, as the original number changes at each stage. A 100% increase
followed by another 100% increase is a 300% increase overall. 3 A 50% fall followed
by a 50% increase is a 25% fall overall. 4
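The compounding point can be checked directly (an illustrative Python sketch):

```python
def apply_change(value, percent):
    """Apply a percentage change: +100 doubles a value, -50 halves it."""
    return value * (1 + percent / 100)

def overall_change(start, end):
    """Overall percentage change from start to end."""
    return (end - start) * 100 / start

# A 100% increase followed by another 100% increase: 10 -> 20 -> 40
doubled_twice = apply_change(apply_change(10, 100), 100)
print(overall_change(10, doubled_twice))   # 300.0, not 200.0

# A 50% fall followed by a 50% rise: 8 -> 4 -> 6
down_then_up = apply_change(apply_change(8, -50), 50)
print(overall_change(8, down_then_up))     # -25.0, not 0.0
```

Successive changes multiply because the base changes at each stage, which is why they cannot simply be added.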

2
We can see this in an example – 2% of 1,000 is 20, and 3% of 1,000 is 30. The percentage
increase from 20 to 30 is 50%.
3
We can see this in another example – a 100% increase on 10 gives 10+10 = 20. Another
100% increase gives 20+20=40. From 10 to 40 is a 300% increase.
4
Again we can see this in an example – a 50% decrease on 8 gives 8 - 4 = 4. A 50%
increase then gives 4 + 2 = 6. From 8 to 6 is a 25% decrease.

Statistical literacy guide
How to read charts
Last updated: September 2007
Author: Paul Bolton

A well designed chart clearly illustrates the patterns in its underlying data and is
straightforward to read. A simple chart, say one with a single variable over time,
should make basic patterns clear. A more complex one should perform
fundamentally the same function, even if it has three or more variables, uses a
variety of colours/variables or is made up of multiple elements. In simple time series
charts these patterns include the direction of change, variability of trends, size of
change and approximate values at any one time. In other types of chart these
patterns might include the contribution of different parts to the whole, the scale of
differences between groups or areas, relative position of different elements etc. Such
patterns should be clear to anyone with at least basic numeracy skills. The
understanding is based on the broad impression that such charts give as well as
conscious analysis of individual elements of the chart. 1

This guide is not, therefore, aimed at such charts. Instead, listed below are some
common examples of charts where, either through design/type of chart, or the type of
data chosen, basic patterns are not always clear to all readers. Some suggestions
are given to help the reader better understand the data in these cases. These are
summarised at the end of this note.

Shortened value axis


A feature of many charts, especially in the press and on television, is a shortened
value axis. This is where instead of starting at zero the axis starts at a greater
value. The result is that the variations shown in the chart are magnified. So, for
instance, a chart of recent FTSE 100 values appears to show a series of dramatic
crashes and upswings, whereas in fact the changes shown are only a few
percentage points from peak to trough.

The charts below illustrate the same process. The one on the left has the full value
(vertical) axis; the one on the right is shortened. It is clear the second gives much
greater emphasis to a series that varies from its first value by less than 8%.

[Two charts: Estimated carbon dioxide emissions since 1990, million tonnes of carbon
equivalent, 1990 to 2004; the left chart's value axis runs from 0 to 180, the right chart's
shortened axis from 140 to 165]

1. See the Chart Format Guide for more information on what makes a good chart.

The first step to better understand the data is to spot a shortened value axis. In some
cases it will be indicated by a zig-zag symbol at the foot of the axis. In others the only
way is to carefully read the axis values. The next step is to work out the relative
importance of changes shown. This may involve concentrating on the actual changes
and/or calculating approximate percentage changes. In the second chart above this
could be done by reading off the values and calculating the decline up to 1995 was
less than 10% of the 1990 level and there has been little change since.
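The check described above – reading values off the axis and working out the relative change – is simple arithmetic. A sketch using approximate values assumed from the emissions chart:

```python
# Approximate values read off the shortened chart (assumed for illustration)
value_1990 = 165   # million tonnes of carbon equivalent
value_1995 = 150

decline = (value_1990 - value_1995) / value_1990 * 100
print(f"Decline to 1995: {decline:.1f}%")   # roughly 9%, i.e. less than 10%
assert decline < 10
```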

Charts that give changes only


A number of different charts give rise to broadly similar issues. Charts that only look
at percentage changes give a partial picture of trends and overemphasise
proportionate changes, as with a shortened value axis. In addition, they separate
trends into their shortest time periods and thus concentrate reading at the elementary
level (one point to the next, rather than points further apart). This may be useful if the aim
of the chart is to do with trends in the rate of change, but as percentage changes are
compounded it can be very difficult to work out the overall change over the period
shown, or even whether the overall change was positive or negative (if there are
increases and decreases). The underlying data would be needed in most cases to
draw any conclusions about trends in absolute values.

Index charts
Charts that look at index values are similar, although more can be concluded about
overall changes. The chart title and/or axis should identify that the series is an index
rather than actual values. These charts may compare one or more indices over time
and concentrate on relative values. If the reader looks at values for individual years
they show how the series compares to its base value, not its actual value at that time.
If two or more series are used it is important not to conclude anything about the
values of different series from the index chart alone. If the series A line is above the
series B line it means that A has increased more than B since the base year, not
(necessarily) that A is greater than B. The guide in this series on index numbers
gives more background.
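The warning about comparing index lines can be illustrated with a short sketch (plain Python, hypothetical series):

```python
def to_index(series, base=100):
    """Rebase a series so that its first value equals the base."""
    first = series[0]
    return [v * base / first for v in series]

# Hypothetical series: B is always larger than A in absolute terms...
a = [10, 12, 15]
b = [100, 105, 110]

index_a = to_index(a)
index_b = to_index(b)

# ...yet A's index line ends above B's, because A grew faster since the base year
assert index_a[-1] > index_b[-1] and a[-1] < b[-1]
```

An index line for A sitting above the line for B only tells us A has grown more since the base year, not that A is larger.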

Compressed category axis


Time series charts that present data from irregular dates as if they were regular
effectively compress the category (horizontal) axis. The charts below illustrate the
impact this can have on the pattern shown by the chart. The example on the left
condenses the categories and appears to show a levelling off in recent years. The
chart on the right shows how it would look if it were presented with appropriate gaps
between years –a fairly constant increase in numbers over time. It also shows the
‘holes’ in the data. Again this needs to be identified by carefully reading both axes.
It will not always be possible to picture the ‘true’ pattern after this has been identified;
especially if a number of gaps have been ‘condensed’ (as below). But identifying this
does indicate that the pattern shown may not be the whole picture. In such cases it
may be necessary to get the underlying data where possible, or estimate its
approximate values where not.

[Two charts: Letters received, 1985 to 2004, with a value axis from 0 to 10,000; the left
chart compresses the category axis (the years 1985, 1992, 1996 and 2000 to 2004 plotted
as if evenly spaced), the right spaces the years 1985 to 2003 correctly, showing the gaps
in the data]
3-D charts
The chart opposite illustrates one of the main problems with 3-D charts: it can be
difficult to tell what the underlying values are. The point of a chart is to illustrate
patterns rather than to give precise values, but in this example it is difficult to judge
within 10% of the actual values. Moreover, the 3-D effects distract from the
underlying patterns of the data. There is no easy way around this problem, but if the
chart has gridlines these can be used by the reader to give a better indication of
values.

[3-D column chart of four categories, A to D, with a value axis from 0 to 1,000]

Multiple pie charts


Pie charts are commonly used to look at the contribution of different elements to the
whole. They are also used to compare these shares for different groups, areas or
time periods. The example opposite compares two different time periods. Here the
relative sizes of two areas (health and other) were smaller in 2005 than in 2004,
despite the fact that both their absolute values increased. Their share fell because
the size of education increased by a greater proportion. Pie charts are better at
focussing on proportionate contributions from a small number of elements and may
give the wrong impression in a case such as this.

[Two pie charts: Spending, by type, 2004 and 2005, split between education, health
and other; segment values shown include £28, £32, £34, £52, £60 and £65 billion]

In some charts the size of the second pie may be changed in proportion to the
difference in the total. Even in these cases the pattern of the underlying data may not
be clear as people are generally less good at perceiving differences in area than in
alternatives such as length or position on a line.

Again the key to better understanding the data is to identify this problem by reading
the values. If only percentage figures are given, a total may be added to the chart; if
not, there is no way of spotting this and the chart is misleading. If you know the
values or total then some assessment of relative changes can be made, but this is
not always straightforward.

Charts with two value axes


Charts with more than one series let the reader compare and contrast changes. Two
or more separate charts can make the series difficult to compare, but it is not always
possible to present these data on a single axis, for instance when one is in absolute
values and the other is a percentage figure. The example opposite shows such a
case. The absolute data are plotted on the right hand axis, the percentage figures on
the left hand one. The key to interpreting such charts is first to identify which series is
plotted on which axis (legend keys and clearly distinguishable series help in this
case) and second to look at relative patterns rather than absolute values. In this
example it is of no importance that in all but one year the line was above the column
values. It would be very easy to change the axis scales to reverse this. The main
message in this chart is that the percentage series is more volatile and since 1998
has fallen dramatically, while the absolute series has fallen only marginally.

[Chart: Class sizes in Primary Schools in England, 1979 to 2001; % of pupils in large
classes on the left hand scale (0% to 40%), average class size on the right hand
scale (0 to 30)]

Scatter plots
Scatter plots are normally used to look at the relationship between two variables to
help identify whether and how they are associated. They can also help identify
‘outliers’ from a general pattern. A visual display like this can only give a partial
picture of the relationship, unless the association is very strong (all the dots form a
line) or very weak (dots distributed evenly). In practice virtually all scatter plots will
be somewhere in between and a purely visual interpretation of the data can be like
reading tea leaves.

[Scatter plot of recycling rates (0% to 45%) and waste per capita (0kg to 700kg),
2004-05]

The example above might appear to show a negative association (one indicator
increases when the other decreases) to some readers, but little or no association to
others. The only way to tell definitively is to read any accompanying text to see
whether the relevant regression statistics have been calculated. These should
include the slope of the line of best fit, whether the value of this is significantly
different from zero and the r-squared value. These tell the reader how changes in
one variable affect the other and the strength of any association. In the example
above a 10kg increase in waste per capita was associated with recycling rates that
were 0.2% higher. While the slope of the line of best fit was significantly less than
zero this simple model does not explain much of the pattern as only 7% of the
variation in one was explained by variation in the other. The guides in this series on
regression and confidence intervals and statistical significance give more
background.

Logarithmic scales
Occasionally charts appear that have a value axis that seems to miss values – instead
of, say, 10, 20, 30 etc. they progress 1, 10, 100, 1,000 etc. These are known as
logarithmic scales. These can be useful in interpreting proportionate changes in
series that show long-term growth or decline and are also used in analysis of various
natural phenomena. In such charts a conventional or linear axis would
overemphasise the proportional impact of changes in an increasing series and
underemphasise those in a decreasing series.

The two charts below illustrate the same data, but the one on the left uses a
conventional value axis, the one on the right a logarithmic axis.

[Two charts: Terminal passengers at UK civil aerodromes, 1950 to 2002, labelled in
millions; the left chart has a conventional value axis (0 to 250,000), the right a
logarithmic axis (1 to 1,000,000)]

The chart with the conventional axis clearly shows the growth of aviation and the fact
that absolute increases in passengers were larger in the second half of the period. It
is difficult to draw any conclusions about rates of change during this period.

In a chart with a logarithmic scale a series that has a constant rate of change
(whether this is an annual doubling or an increase of 1% a year) will be shown as a
straight line. If the rate of change is increasing then the slope of line will increase or
bend towards the vertical. If the rate of change declines the slope will flatten or bend
towards the horizontal. A fall in the absolute value will still be shown as a downwards
slope. Therefore we can see in the second chart that the rate of increase in air
passengers was fairly constant up to the early 1970s. Since then it has increased at
a slower, but still noticeably steady, rate.

It is more difficult to get an accurate impression of absolute values in charts with
logarithmic scales. Their value lies in their presentation of proportionate changes.
Again the key to understanding the data is to read the axis and hence spot the
logarithmic scale. You should then concentrate on proportionate changes rather
than absolute changes/values.
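The straight-line property described above can be verified numerically; a sketch assuming a hypothetical series growing at a constant 10% a year:

```python
import math

# A series growing at a constant 10% a year
series = [100 * 1.1 ** year for year in range(10)]

# On a logarithmic scale the plotted heights are the logs of the values
logs = [math.log10(v) for v in series]

# Successive differences in log-space are all equal: a straight line
steps = [b - a for a, b in zip(logs, logs[1:])]
assert all(abs(step - steps[0]) < 1e-9 for step in steps)
```

The same check with an accelerating growth rate would show increasing steps, i.e. a line bending towards the vertical.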

Summary
• Look for any unusual features in the chart by closely studying its constituent
elements:
o Title, legend, value axis, category axis, units, data values
• Pay particular attention to gaps in axes/uncommon scales
• Any unusual feature(s) may mean you have to adapt the broad impression
you initially get from the visual elements of the chart and limit the type of
inferences you can draw
• Read any accompanying text before making any firm conclusions
• If all else fails try to get the underlying data

Other statistical literacy guides in this series:
- What is a billion? and other units
- How to understand and calculate percentages
- Index numbers
- Rounding and significant places
- Measures of average and spread
- How to read charts
- How to spot spin and inappropriate use of statistics
- A basic outline of samples and sampling
- Confidence intervals and statistical significance
- A basic outline of regression analysis
- Uncertainty and risk
- How to adjust for inflation

Statistical literacy guide
A basic outline of regression analysis
Last updated: March 2009
Author: David Knott & Paul Bolton

Social scientists are often interested in analysing whether a relationship exists between two
variables in a population. For instance, is greater corruption control associated with higher
GDP per capita? Does increased per capita health expenditure lead to lower rates of infant
mortality? Statistics can give us information about the strength of an association; this
can sometimes help in deciding if there is a causal relationship, but statistics alone
are not sufficient to establish causation.

Relationships are often expressed in terms of a dependent or response variable (Y) and
one or more independent or explanatory variables (Xi). The dependent variable is the
condition against which certain effects are measured, while independent variables are those
that are being tested to see what association, if any, they have with the dependent variable.

A scatter plot of Y against X can give a very general picture of the relationship, but this is
rarely more than a starting point. Statisticians can establish the direction, size and
significance of a potential association between an independent and dependent variable
using a simple linear regression model.
• Simple because one explanatory variable is being tested, rather than several
• Linear because it is assessing whether there is a straight-line association between X
and Y
• Model because the process is a simplified yet useful abstraction of the complex
processes that determine values of Y.

The typical notation for a simple regression model is in the form of an equation of a straight
line:
Yi = α + βXi + ε

Where Yi is the dependent variable, Xi the independent variable, and α is the intercept (i.e.
the value of Y when Xi equals zero). β is the regression coefficient – the slope of the line
that shows the increase (or decrease) in Y given a one-unit increase in Xi. These
parameters are estimated in order to minimise the sum of squared errors (ε²) and produce a
line of best fit. The elements
of this equation are illustrated below:

[Diagram: the line Yi = α + βXi + ε plotted on X and Y axes. β is the slope – the change in Y
divided by the change in X (∆Y/∆X) – and α is the intercept, the value of Y when X is 0]

A number of statistical tests can be conducted to measure the strength of the association.
Normally the most important model parameter to test is β, the regression coefficient. As with
confidence intervals and statistical significance, the starting point is hypothesis testing.
Hypothesis:
H0 : β = 0 There is no (linear) association between X & Y in the population
Ha : β ≠ 0 There is some (linear) association between X & Y in the popn

When a researcher rejects the null hypothesis, what is being said is that the probability of
observing a test statistic as large as that observed – with the null hypothesis being true – is
so small that the researcher feels confident enough to reject it. This probability (often
known as a significance level) is often 0.05 or 0.1, but can
be as low as 0.01 (in medical trials).

The Pearson product moment correlation coefficient (or simply the correlation coefficient)
is not a model parameter, but a measure of the strength of the association between the two
variables. The correlation coefficient is denoted r and can take values from -1 to 1. A
positive figure indicates a positive correlation –an upward sloping line of best fit, and vice
versa. If r=0 then there is a complete lack of linear correlation. In practice such extreme
results are unlikely and the general rule is that values closer to +1 or -1 indicate a stronger
association. Some examples are illustrated below.

[Five example scatter plots showing r = 1, r = -1, r = 0.8, r = -0.8 and r = 0]
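The correlation coefficient itself is straightforward to compute; a plain-Python sketch with made-up data:

```python
import math

def pearson_r(xs, ys):
    """Pearson product moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# A perfect upward-sloping line gives r = 1
assert abs(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]) - 1) < 1e-9
# Reversing one series gives r = -1
assert abs(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]) + 1) < 1e-9
```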

R2, sometimes known as the coefficient of determination, measures the proportion of the
variance in Y that is explained by X. It has no sign, so values closer to 1 indicate a closer
association –that X is better at predicting Y. R2 is sometimes given as the sole or most
important measure of the association between X and Y and hence the usefulness of the
model. However, a model that has a high R2 is not likely to be useful if we cannot reject the
null hypothesis β = 0. The interpretation of a particular value of R2 is not purely statistical. A
high value does not necessarily mean that X causes Y or that it is a meaningful

explanation of Y. It depends on the nature of the variables being looked at. Associations in
the social sciences tend to have smaller R2 values than those in the physical sciences as
they are dealing with human factors that often involve more unexplained variation. The R2
value is not a measure of how well the model fits and a useful model can have a low value. 1

Association and causation


In his classic essay on causation Sir Austin Bradford Hill set out his views of the most important
factors to consider when deciding whether an observed statistical association is due to causation.1
These are given in descending order of importance:
1) Strength - What increase in cases of the potential effect or outcome is observed when the
potential cause is present? Strength here refers to differences in the instances of the effect
or outcome, not the statistical strength of any association which has to be ‘significant’ and
not down to chance before looking at a hypothesis of causation.
2) Consistency -Has the finding been repeatedly observed, by different people, at different
times and under different circumstances?
3) Specificity –How specific is the potential effect? Is it limited to particular groups? Is the
potential cause associated with other outcomes? A high degree of specificity can lend great
support for a causal hypothesis, but such clear, simple and distinct one-to-one relationships
are rare.
4) Temporality -In what order did the event happen? An effect needs to come after a cause.
5) ‘Biological gradient’ –Is the effect stronger where the potential cause is stronger (more
intense, longer duration of exposure etc.), a so-called dose-response curve?
6) Plausibility -Is there a plausible theory behind the hypothesis of causation?
7) Coherence -Does the hypothesis make sense given current knowledge and related
observations?
8) Experiment -Is there any experimental evidence specifically connected to the hypothesis?
9) Analogy –Are there any similar causal relationships?

Example: Primary Care Trust deficits and health allocation per head
You have obtained data for all Primary Care Trusts. Data include outturn as a proportion of
PCT turnover and health allocation per head of population. How can you tell whether there
is an association between the variables?

The scatter plot below suggests a possible positive relationship which makes some intuitive
sense, but the points do not appear to form anything like a straight line, so we can expect
that only a small proportion of the variation in Y is explained by X. We cannot draw any firm
conclusions without calculating and testing the model parameters.

1. Austin Bradford Hill, "The Environment and Disease: Association or Causation?", Proceedings of
the Royal Society of Medicine, 58 (1965), 295-300.
Reproduced at: http://www.edwardtufte.com/tufte/hill
[Scatter plot: surplus (deficit) as a % of turnover (-14% to 4%) against allocation per head
(£1,000 to £2,000) for all Primary Care Trusts]

Regression model:
Yot = α + βXall + ε

Where Yot is the PCT outturn expressed as surplus/deficit as a % of turnover, Xall the
allocation per head and α is the intercept (i.e. the outturn level if funding per head is zero).

Null hypothesis
H0 : β = 0 There is no (linear) association between PCT outturn and allocation per
head in the population

Alternative hypothesis
Ha : β ≠ 0 There is some (linear) association between PCT outturn and allocation
per head in the population

The model parameters and other regression statistics can be calculated in Excel (Tools;
Add-ins; tick Analysis Toolpak; then back to Tools; Data Analysis; Regression)
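For readers without Excel, the same parameters can be estimated by hand; a minimal plain-Python sketch of simple least squares (illustrative data, not the PCT dataset):

```python
def ols(xs, ys):
    """Simple least-squares fit: returns (intercept, slope, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    beta = sxy / sxx                        # regression coefficient (slope)
    alpha = mean_y - beta * mean_x          # intercept
    r_squared = (sxy * sxy) / (sxx * syy)   # proportion of variance explained
    return alpha, beta, r_squared

# Illustrative data scattered around a rising line
xs = [1, 2, 3, 4, 5]
ys = [2.6, 3.1, 3.4, 4.1, 4.4]
alpha, beta, r2 = ols(xs, ys)
```

The t-statistics and P-values in the Excel output need the coefficient standard errors as well; this sketch covers only the point estimates and R2.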

Regression output format should be similar to this.


SUMMARY OUTPUT

Regression Statistics
Multiple R 0.30713875
R Square 0.094334211
Adjusted R Square 0.091325355
Standard Error 0.022163667
Observations 303

ANOVA
df SS MS F Significance F
Regression 1 0.015401074 0.015401074 31.35218091 4.85091E-08
Residual 301 0.147859674 0.000491228
Total 302 0.163260748

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%


Intercept -0.065670268 0.010129681 -6.482955226 3.67695E-10 -0.085604247 -0.045736289
X Variable 1 4.07875E-05 7.2844E-06 5.599301823 4.85091E-08 2.64527E-05 5.51223E-05

Interpreting the output
The coefficients column enables us to state the regression model. In this case:

Yot = -0.066 + 0.00004Xall + ε

Each £1 increase in health allocation is estimated to result in a 0.00004 increase in PCT
outturn as a proportion (%) of turnover. Alternatively, a £100 increase in allocation per head
is estimated to result in an increase in outturn as a % of turnover of 0.4 percentage points.
This line is illustrated in the next chart.

[Scatter plot as above with the fitted regression line y = -0.066 + 0.00004x drawn through
the points]

The test of statistical significance is to calculate a t-statistic (given in the regression output
above). The probability (P-value) of observing a t-statistic as high as 5.6 if the null
hypothesis were true gives the answer to the hypothesis test:

H0 : β = 0 can be rejected at P < 0.001 (here P = 0.00000005)

The probability that there is no linear association is extremely low, so we can reject the
null hypothesis. There is therefore some linear association between PCT
outturn as a % of turnover and health allocations per head. The output shows a positive
linear association between outturn and allocation per head among the population (what we
expected before the analysis). As the P-value is so low this relationship is significant at the
5% (or even 0.1%) level of significance. The 95% confidence interval for β is also given in
the regression output; 0.000026 to 0.000055. Alternatively the confidence interval of an
increase of £100 in allocation per head is 0.26 to 0.55 percentage points.
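The conversion from the per-£1 coefficient to the per-£100 effect quoted above is simple scaling (figures rounded from the regression output):

```python
# Figures taken from the regression output above (rounded)
beta = 0.0000408                            # change in outturn proportion per £1
ci_lower, ci_upper = 0.0000265, 0.0000551   # 95% confidence interval for beta

# Effect of a £100 rise, in percentage points of turnover:
# x100 for the £100 rise, x100 to convert a proportion into percentage points
per_100 = beta * 100 * 100
ci_pp = (ci_lower * 100 * 100, ci_upper * 100 * 100)
print(f"Effect of £100: {per_100} pp, 95% CI {ci_pp[0]} to {ci_pp[1]} pp")
```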

In this case R2 equals 0.09 – suggesting that 9% of the variation in PCT outturn as a
proportion of turnover is explained by changes in allocation per head. This is a low degree
of explanation and confirms the visual interpretation of a wide spread of results around the
regression line. There is a general positive association, but allocations per head explain little
of the variance in levels of deficit. The model therefore does not help to explain much of
what is going on here and is poor at predicting levels of deficit from allocations. This might
imply that other explanatory variables could be usefully added to the model. Where more
variables are added, this is called “multiple regression”.

Other statistical literacy guides in this series:
- What is a billion? and other units
- How to understand and calculate percentages
- Index numbers
- Rounding and significant places
- Measures of average and spread
- How to read charts
- How to spot spin and inappropriate use of statistics
- A basic outline of samples and sampling
- Confidence intervals and statistical significance
- A basic outline of regression analysis
- Uncertainty and risk
- How to adjust for inflation

Statistical literacy guide
Rounding and significant places
Last updated: July 2007
Author: Julien Anseau

Rounding is the process of reducing the number of significant digits in a number. This
can help make it easier to remember and use. The result of rounding is a "shorter"
number having fewer non-zero digits yet similar in magnitude. The most common uses
are where numbers are “longest” – very large numbers rounded to the nearest billion
or million, or very long decimals rounded to one or two decimal places. The result is
less precise than the unrounded number.

Example: Turnout in Sedgefield in the 2005 General Election was 62.213 percent.
Rounded to one decimal place (nearest tenth) it is 62.2 percent, because 62.213 is
closer to 62.2 than to 62.3. Rounded to the nearest whole number it becomes 62%
and to the nearest ten it becomes 60%. Rounding to a larger unit tends to take the
rounded number further away from the unrounded number.

The most common method used for rounding is the following:


• Decide what units you want to round to (hundreds, thousands, number of
decimal places etc.) and hence which is the last digit to keep. The first
example above rounds to one decimal place and hence the last digit is 2.
• Increase it by 1 if the next digit is 5 or more (known as rounding up)
• Leave it the same if the next digit is 4 or less (known as rounding down)

Further examples:
i. 8.074 rounded to two decimal places (hundredths) is 8.07 (because the next
digit, 4, is less than 5).
ii. 8.0747 rounded to two decimal places is 8.07 (the next digit, 4, is less than 5).
iii. 2,732 rounded to the nearest ten is 2,730 (the next digit, 2 is less than 5)
iv. 2,732 rounded to the nearest hundred is 2,700 (the next digit, 3, is less than 5)
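The common method can be sketched with Python's decimal module, which represents the values exactly (note that Python's built-in round uses the round-to-even rule described later, not this one):

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(value, places):
    """Traditional rounding: next digit 5 or more rounds up, 4 or less rounds down."""
    quantum = Decimal(1).scaleb(-places)   # e.g. places=2 -> Decimal('0.01')
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)

assert round_half_up("8.074", 2) == Decimal("8.07")    # next digit 4: down
assert round_half_up("62.213", 1) == Decimal("62.2")
assert round_half_up("2732", -1) == Decimal("2730")    # nearest ten
assert round_half_up("-6.1350", 2) == Decimal("-6.14") # absolute value rounded
```

Values are passed as strings so that something like 8.074 is held exactly rather than as a nearby binary fraction.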

For negative numbers the absolute value is rounded.

Examples:
i. −6.1349 rounded to two decimal places is −6.13
ii. −6.1350 rounded to two decimal places is −6.14

Although it is customary to round the number 4.5 up to 5, in fact 4.5 is no nearer to 5


than it is to 4 (it is 0.5 away from either). When dealing with large sets of data, where
trends are important, traditional rounding on average biases the data slightly upwards.

Another method is the round-to-even method (also known as unbiased rounding).


With all rounding schemes there are two possible outcomes: increasing the rounding
digit by one or leaving it alone. With traditional rounding, if the number has a value less
than the half-way mark between the possible outcomes, it is rounded down; if the
number has a value exactly half-way or greater than half-way between the possible
outcomes, it is rounded up. The round-to-even method is the same except that
numbers exactly half-way between the possible outcomes are sometimes rounded up and
sometimes down. Over a large set of data the round-to-even rule tends to reduce the

This is one of a series of statistical literacy guides put together by the Social & General Statistics section
of the Library. The rest of the series are available via the Library intranet pages.
total rounding error, with (on average) an equal portion of numbers rounding up as
rounding down. This generally reduces the upwards skewing of the result.

The round-to-even method is the following:


• Decide which is the last digit to keep.
• Increase it by 1 if the next digit is 6 or more, or a 5 followed by one or more
non-zero digits.
• Leave it the same if the next digit is 4 or less
• Otherwise, all that follows the last digit is a 5 and possibly trailing zeroes; then
change the last digit to the nearest even digit. That is, increase the rounded
digit if it is currently odd; leave it if it is already even.

Examples:
i. 8.016 rounded to two decimal places is 8.02 (the next digit (6) is 6 or more)
ii. 8.015 rounded to two decimal places is 8.02 (the next digit is 5, and the
previous digit (1) is odd)
iii. 8.045 rounded to two decimal places is 8.04 (the next digit is 5, and the
previous digit (4) is even)
iv. 8.04501 rounded to two decimal places is 8.05 (the next digit is 5, but it is
followed by non-zero digits)
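Python's decimal module supports this rule directly (ROUND_HALF_EVEN, which is also the default behaviour of the built-in round); a sketch using exact Decimal values:

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_to_even(value, places):
    """Unbiased rounding: exact halves go to the nearest even digit."""
    quantum = Decimal(1).scaleb(-places)   # e.g. places=2 -> Decimal('0.01')
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_EVEN)

assert round_to_even("8.016", 2) == Decimal("8.02")    # next digit 6: up
assert round_to_even("8.015", 2) == Decimal("8.02")    # half-way, 1 is odd: up
assert round_to_even("8.045", 2) == Decimal("8.04")    # half-way, 4 is even: stay
assert round_to_even("8.04501", 2) == Decimal("8.05")  # not exactly half-way: up
```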

The process of rounding results in a less precise figure. This explains why rounded
percentage totals do not always add up to 100 and why rounded absolute numbers do
not always sum to the total.

Rounding to n significant places is a form of rounding that handles numbers of
different scales in a uniform way. This gives a rule for where numbers are rounded to.
A digit that is closer to the start of a number is “larger” and therefore considered more
significant. The number of significant places tells us where to start rounding; from then
on the rules are the same.

Examples. Rounding to 2 significant figures:


i. 15,420 becomes 15,000 (the 1 and the 5 are the first two significant figures and
the next digit, 4, is less than 5)
ii. 0.001586 becomes 0.0016
iii. 0.5 becomes 0.50 (the trailing zero indicates rounding to 2 significant figures)
iv. 0.07051 becomes 0.071
v. 88,920 becomes 89,000

Rounding to 3 significant figures


vi. 15,420 becomes 15,400 (the 1, 5 and 4 are the first three significant figures and
the next digit, 2, is less than 5)
vii. 0.001586 becomes 0.00159
viii. 0.5 becomes 0.500 (the trailing zero indicates rounding to 3 significant figures)
ix. 0.07051 becomes 0.0705
x. 88,920 becomes 88,900
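One way to implement significant-figure rounding (a sketch; the helper name is ours) is to route the number through scientific notation, which fixes where the significant digits start:

```python
from decimal import Decimal

def to_sig_figs(value: str, sig: int) -> str:
    # Format with sig-1 digits after the point in scientific notation,
    # then expand back to positional notation. Decimal's default
    # rounding is half-even, and it keeps trailing zeros (0.5 -> 0.50).
    rounded = Decimal(f"{Decimal(value):.{sig - 1}E}")
    return f"{rounded:f}"

print(to_sig_figs("15420", 2))     # 15000
print(to_sig_figs("0.001586", 2))  # 0.0016
print(to_sig_figs("0.5", 2))       # 0.50
print(to_sig_figs("0.07051", 2))   # 0.071
print(to_sig_figs("88920", 3))     # 88900
```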

This is one of a series of statistical literacy guides put together by the Social & General Statistics section
of the Library. The rest of the series are available via the Library intranet pages.
Other statistical literacy guides in this series:
- What is a billion? and other units
- How to understand and calculate percentages
- Index numbers
- Rounding and significant places
- Measures of average and spread
- How to read charts
- How to spot spin and inappropriate use of statistics
- A basic outline of samples and sampling
- Confidence intervals and statistical significance
- A basic outline of regression analysis
- Uncertainty and risk
- How to adjust for inflation

Statistical literacy guide
A basic outline of samples and sampling
Last updated: October 2006
Author: Ross Young

There are two types of samples – scientific and non-scientific samples. The best
statistical samples are those that are conducted scientifically.

Non-scientific samples are those where the cases to be sampled are selected for their
typicality or availability, and it is not clear from the results of surveys using non-scientific
sampling how they can be generalised to a wider population. For example, we wish to
survey blood donors about the reasons why they choose to donate blood. If we simply
interviewed all those blood donors giving blood when we visited the blood centre, then
this would be known as a non-scientific sample.

In scientific sampling, the probability of any person being selected as part of the
sample is known because, in order to select a scientific sample, a ‘sampling frame’ is
needed. A sampling frame is a list of all individuals in the population to be surveyed. A
common sampling frame used in official statistics is a list of addresses drawn, for
example, from the electoral register or the Postal Address File. Samples drawn on the
basis of a sampling frame are known as probability samples since the probability of
inclusion for any individual or household is known, and therefore it is possible to
generalise or make inferences about the wider population from the results of the survey.

There are several ways in which probability samples may be selected. The best or
‘purest’ form of probability sampling is the simple random sample, in which every
person in the population has a known and equal chance of selection. Simple random
samples are often generated by computer from address files or telephone directories,
and it is commonly used in market research or telephone sales campaigns. However,
cost and convenience considerations mean that researchers often use a modified form
of random sampling, where individuals in the population have a different (albeit known)
probability of inclusion. For example, we may choose to sample every 5th person in a list
of names (systematic sampling), or sample different groups in the proportion they exist
in the population as a whole (stratified sampling), or limit our sampling frame
geographically by selecting particular sampling points such as parliamentary
constituencies or postcode sectors (cluster sampling). Alternatively, we may use a
combination of all these sampling methods in a single survey.
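As an illustration of one of these designs (a sketch, not from the guide itself), systematic sampling can be written in a few lines; the random starting point keeps every individual's chance of inclusion known:

```python
import random

def systematic_sample(frame, step):
    # Pick a random starting point, then take every `step`-th
    # entry from the sampling frame.
    start = random.randrange(step)
    return frame[start::step]

frame = [f"person_{i}" for i in range(100)]
sample = systematic_sample(frame, 5)
print(len(sample))  # 20 - every 5th person from a frame of 100
```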

It is good practice, when presenting the findings of a survey, for details of the sampling
method to be provided, usually as an appendix to the survey report.

However, the results of sample surveys may not reflect the situation among the whole
population. For example, if we undertake a survey of household income using simple
random sampling and visit 100 houses, we would have 100 results and we could
calculate the average (mean) household income among these households. Because we
have only taken a sample, we suspect that the mean household income from our survey
may be different from the mean income of the whole population because we have not

interviewed the whole population and there was a chance factor involved in who we
decided to sample. This chance factor is referred to as ‘sampling error’ and, when
considering the results of sample surveys, it is important to measure the extent of
sampling error.

Sampling error will depend on two factors – the size of the sample and the extent of
variation in the indicator we are measuring among the population we are sampling.

The larger the sample size, the less chance there is of selecting cases with extreme
values of the indicator we are measuring (e.g. household income), and the better chance
we have of approximating the true value among the whole population. In other words,
surveys with larger sample sizes (e.g. 1,000 or 15,000 people) are more likely to tell us
something interesting about the whole population and, conversely, surveys with small
samples (e.g. 10 or 50 people) are less likely to produce results which can be
extrapolated to the population as a whole. The results of large sample surveys are less
likely to produce high levels of sampling error.

Again, it is good practice for details of the size of the sample to be provided when
presenting the findings of surveys, alongside an estimate of the degree of sampling error
to allow a calculation of how the survey’s results may deviate from the population as a
whole. For example, opinion pollsters often present their results as estimated vote
shares for parties A, B, and C as being plus (+) or minus (-) x per cent. +/-x% is an
estimate derived from a calculation of sampling error. The lower the sampling error the
more certainty we should have that the results they publish reflect the true voting
intention of the population as a whole.
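For a simple random sample, the pollsters' ±x% figure can be computed directly. A minimal sketch (assuming the usual 95% normal approximation for a proportion; the function name is ours):

```python
import math

def margin_of_error(p, n, z=1.96):
    # Half-width of the 95% interval for a proportion p estimated
    # from a simple random sample of size n (z = 1.96 at 95%).
    return z * math.sqrt(p * (1 - p) / n)

# A party on 40% in a random sample of 1,000 people:
moe = margin_of_error(0.40, 1000)
print(f"+/-{moe * 100:.1f} percentage points")  # +/-3.0
```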


Statistical literacy guide – Uncertainty and risk
Standard Note: SN/SG/4836
Last updated: May 2008
Author: Paul Bolton
Social & General Statistics

In this world nothing can be said to be certain, except death and taxes.
– Benjamin Franklin

Scientific knowledge is a body of statements of varying degrees of certainty –
some most unsure, some nearly sure, but none absolutely certain.
– Richard P. Feynman

…the greatest geniuses and greatest works have never concluded.
– Gustave Flaubert

1 Uncertainty and risk


The concepts of uncertainty and risk are key to many areas of statistics and hence statistical
literacy. Some of the other guides in this series look at specific aspects of the measurement,
interpretation and presentation of uncertainty. This guide gives a general overview of these
concepts, highlights their connection to other aspects of statistical literacy and looks at the
communication and understanding of risk.

1.1 Single events


For the purposes of this guide an event or outcome is uncertain if it may or may not
happen. If there is absolutely no possibility of it happening there is no
uncertainty – it will not happen. If it is absolutely certain that it will occur there is no
uncertainty – it will happen. Some events or outcomes which are certain become uncertain if
they are further defined by time or space. For instance it is certain that someone will
eventually die, but uncertain when this will happen.

Risk is a quantified uncertainty. It may be quantified by experience or theory. It is important
not to view the term as solely connected to negative outcomes. Outcomes can be positive, neutral
or negative. For instance it is uncertain whether a set of six specific numbers will be drawn in
the Lottery in any given week. It may be highly unlikely that this one combination will come
up, but it is possible, so the outcome is uncertain. The risk of this (positive) outcome can be
quantified by theory. Assuming that the selection process is unbiased then the risk that any
one set of six numbers will be drawn from 49 is one in 14 million for each draw. Risk is
equivalent to probability, odds, likelihood etc. but the single term is used here as defined.
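The Lottery figure comes straight from counting combinations: there are 49-choose-6 equally likely draws. A quick check in Python:

```python
from math import comb

# Number of ways to draw six numbers from 49, order ignored:
combinations = comb(49, 6)
print(f"1 in {combinations:,}")  # 1 in 13,983,816 - roughly 1 in 14 million
```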

1.2 Quantities and frequencies


The concept of uncertainty can also be extended from individual events to quantities and
frequencies –for instance a population projection, a project’s cost estimate or the number of
days that it will rain in the next week. Here the uncertainty is usually more obvious as the
quantity is clearly being estimated or the associated event has not yet happened. When this
uncertainty is quantified then the result is normally a maximum and minimum range or a

central estimate ± an amount or percentage. The process for quantifying this range can also
be theoretical (for instance in sample survey results) or based on experience (what were the
previous highest and lowest costs of each element of the project?). Some of the latter type may not be
much more than best guesses. In such cases the ranges may not prove to be accurate, but if
they are best guesses the principle is the right one and they can be improved over time. At
the very least a range acknowledges there is some uncertainty associated with the quantity
in question.

1.3 Examples where uncertainty/risk is defined


Single events
Examples of risks associated with single events are less common. The most prominent ones
are for natural events. Most weather forecasts in the UK tend not to define risks for everyday
events such as rain. Phrases such as “scattered showers” are more common than “a 20%
chance of rain in this region”. However, such quantified risks are behind weather forecasts,
and so-called probability forecasts and forecasts that give a confidence range are produced by
the Met Office for some users. The Met Office uses risk assessments for its weather
warnings. If the risk of a severe or extreme weather event is thought to be 60% or more in a
certain area in the next few days then an early warning is issued. If the risk of an imminent
severe/extreme event is estimated at 80% or more, then a flash warning is issued. More
information about how the Met Office approaches uncertainty can be viewed at:
http://www.metoffice.gov.uk/science/creating/daysahead/ensembles/dec_making.html

The Environment Agency assesses the risk of flooding for England and Wales. One of its
main outputs is the flood map which estimates the risk of flooding for every area. These are
placed into one of three categories: significant (risk in any one year is greater than 1 in 75),
moderate (greater than one in 200 but less than one in 75) or low (less than one in 200).
There is a fourth category of areas that can be described as not at risk of flooding, but as the
assessment cannot be 100% certain, these areas are said to have a risk of less than one in
1,000, or where flooding would only occur with a once in a millennium event. 1
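The Environment Agency's bands can be expressed as a simple classification (a sketch; the function name and category strings are ours, paraphrasing the bands above):

```python
def flood_risk_band(annual_risk):
    # Classify an annual flood risk (e.g. 1/75 = one in 75 in any year)
    # into the bands described above.
    if annual_risk > 1 / 75:
        return "significant"
    if annual_risk > 1 / 200:
        return "moderate"
    if annual_risk > 1 / 1000:
        return "low"
    return "not at risk (less than one in 1,000)"

print(flood_risk_band(1 / 50))   # significant
print(flood_risk_band(1 / 100))  # moderate
print(flood_risk_band(1 / 500))  # low
```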

Quantities and frequencies


Uncertainties in quantities or frequencies are much more common. Official population
projections available from the Government Actuary’s Department give a principal projection,
plus high and low variants for the main underlying assumptions –fertility, life expectancy and
migration. The high/low assumptions are based on an assessment of the possible range. The
gap between the highest and lowest combination of assumptions for England in 2031 is
around ±4 million on a principal projection of 60 million; ±7%. 2

Sample surveys can use a statistical technique (described below in 1.4) to produce
confidence intervals or a margin of error for the results they produce. These may not always
be especially prominent in political opinion polls, but with a typical sample size of 1,000 and
assuming it is a random sample the margin of error is given as ±3 percentage points. In other
words if a party received a rating of 40% from the sample we would expect its national rating
to be 37-43% if the sample were truly random. An article in 2008 for Ipsos MORI looked at
this margin of error and the seemingly widely varying opinion poll figures.
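The 37–43% range quoted above follows from the usual normal approximation for a random sample of 1,000 (a sketch; the function name is ours):

```python
import math

def confidence_interval(p, n, z=1.96):
    # 95% interval for a sample proportion (z = 1.96 at the 95% level).
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

low, high = confidence_interval(0.40, 1000)
print(f"{low:.0%} to {high:.0%}")  # 37% to 43%
```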

1
Flood likelihood explained, Environment Agency
http://www.environment-agency.gov.uk/subjects/flood/826674/829803/858477/839808/?lang=_e
2
2006-based population projections, ONS

Large uncertainties surround oil reserves mainly connected to the underlying geology, but
also to do with technological and economic factors. For estimation purposes ‘reserves’ are
defined as oil left in discovered fields that is technically recoverable and commercial. There
are three confidence levels within this category: ‘proven’ >90% chance of being produced,
‘probable’ >50% but <90% chance of being produced, and ‘possible’ which have a <50% but
still ‘significant’ chance of being produced. There are two categories less certain than
‘reserves’, these are ‘potential additional resources’ which are not yet economically/
technically producible and ‘undiscovered resources’ which are areas where oil might be. In
the UK estimates for both these categories are presented as a central figure with a lower and
upper range. There are clearly many layers of uncertainty in these assessments. The official
UK estimate of ‘reserves’ at the end of 2006 varies from 479 million tonnes for proven
reserves only to 1,254 million tonnes for the sum of proven, probable and possible reserves.
The range of potential additional resources is 61-422 million tonnes and the range of
undiscovered resources is 438-1,637 million tonnes. 3 This level of uncertainty exists for the
North Sea which is relatively well mapped and established. Such uncertainty has led some
commentators to question the estimates of certain countries/regions and others to question
the usefulness of such widely varying estimates.

Data on greenhouse gas emissions are commonly seen as definitive totals, but they are
estimates and despite continued development of the methodology and data sources they are
still subject to a degree of uncertainty. In the UK this has been estimated at ±2.1% for carbon
dioxide in 2005. Trends are also affected, but to a lesser degree as some of the uncertainties
are assumed to be correlated over time. The estimated change in carbon dioxide emissions
between 1990 and 2005 was -6.3%, the 95% confidence interval (the range within which we
can be reasonably confident that the ‘true’ value lies) was -3.7% to -8.9%. Ranges are larger
still for other greenhouse gases. 4

The concept of uncertainty

Using the definitions set out above it should become clear that very few events in our daily
lives are truly 100% certain. This does not mean they are completely random, or that we
know nothing about the risk of them happening, only that we can not be entirely sure.
Accepting that there is some uncertainty involved is a necessary first step in investigating the
associated risk and considering alternative possible outcomes. Psychologist Gerd
Gigerenzer has called the belief that an event is absolutely certain (even if it is not) the
‘illusion of certainty’. 5 He has cited this as one of the causes of innumeracy. Such an
outlook leaves no room for considering alternative possibilities or how likely they are and is
associated with an overly simplistic view of causes and consequences.

It is understandable that people should want certainty and predictability. Economists assume
that most people are risk averse and are happy to pay a premium for certainty, hence the
existence of the insurance industry. In the long run with insurance companies making profits
people will on average be worse off financially than without any insurance but people's
decisions are made about individual preferences, generally with a shorter time horizon in
mind and they value the 'certainty' they gain from insurance.

3
UK Oil and Gas Reserves and Resources, BERR Oil & Gas Directorate
https://www.og.berr.gov.uk/information/bb_updates/chapters/reserves_index.htm
4
UK Greenhouse Gas Inventory, 1990 to 2005: Annual Report for submission under the Framework
Convention on Climate Change, NETCEN. Annex 7
5
Reckoning with risk –learning to live with uncertainty, Gerd Gigerenzer

If one tends to view most events as certain/wholly predictable then estimates given as
ranges which acknowledge uncertainty could be viewed as overly complex, pedantic, too
vague or just wrong. A typical response might be ‘we must know the true answer.’ An
appreciation of the concept of uncertainty lends itself to an appreciation that many everyday
events are complex and can have an element of randomness. There are some things that we
do not know for any number of reasons (it is too early to tell, little or no research has taken
place, the event is in the future etc.). The quote at the start of this guide about the
uncertainties in science refers to natural and physical sciences –so called ‘hard’ science.
Uncertainties in social sciences and hence our daily lives are greater because they deal with
complex human behaviour, motivations and interactions which do not naturally lead to simple
definitive conclusions or rules.

There is an analogy when thinking about the causes of many events or their outcomes. We
are seldom 100% certain that there is a single cause of an event, or single certain outcome
from it. Causes and consequences are frequently complex and multifaceted and an approach
which outlines the possible ‘candidates’ and quantifies their relative importance can help to
address this.

One response to this uncertainty would be to say that nothing can be concluded unless you
can come up with definitive unambiguous answers with 100% certainty. This absolutist
approach would rule out many activities including all or most social science, much medicine,
insurance, many criminal prosecutions, weather forecasting etc. An approach that takes
uncertainty and complexity into account may not come up with such superficially appealing
conclusions or any definitive conclusions at all. The result of this approach should be
‘approximately right’. The estimated range may be deemed too large and it is nearly always
possible to improve the estimates and quantified risks, but the alternatives are saying nothing
in these important areas or being ‘exactly wrong’.

1.4 Statistical concepts and uncertainty


Given the lack of absolutes much of statistics aims to quantify exactly how likely an event is
given certain underlying assumptions about the factors involved. Key to this is an
understanding that some results could be the result of random variation. Significance
testing is used to establish how likely it is that a set of statistical results occurred by chance
and hence whether there is a relationship between the variables and if so how strong it is.
Various methods can be used to measure this strength and hence quantify the uncertainty
surrounding the results.

In statistics, a result is significant if it is unlikely to have occurred by chance. The most


common significance level, or decision rule, to judge the test results is 0.05 or 5%. If this is
met then the statistical finding that there is a relationship has at least a 95% chance (or risk)
of being true and there is a 5% (or less) chance/risk that it is false. In the terms used earlier,
the risk that we are wrong when we conclude that there is a relationship between the
variables at this level is 5% or less.

The actual probability will be calculated in a statistical test and it is this that determines the
level of certainty/robustness of the findings. This is compared to the significance level being
used. Sometimes smaller significance levels are used to show that findings are even more
robust and less likely to be due to chance. Consequently, a result which is "significant at the
1% level" is regarded as more robust than a result which is "significant at the 5% level". More
detail is given in the guide on Confidence intervals and statistical significance.

Confidence intervals are closely connected to significance testing. They are a standard way
of expressing the statistical accuracy or uncertainty of a survey-based estimate. Confidence
intervals normally take a survey or sample based estimate and, based on assumptions or
knowledge about the size/variability of the entire population, give a range for the ‘true’ value
in the entire population. If an estimate has a high error level, the degree of uncertainty is
greater, the confidence interval will be wide and we can have less confidence that the survey
results describe the situation among the whole population.

The most commonly used confidence interval is 95%. If a 95% confidence level is reported
then we can be reasonably confident that ‘true’ value from the whole population lies within
this range. Formally, if the sample survey was repeated the ‘true’ value for the whole
population would fall within the corresponding confidence intervals 95% of the time. Again
more detail is given in the guide on Confidence intervals and statistical significance.

2 Communication and understanding of risk


Different ways of presenting and communicating risk can alter how the same underlying
data is perceived. None of the various alternatives are technically wrong, but their full
meaning may only be understood by those with a thorough understanding of probability
theory and an idea of the underlying data. Gigerenzer says that both the miscommunication
of risk and the inability to draw conclusions or inferences from known risks are elements of
innumeracy -‘the inability to reason about uncertainties and risk’. 6 He sets out methods to
help better communicate and understand relative risks and conditional probability.

Relative risks
Comparing two or more risks – whether they are entirely different events, the risk at various
periods in time, or the risk of an event before and after some other event has occurred – is
normally communicated by relative risks. In medicine, for instance, a common relative risk
would be the reduction in death rates from a certain disease after taking a particular drug. So
if 50 out of 1,000 people who did not have the drug died from the disease and 40 out of
1,000 people died who did have the drug then the relative risk reduction is 20%:

(50/1,000 – 40/1,000) ÷ (50/1,000) = 10/50 = 20%

The alternative recommended by Gigerenzer is to look at the reduction in absolute mortality.


In this example 10 fewer people died out of the 1,000 who received the drug, hence the
absolute risk reduction is 10 divided by 1,000 = 1%. 7
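The two figures can be reproduced in a couple of lines from the underlying rates:

```python
# Mortality rates from the worked example above:
deaths_without_drug = 50 / 1000
deaths_with_drug = 40 / 1000

absolute_reduction = deaths_without_drug - deaths_with_drug
relative_reduction = absolute_reduction / deaths_without_drug

print(f"absolute risk reduction: {absolute_reduction:.1%}")  # 1.0%
print(f"relative risk reduction: {relative_reduction:.0%}")  # 20%
```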

The difference in the two percentage figures is large, but as the underlying numbers are
given it should be clear that both figures refer to the same underlying data, they are just
calculated in a different way. However, it is very uncommon to have all this data presented
clearly in the material which reaches the general public, either in the press or marketing from
companies. What reaches the general public is normally “this drug reduced deaths from this
disease by 20%”. Clearly with the choice of an absolute or relative risk reduction, a body that
wanted to emphasise a larger change would choose a relative risk reduction, which is never
smaller. 8 What is missing in the figure above is “20% of what”. The reader is unable to tell

6
ibid.
7
These principles apply equally to rates that increase after another event has occurred.
8
They are only equal when in the group with no treatment 100% died.

how important this drug is for the population in question without knowing the underlying
incidence or base rate. This makes comparisons with other treatments or diseases difficult as
it lacks context. The absolute risk reduction figure does not have this problem because the
base rate (here the mortality rate of 5%) is included in the statistic. It tells the reader that if all
the sufferers of this particular condition received this drug then the death rate from it would
fall by 1 in every 100 sufferers.

It may be clear to readers of the guide on percentages that the difference between absolute
and relative risk reductions is analogous to the difference between percentages expressed in
percentage and percentage point terms. In the above example the change is either a 20%
fall in the rate of mortality or a 1 percentage point fall.

Conditional probabilities
Conditional probabilities are the probability or risk that event A will occur given that B has
already occurred. These can be used in many fields including legal cases, but medicine is
again common. For instance what is the risk that someone has a particular disease given
that they have had a positive result on a screening test? Gigerenzer states that many people,
including professionals in the relevant field, confuse the risk of A given B with the risk of B
given A or the risk of A and B occurring. His remedy is similar to replacing relative risk
reductions statistics with absolute risk reductions and involves replacing probabilities with
natural frequencies. This is done by illustrating all the underlying data in terms of so many
people per 100, or 1,000, 100,000 etc. While this may seem identical to using percentages, it
crucially means keeping to the same base quantity and avoids taking percentages of
percentages (as in conditional probabilities). Because the numbers involved are people it can
be easier to attach relevant categories to them and ensure that the totals match. There is
also less calculation involved for someone who is looking at the data as part of it has already
been done in the presentation of the numbers.

To illustrate the difference Gigerenzer uses the following data on breast cancer screening
and posed the question, if a woman has a positive mammogram what is the risk that she
actually has breast cancer? 9

Conditional probabilities:
Breast cancer affects 0.8% of the age group in question. If a woman has breast
cancer the probability is 90% that she will have had a positive mammogram. If a
woman does not have breast cancer there is a 7% probability that she will have had
a positive mammogram.

Natural frequencies:
Eight out of every 1,000 women in this age group have breast cancer. Seven of
these eight will have had a positive mammogram. 70 of the remaining 992 without
breast cancer would still have had a positive mammogram.

When this question was put to physicians a clear majority overestimated the risk from the
conditional probability data, seemingly confusing the risk of having cancer given a positive
mammogram with the risk of having a positive mammogram given they have cancer. A clear
majority gave an answer that was right or very close after reading the same data expressed
in natural frequencies. All the natural frequency data requires is to select the appropriate
categories, here seven women who had breast cancer and a positive mammogram divided
by the total number of positive test results (70+7), which equals 9.1%. With conditional

9
Reckoning with risk –learning to live with uncertainty, Chapter 4, Gerd Gigerenzer

probabilities fundamentally the same calculation has to be made in more steps as the base
rate (here 8%) needs to be put back into the question. The method in this case is:

The probability of having cancer and having a positive test given they have cancer
divided by the probability of having cancer and having a positive test given they have
cancer plus the probability of not having cancer and having a positive test given they
do not have cancer.
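Both routes can be checked numerically. The natural-frequency route is a single division; the conditional-probability route is Bayes' rule with the unrounded rates (it gives a slightly different answer, about 9.4%, because the frequencies above were rounded to whole women):

```python
# Natural frequencies: 7 true positives among the 77 positive tests.
natural_frequency_answer = 7 / (7 + 70)
print(f"{natural_frequency_answer:.1%}")  # 9.1%

# Conditional probabilities (Bayes' rule), using the unrounded rates:
p_cancer, sensitivity, false_positive_rate = 0.008, 0.90, 0.07
bayes_answer = (p_cancer * sensitivity) / (
    p_cancer * sensitivity + (1 - p_cancer) * false_positive_rate
)
print(f"{bayes_answer:.1%}")  # 9.4%
```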

These steps are far from intuitive and need the correct percentage to be selected at each
stage. Conditional probabilities connected to screening tests in medicine are frequently
termed sensitivity and specificity. Sensitivity is the proportion of people who have the
disease who received a positive screening test result (90% in the example above). Specificity
is the proportion of people who did not have the disease and received a negative screening
result (93% in the example above). Sensitivity is the true positive rate, specificity the true
negative rate. False positives (7% above) add up to 100% with specificity. False negatives
(10% above) add up to 100% with sensitivity. For a given test there is normally a trade-off
between sensitivity and specificity.

Using absolute risk reduction figures and natural frequencies rather than conditional
probabilities are both ways to better understand the risks in question and for people to make
informed decisions. For instance, in comparing different treatments, weighing the potential
benefits of a treatment against its costs (physical and financial) and in how professionals
communicate the implication of test results to patients. In each case there are uncertainties
involved and improved communication around the related risks and possible alternatives
means people are not only better informed, but they are better able to draw conclusions from
this information.

Statistical literacy guide 1

What is a billion? And other units


Last updated: January 2009
Author: Richard Cracknell & Paul Bolton

What is a billion?
What constitutes a billion is a source of occasional confusion. In official UK statistics
the term is now used to denote 1 thousand million – 1,000,000,000. Historically,
however, in the UK the term billion meant 1 million million – 1,000,000,000,000 - but
in the United States the term was used to refer to 1 thousand million. The US value
had, however, become increasingly used in Britain and the Prime Minister, Harold
Wilson confirmed in a written reply in 1974 2 that the meaning of "billion" would be
thousand-million, in conformity with international usage.

The Oxford English Dictionary explains why UK and US usage differed.

billion, purposely formed in 16th c. to denote the second power of a million


(by substituting BI- prefix for the initial letters), trillion and quadrillion being
similarly formed to denote its 3rd and 4th powers. The name appears not to
have been adopted in Eng. before the end of the 17th … Subsequently the
application of the word was changed by French arithmeticians, figures being
divided in numeration into groups of threes, instead of sixes, so that F. billion,
trillion, denoted not the second and third powers of a million, but a thousand
millions and a thousand thousand millions. In the 19th century, the U.S.
adopted the French convention, but Britain retained the original and
etymological use (to which France reverted in 1948).

Since 1951 the U.S. value, a thousand millions, has been increasingly used
in Britain, especially in technical writing and, more recently, in journalism; but
the older sense ‘a million millions’ is still common. 3

The text of the 1974 Harold Wilson PQ:

“Billion” (Definition)

Mr Maxwell-Hyslop asked the Prime Minister whether he will make it the
practice of his administration that when Ministers employ the word “billion” in any
official speeches, documents, or answers to Parliamentary Questions, they will,
to avoid confusion, only do so in its British meaning of 1 million million and not in
the sense used in the United States of America, which uses the term “billion” to
mean 1,000 million.

The Prime Minister: No. The word “billion” is now used internationally to mean
1,000 million and it would be confusing if British Ministers were to use it in any
other sense. I accept that it could still be interpreted in this country as 1 million
million and I shall ask my colleagues to ensure that, if they do use it, there
should be no ambiguity as to its meaning. (HC Deb 20.12.1974 c711-2W)

1
HC Deb 20 December 1974 c711-2W
2
Oxford English Dictionary 1989 ed

This is one of a series of statistical literacy guides put together by the Social & General Statistics section
of the Library. The rest of the series are available via the Library intranet pages.
This definition of a billion is now known as the short scale – where each new
term for a number above a million is one thousand times greater than the
previous one. The historical definition of a billion is now known as the long
scale – where each new term for a number above a million is one million times
greater than the previous one.
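The two scales can be expressed as simple formulas. The sketch below (illustrative only; the function names are not from the guide) gives the value of the n-th named term above a thousand – n = 1 for million, n = 2 for billion, n = 3 for trillion – under each scale:

```python
# Short scale: each new term is 1,000 times the previous one,
# so the n-th term is 10^(3n + 3). Billion (n=2) = 10^9.
def short_scale(n):
    return 10 ** (3 * n + 3)

# Long scale: each new term is 1,000,000 times the previous one,
# so the n-th term is 10^(6n). Billion (n=2) = 10^12.
def long_scale(n):
    return 10 ** (6 * n)

print(short_scale(2))  # 1000000000 (one thousand million)
print(long_scale(2))   # 1000000000000 (one million million)
```

Note that a million (n = 1) is the same on both scales; the two conventions only diverge from billion upwards.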

Other units
The table below gives a selection of other small and large numbers, their (short
scale) names, SI 3 prefixes and associated scientific notation. The format of
scientific notation is 10^n, where n is the total number of zeros. Hence 10^9 is
1,000,000,000 or one billion. Where n is negative the number is less than one
and n refers to the number of decimal places.

Name          SI prefix   Symbol   Scientific notation   Number
Billionth     nano-       n        10^-9                 0.000000001
Millionth     micro-      μ        10^-6                 0.000001
Thousandth    milli-      m        10^-3                 0.001
One                                10^0                  1
Thousand      kilo-       k        10^3                  1,000
Million       mega-       M        10^6                  1,000,000
Billion       giga-       G        10^9                  1,000,000,000
Trillion      tera-       T        10^12                 1,000,000,000,000
Quadrillion   peta-       P        10^15                 1,000,000,000,000,000

Hence if we are looking at energy, where the basic unit is a watt hour (Wh), then
a trillion watt hours is known as a terawatt hour (TWh), which is equivalent to
10^12 or 1,000,000,000,000 watt hours. A billionth of a metre is known as a
nanometre (nm), which is equivalent to 10^-9 or 0.000000001 metres.
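The prefix table above amounts to a lookup from powers of a thousand to SI prefixes. As a hedged sketch (the function name and formatting choices are illustrative assumptions, not from the guide), a quantity can be rescaled to the largest applicable prefix like this:

```python
# SI prefixes from the table above, largest factor first.
SI_PREFIXES = [
    (1e15, "P"), (1e12, "T"), (1e9, "G"), (1e6, "M"), (1e3, "k"),
    (1, ""), (1e-3, "m"), (1e-6, "μ"), (1e-9, "n"),
]

def si_format(value, unit):
    """Format a value with the largest SI prefix it reaches."""
    for factor, prefix in SI_PREFIXES:
        if abs(value) >= factor:
            return f"{value / factor:g} {prefix}{unit}"
    return f"{value:g} {unit}"  # smaller than a billionth: leave unscaled

print(si_format(1e12, "Wh"))  # 1 TWh  (a trillion watt hours)
print(si_format(1e-9, "m"))   # 1 nm   (a billionth of a metre)
```

For example, `si_format(2500, "g")` gives "2.5 kg", matching the table's rule that kilo- denotes a factor of 10^3.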

Increases or decreases in scale can quickly become hard to fully comprehend,
as can their cumulative impact. The 1977 short film Powers of Ten 4 illustrates
the impact of changes in scale from extremely large to extremely small
dimensions.

3
The Système International d’Unités or International System of Units. SI prefixes are applied to SI
base units such as metre, second or watt.
4
Charles and Ray Eames, Powers of Ten: A Film Dealing with the Relative Size of Things in the
Universe and the Effect of Adding Another Zero (1977)
http://www.powersof10.com/index.php?mod=watch_powersof10
http://uk.youtube.com/watch?v=A2cmlhfdxuY

Other statistical literacy guides in this series:
- What is a billion? and other units
- How to understand and calculate percentages
- Index numbers
- Rounding and significant places
- Measures of average and spread
- How to read charts
- How to spot spin and inappropriate use of statistics
- A basic outline of samples and sampling
- Confidence intervals and statistical significance
- A basic outline of regression analysis
- Uncertainty and risk
- How to adjust for inflation
