Contents
Handling data checklist  3
Handling data 1  4
  The handling data cycle  4
  Types of data  4
  Collecting data  5
  Displaying the data  6
    Pictogram  6
    Bar Chart  7
  Why are the following diagrams misleading?  9
  How to draw pie charts  10
  Stem and leaf diagrams  11
Handling data 2  14
  Mean, median and mode  14
  Using frequency tables  16
  Grouped frequency tables  18
  Box and whisker diagrams  20
Handling data 3  23
  Scatterplots  23
  Correlation  24
  Lines of best fit  26
  Frequency polygons  29
  Two-way tables  32
Handling data 1 practice questions  35
Handling data 2 practice questions  40
  Mean, median, mode and range  40
  Mean from tables  42
  Mean from grouped data  43
  Box and whisker plots  44
Page 2
Page 3
Handling data 1
The handling data cycle:
Key words
Discrete, Continuous, Stem and leaf diagram, Stemplot, Data, Pictogram, Pie chart, Bar chart, Tally, Frequency
Types of data
Statistics is a branch of mathematics that is concerned with the collection, representation and interpretation of data. There are different types of data:
Data is either qualitative or quantitative.
Quantitative data is either discrete (counted) or continuous (measured).
Now try these 1
State whether each of the following is qualitative or quantitative data. If quantitative, state whether it is discrete or continuous.
(a) The number of pupils in a class.
(b) The colour of cars in a car park.
(c) The time spent by a motorist waiting at a red traffic light.
(d) The styles of women's dresses available in a chain store.
(e) The number of votes received by the candidates in an election.
(f) The club of each of the members of the England football team.
(g) The number of players from a club who play football for England.
(h) The mass of a newborn baby.
(i) The number of words on a page of a book.
(j) The duration of a hockey match.
Page 4
Collecting data
[Table: data collection sheet with rows 1 to 20 and columns Name, Transport and Time (mins)]
[Table: tally chart with columns Transport, Tally and Frequency]
Page 5
A pictogram is a very simple-to-read way of presenting data. It is cheerful and it makes a powerful visual impact.
Example
Population of Great Britain Excluding Ireland (figures in millions)
1801 (10.5)
1851 (21.0)
1901 (37.0)
EACH FACE REPRESENTS 1 MILLION PEOPLE
But it is not very accurate (how would you draw 0.1 of a face?), and it can be rather laborious if you are drawing by hand.
Now try this 2
Draw a pictogram to represent the way in which students in your class have travelled to college.
Page 6
Bar Chart
The bar chart is easier to draw than a pictogram and allows for greater accuracy. It is best drawn on graph paper. Let's say that we have recorded the colours of the shirts of 30 students in a class, with the following results (this is qualitative data):
RED 15
GREEN 5
BLUE 5
BLACK 3
WHITE 2
This can be presented in a frequency diagram with the bars either VERTICAL or HORIZONTAL.
[Figures: the shirt colour data drawn as a vertical bar chart and as a horizontal bar chart, with Shirt colour on one axis and Frequency (0 to 16) on the other]
Note:
1. If the data are continuous, the bars should be next to each other. If the data are discrete or qualitative, the bars should be kept separate.
2. Each bar must be of exactly the same width.
3. The frequency scale must go up evenly and must start at 0.
4. Everything should be clearly labelled.
5. The bar can be plain, shaded or coloured.
Now try this 3
Draw a bar chart to represent the way in which students in your class have travelled to college.
Page 7
Points 2 and 3 above are very important: you are not allowed to mislead the viewer by tampering with the scales and the bar widths. For example, say there was a by-election and the result was:
LABOUR 19,800
CONSERVATIVE 14,500
LIB. DEM. 13,000
GREEN 11,000
Now look at this chart presenting the result:
[Figure: misleading bar chart of the by-election result]
A first glance suggests that Labour has 4 or 5 times as many votes as Conservative and that Conservative has twice as many as Lib. Dem. This is because we have made the Labour bar look bigger by making it wider. And, because we have started our scale at 10,000 votes, the Conservatives' 14,000 looks twice as large as the Lib. Dem.s' 12,000. THIS IS NOT ALLOWED!
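The effect of starting the scale above zero can be checked with a little arithmetic. The sketch below is only an illustration: the function name `apparent_ratio` is made up for this example, and it compares the Labour and Green figures from the result table, assuming the scale starts at 10,000 votes as described above.

```python
def apparent_ratio(a, b, axis_start=0):
    """Ratio of the drawn bar heights for values a and b
    when the frequency scale starts at axis_start instead of 0."""
    return (a - axis_start) / (b - axis_start)


# Figures taken from the by-election result table.
labour = 19800
green = 11000

true_ratio = apparent_ratio(labour, green)                      # scale starts at 0
drawn_ratio = apparent_ratio(labour, green, axis_start=10000)   # scale starts at 10,000

print(f"Labour has {true_ratio:.1f} times as many votes as Green")
print(f"but its bar is drawn {drawn_ratio:.1f} times as tall")
```

With the scale starting at 0 the bar heights are in the same ratio as the votes, which is why point 3 in the note insists on it.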
Page 8
[Figures: misleading diagrams. Surviving panel labels: (c) Profits, (e) Sales, (f) Cost of a car. Other surviving labels: Milk, Other drinks, Sales take off]
All diagrams should:
- be clearly labelled and titled
- have the scales clearly identified
- have the frequency begin at zero
- have the units given
- have the scales going up in equal amounts.
Page 9
No of people: 5, 2, 2, 1 (total 10)
Step 4 Draw a circle and draw on the correct angles.
Step 6 Label the segments: Grey eyes, Blue eyes, Hazel eyes, Green eyes.
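The steps for working out the angles have not survived here, so the sketch below assumes the usual method: each sector gets the same fraction of 360 degrees as its frequency is of the total. The function name `pie_angles` is made up for illustration, and the counts are the "No of people" figures above (which eye colour goes with which count is not stated, so no labels are attached).

```python
def pie_angles(frequencies):
    """Return the sector angle in degrees for each frequency."""
    total = sum(frequencies)
    return [360 * f / total for f in frequencies]


counts = [5, 2, 2, 1]          # the "No of people" figures, total 10
angles = pie_angles(counts)

for count, angle in zip(counts, angles):
    print(f"{count} people -> {angle:.0f} degrees")
print("Check: the angles add up to", sum(angles))
```

Adding up the angles is a quick check before drawing: they should always total 360 degrees.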
Now try this 4
Draw a pie chart to represent the way in which students in your class have travelled to college.
Page 10
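[Figure: annotated stem and leaf diagram of test marks with levels 1 to 6, showing L = 12, H = 71, n = 10 and a key of the form "3 | 3 represents 33 marks"; the value 68 appears twice]
Annotations on the diagram:
n indicates the total number of data items.
Each number on the stem is a level, for example the 20 level.
7 is a leaf on the 50 level, indicating the data value 57.
The key enables you to translate the level and leaf into a data value.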
L indicates the lowest value and H the highest value. (The difference between H and L is the range.)
Note that at each level the leaves are ordered, increasing as they move away from the stem. This makes it easier to find the middle (median) value.
Note also that repeated data values (here 68) are recorded separately.
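As a rough sketch of how the levels and leaves are built, the code below splits each mark into its tens (the level) and its units (the leaf) and keeps the leaves in order. The marks are made up for illustration and are not the data from the diagram; the function name `stem_and_leaf` is also just a label for this example.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group whole-number data into levels (tens) with ordered leaves (units)."""
    levels = defaultdict(list)
    for value in sorted(values):          # sorting first keeps each level's leaves in order
        levels[value // 10].append(value % 10)
    return dict(levels)


# Made-up marks for illustration only.
marks = [57, 34, 68, 12, 45, 68, 23, 46, 35, 61]

plot = stem_and_leaf(marks)
print(f"L = {min(marks)}   H = {max(marks)}   n = {len(marks)}")
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")
print("Key: 3 | 4 represents 34 marks")
```

Notice that the repeated value 68 gives two separate leaves on the 60 level, as described above.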
Page 11
Now try this 5
Complete the stem and leaf plot for the following data:
10 11 12 14 21 22 24 45 45 47 48 55 56
L =        H =        n =
1 |
2 |
3 |
4 |
5 |
Key: 1 | 1 represents 11
Using stem and leaf plots to compare data (higher tier)
A stem and leaf plot helps us to compare visually two different but related data sets. To do this we need to construct a back-to-back stem and leaf plot.
Example
In a module 2 test the same 10 students as in the previous example scored the following results:
25 33 40 42 43 45 56 57 57 69
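Add this data to the first stem and leaf diagram (from the above example) to form a back-to-back stem plot.
[Figure: back-to-back stem and leaf plot on the stem 1 to 6, with the module 2 leaves to the left and the leaves from the first example to the right; n = 10 on each side; key: 2 | 3 represents 23 marks scored]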
The back to back stem and leaf plot uses one stem but has two sets of leaves, one to the right and one to the left. Remember the leaves are ordered so that larger leaves are further away from the stem.
Page 12
By looking at the back-to-back stem and leaf plot we can see that the module 2 test was probably easier, or that the students were better prepared, as students scored better marks.
Now try this 6 (Higher tier)
The table below gives the annual rates of inflation for 10 countries in 1992 and 1991. Complete the back-to-back stem and leaf plot and comment on your results.
Country       1992 % change   1991 % change
UK                 4.1             8.9
Australia          1.5             6.9
Canada             1.6             3.9
France             2.9             3.4
Germany            4.0             2.8
Italy              6.1             6.5
Japan              2.2             3.3
Netherlands        4.1             2.8
Spain              5.5             6.6
USA                2.6             5.7
Page 13
Handling data 2
Mean, median and mode
There are three types of average: mean, mode and median.

Key words
Mean, Median, Mode, Range, Frequency

Mode
This is the value that occurs the most.

Median
This is the value in the middle, when all the numbers have been put into numerical order.

Mean
This is the one that we normally think of when we are asked to find the average.
Mean = (Total of the scores) / (No of scores)
This can be expressed mathematically as
$\bar{x} = \frac{\sum x}{n}$
where $\bar{x}$ represents the mean of x and n the number of values.

Range
The range is also an important piece of information. It tells us how spread out the information is.
Range = largest - smallest

Example
Here are the scores that 5 people get for a test.
6 7 5 6 6
Find (a) the mean (b) the mode (c) the median score and (d) the range.
(a) Mean = (6 + 7 + 5 + 6 + 6) / 5 = 30 / 5 = 6
(b) Mode: 6 occurs the most (3 times), so the mode is 6.
(c) Median
Step 1: put in numerical order: 5, 6, 6, 6, 7
Step 2: identify the middle number: the third of the five values in 5, 6, 6, 6, 7
So the median is 6.
(d) The range is 7 - 5 = 2.
Page 14
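If you want to check answers like these, Python's standard `statistics` module has functions for all three averages. This is just a checking sketch using the five test scores from the example; the variable names are chosen for illustration.

```python
import statistics

scores = [6, 7, 5, 6, 6]            # the five test scores from the example

mean = statistics.mean(scores)      # total of the scores / number of scores
median = statistics.median(scores)  # middle value once the scores are in order
mode = statistics.mode(scores)      # the value that occurs the most
data_range = max(scores) - min(scores)

print("mean =", mean)
print("median =", median)
print("mode =", mode)
print("range =", data_range)
```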
Now try these 7
1. Five other people take the same test and their scores are:
5 3 5 5 7
Find (a) the mean (b) the mode (c) the median score and (d) the range.
2. The following are the midday temperatures over a week:
23°C 24°C 25°C 26°C 20°C 23°C 27°C
Find (a) the mean (b) the mode (c) the median and (d) the range.
3. Find the mean of these numbers:
200 400 200 100 100
Add five to each of the numbers. Now find the mean:
205 405 205 105 105
Subtract 10 from each of the original numbers and find the mean:
190 390 190 90 90
Add 23 to each of the original numbers. Can you find the mean without doing a calculation?
Multiply the original numbers by 2 and find the mean:
400 800 400 200 200
Page 15
To find the range from this data we can see that the number of children ranged from 0 to 6, so the range would be 6 - 0 = 6.
To find the mode, we have to look for the highest frequency; here it is 37. So more houses have 0 children than any other number, hence the mode is 0.
Finding the mean is more complicated. We need to find the total number of children and the total number of houses. To find the total number of houses we need to add up all the frequencies = 120. The best way to find the total number of children is to redraw the table vertically and add another column:
No of children x   Frequency f   Children × Houses fx
0                  37            37 × 0 = 0
1                  23            23 × 1 = 23
2                  34            68
3                  18            54
4                  5             20
5                  2             10
6                  1             6
Totals             $\sum f = 120$   $\sum fx = 181$
Mean = $\frac{\sum fx}{\sum f}$ = 181 / 120 = 1.5 children per house (to 1 decimal place).
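The extra fx column can be reproduced in a few lines. The sketch below uses the numbers of children and the frequencies from the table above; the variable names are chosen for illustration only.

```python
children = [0, 1, 2, 3, 4, 5, 6]        # x: number of children
houses = [37, 23, 34, 18, 5, 2, 1]      # f: number of houses (frequency)

# The fx column: children multiplied by houses for each row.
fx = [x * f for x, f in zip(children, houses)]

total_f = sum(houses)                   # total number of houses
total_fx = sum(fx)                      # total number of children
mean = total_fx / total_f

# The mode is the x value with the highest frequency.
mode = children[houses.index(max(houses))]

print("fx column:", fx)
print("sum of f =", total_f, " sum of fx =", total_fx)
print(f"mean = {mean:.2f} children per house, mode = {mode}")
```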
Find (a) the range (b) the mode and (c) the mean number of children per family.
Page 16
2. An agricultural researcher counted the number of peas in a pod in a certain strain as follows:
No of peas   3   4   5   6   7   8
No of pods   5   5  20  35  25  10
Find (a) the range (b) the mode and (c) the mean number of peas in a pod.
Page 17
The problem that we have here is that we cannot multiply 2 × 0-400. We don't know exactly the life in hours for each bulb. So, we have to estimate its lifespan, by taking the midpoint of the group and using this to multiply the number of bulbs. The best way to do this is to redraw the table vertically with some extra columns:
Hours        Midpoint x   No of bulbs f
0-400                     2
400-800                   5
800-1200                  7
1200-1600                 5
1600-2000                 1
Totals                    $\sum f = 20$
To find the midpoint add the first and last value and divide by 2:
(0 + 400) / 2 = 200
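Here is a minimal sketch of the whole estimate, assuming the method described above (midpoint of each group multiplied by its frequency, then divided by the total frequency). The function name `estimated_mean` is made up for this example; the groups are the light bulb data from the table.

```python
def estimated_mean(groups):
    """Estimate the mean of grouped data.

    groups is a list of (lower, upper, frequency) tuples.
    Each group is represented by its midpoint, so the result is only an estimate.
    """
    total_f = sum(f for _, _, f in groups)
    total_fx = sum((lower + upper) / 2 * f for lower, upper, f in groups)
    return total_fx / total_f


# Hours of life and number of bulbs from the table.
bulbs = [(0, 400, 2), (400, 800, 5), (800, 1200, 7), (1200, 1600, 5), (1600, 2000, 1)]

for lower, upper, f in bulbs:
    midpoint = (lower + upper) / 2
    print(f"{lower}-{upper}: midpoint {midpoint:.0f}, fx = {midpoint * f:.0f}")
print("Estimated mean life =", estimated_mean(bulbs), "hours")
```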
Page 18
Now try these 9
1. Andrew did a survey at the seaside for his science coursework. He measured the lengths of 55 pieces of seaweed. The results of the survey are shown in the table.
Length of seaweed (L cm)   Frequency
0 < L ≤ 20                 2
20 < L ≤ 40                22
40 < L ≤ 60                13
60 < L ≤ 80                10
80 < L ≤ 100               5
100 < L ≤ 120              2
120 < L ≤ 140              1
Andrew needs to calculate an estimate for the mean length of the pieces of seaweed. Work out an estimate for the mean length of the pieces of seaweed.
10 6 17 16 3.1 12.8 18.1 10.8 8.3 15.7 14 3.7 11.5 9.4 21.7 8
[Table: tally chart with columns Tally and Frequency]
(b) Write down the modal class interval.
(c) Calculate the mean time.
Page 19
Example
The midday temperatures (in °C) for 11 cities around the world are:
13 12 5 34 8 10 11 4 25 23 36
Draw a boxplot to illustrate this data.
First, put the data into order and then locate the median, LQ and UQ:
4 5 8 10 11 12 13 23 25 34 36
LQ = 8 (the 3rd value), Median = 12 (the 6th value), UQ = 25 (the 9th value)
[Figure: boxplot of the temperatures drawn above a temperature scale]
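The values needed for the boxplot can be checked with Python's `statistics.quantiles`. This sketch assumes the quartiles are taken at positions (n + 1)/4, (n + 1)/2 and 3(n + 1)/4 in the ordered list, which is what the function's "exclusive" method does and which matches the 3rd, 6th and 9th values of these 11 temperatures. Other quartile conventions exist and can give slightly different answers for other data sets.

```python
import statistics

temperatures = [13, 12, 5, 34, 8, 10, 11, 4, 25, 23, 36]

# quantiles() sorts the data itself; n=4 asks for the three quartile cut points.
lq, median, uq = statistics.quantiles(temperatures, n=4, method="exclusive")

lowest = min(temperatures)
highest = max(temperatures)
iqr = uq - lq                       # inter-quartile range

print("lowest =", lowest)
print("LQ =", lq)
print("median =", median)
print("UQ =", uq)
print("highest =", highest)
print("inter-quartile range =", iqr)
```

These five values (lowest, LQ, median, UQ, highest) are exactly what is marked on the boxplot.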
Page 20
Now try these 10
1. The following boxplot shows the class scores on a GCSE Maths mock paper.
[Figure: boxplot drawn above a scale of marks from 10 to 60]
Find:
the median mark
the lower quartile
the upper quartile
the inter-quartile range
the highest mark
the lowest mark
the range of the marks
2. (a) The following are the shoe sizes of eleven children: 1 6 2 5 5 6 4
Draw a boxplot to illustrate the data using the scale below:
(b) The following are the times in minutes taken to evacuate a building over 15 different fire tests: 5 8 6 8 6 4 5 6 6 7 4 7 7 5 8
Page 21
Now try these 11
The stem and leaf diagram shows the ages of students in a maths group.
[Figure: stem and leaf diagram with levels 0, 10, 20, 30, 40, 50 and 60, blanks for L, H and n, and the leaves 6 7 7 8 8 9 9 9 9 1 1 4 5 2 6]
How many students are there in the class?
How old is the oldest student?
How old is the youngest student?
What is the range?
What is the modal age?
What is the median age?
Find the mean age.
Draw a box plot to illustrate the data.
Finished early? Can you find 4 numbers that have a mode of 1, a median of 2 and a mean of 3? Now have a look at Handling data sheet 2 on page 40
Page 22
Handling data 3
Scatterplots
Scatterplots (sometimes called scattergraphs, scattergrams or scatter diagrams) are ways of displaying two variables. They can be used to see if there is some link between the two sets of data.
Key words
Scatterplot, Scatter diagram, Scattergraph, Variable, Correlation
Match each person to the correct number on the scattergraph.
Example
A survey was carried out by a group of students in which the height and weight of each student was measured. The results were recorded in pairs (e.g. the student with height 164cm weighed 58.2kg).
Height (cm)   164    152    173    158    177    173    179    168
Mass (kg)     58.2   50.8   60.3   56.0   76.2   64.2   68.8   60.5
In order to display this data on a scatter graph, two axes are drawn, one for the heights and one for the weights. (It does not really matter which is which, but, as a general rule, the first set of data is recorded along the horizontal axis and the second along the vertical axis). Each point is plotted using the paired data as the co-ordinates, i.e. for the student with height 164cm and weight 58.2 kg, the co-ordinates are (164, 58.2).
[Figure: scatter graph showing height/mass of students, with mass/kg on the vertical axis]
Correlation
Looking at the scatterplot above we can see that there is a link between a person's height and their weight. In general the taller someone is the more they weigh. If there is a link we say that there is a correlation. There are three types of correlation: positive, zero and negative.
[Figures: three scatter diagrams showing positive, zero and negative correlation]
Positive
When there is a positive correlation, as the x value increases so too does the y value. An example of positive correlation might be the number of police cameras and the number of speeding convictions; or height and foot size; or the amount of time spent revising and the marks in a maths exam.
Zero
Zero correlation shows that there may well not be a link or relationship between the two variables. For example, IQ and height, or the amount of food eaten and the marks on a maths exam.
Negative
If the y value decreases as the x value increases then it has a negative correlation. An example of a negative correlation might be the age of a computer and its value; or the amount of time spent watching TV and the marks in maths coursework.
Strong, moderate and weak correlation
The correlation can also be strong, moderate or weak. The diagrams below give examples of each for a negative correlation:
[Figures: three scatter diagrams showing strong, moderate and weak negative correlation]
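In this booklet correlation is judged by eye from the scatterplot. As an optional cross-check, and going a little beyond what is covered here, a correlation coefficient can be calculated: it is a number between -1 and 1 that is positive for positive correlation, negative for negative correlation, and close to 0 when there is no linear link. The sketch below uses the standard (Pearson) formula on the height and mass data from the scatterplot example; the function name `correlation` is just a label for this example.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation coefficient of paired data (a value from -1 to 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Height (cm) and mass (kg) pairs from the scatterplot example.
heights = [164, 152, 173, 158, 177, 173, 179, 168]
masses = [58.2, 50.8, 60.3, 56.0, 76.2, 64.2, 68.8, 60.5]

r = correlation(heights, masses)
kind = "positive" if r > 0 else "negative" if r < 0 else "zero"
print(f"r = {r:.2f}, so the correlation is {kind}")
```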
Page 24
Now try these 12
1. The diagram shows three different types of scatter graphs.
Describe each of the different kinds of correlation.
The diagrams represent these three situations:
(a) the age of cars plotted against their value.
(b) the number of rooms in a house plotted against the value of the house.
(c) the age of adults plotted against their weight.
Which diagrams represent each of the situations?
2. For each of the following decide if there is a correlation and what sort of correlation it might be:
(a) Number of people in a lift and the weight of the lift
(b) Shoe size of students and the number of brothers and sisters they have
(c) Marks in maths test and marks in a science test
(d) Speed of a car and the time it takes to stop
(e) Speed of a car and the time taken to travel 10 km
(f) Temperature of the room and the time taken for an ice cube to melt
(g) The height of a student and the time taken to do maths homework
(h) The time taken to revise for a test and the test results
Page 25
Now try these 13
Draw scatter graphs for the following data. State the type of correlation (none, positive, negative) and give some indication of the degree of correlation (strong, moderate, weak).
(a)
Mark on module 1:   15   20   14   5    24   10
Mark on module 2:   22   34   50   20   66   32
(b)
Age (years)        2      5      10    4      8      9
Price of car (£)   3250   1500   220   2400   1200   900
(c)
Shoe size   5    9    7    6    5    10
Handspan    17   21   20   20   18   22
To draw the line of best fit we must first plot the data on a scattergraph and then draw the line of best fit by eye.
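In this booklet the line of best fit is drawn by eye. If you want something to compare your line with, one common calculated alternative is the least-squares line, which is a different technique from drawing by eye. The sketch below applies it to the height and mass data from the scatterplot example; the function names `least_squares_line` and `predict_mass` are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x   # forces the line through the mean point
    return slope, intercept


# Height (cm) and mass (kg) pairs from the scatterplot example.
heights = [164, 152, 173, 158, 177, 173, 179, 168]
masses = [58.2, 50.8, 60.3, 56.0, 76.2, 64.2, 68.8, 60.5]

slope, intercept = least_squares_line(heights, masses)


def predict_mass(height):
    """Read a mass off the calculated line for a given height."""
    return slope * height + intercept


print(f"mass = {slope:.3f} x height + ({intercept:.1f})")
print(f"Predicted mass for a height of 170 cm: {predict_mass(170):.1f} kg")
```

The calculated line passes through the point (mean height, mean mass), which is also a good point to aim for when drawing a line by eye.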
Page 26
We can then use the line of best fit to predict someone's score in one paper if we know the score in another. For example, if a student scored 80 in the Algebra, what would the score be for Handling data? Go to the 80 on the algebra axis, go up until you hit the line and read off the corresponding Handling data value (75 marks).
Now try these 14
1. The scatter graph shows the height and mass of students. Draw a line of best fit on the graph. Use the line of best fit to estimate the height of someone who has a mass of 60 kg.
Page 27
What's happening here? What type of correlation is it? Does it mean that the more lemons that are imported, the fewer road fatalities there are?
Page 28
Frequency polygons
Ungrouped data
Example
The results of a survey of 100 households are given in the table.
Number of people in household   1    2    3    4    5    6
Frequency                       11   28   21   25   10   5
Draw a frequency polygon to represent these data.
[Figure: Frequency Polygon to show no. of people in household]
Now try this 15
1. The frequency distribution of the heights of some students is shown.
Height (cm)   130-   140-   150-   160-   170-
Frequency     1      6      13     10     2
Draw a frequency polygon to illustrate the data.
NB 140- means 140 or more but less than 150
Page 29
Grouped data
Similarly, for grouped data, the frequencies are plotted against the mid-point of the class interval and the points are joined with straight lines.
Example
The following table shows the heights of 65 people grouped into class intervals.
Height (cm)      Frequency (no. of people)
150 < h ≤ 160    4
160 < h ≤ 170    7
170 < h ≤ 180    15
180 < h ≤ 190    47
190 < h ≤ 200    6
200 < h ≤ 210    1
In order to draw a frequency polygon, we first need to find the mid-points of these intervals, e.g. to find the mid-point of 150 < h ≤ 160, add the two values together and divide by 2: (150 + 160) / 2 = 155.
We can then draw the frequency polygon, plotting the heights against the frequency, as shown below.
[Figure: frequency polygon with Frequency (0 to 50) on the vertical axis and Height (cm) (140 to 220) on the horizontal axis]
This frequency polygon has been completed by joining the first point to (145, 0) and the last point to (215, 0).
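The points to plot can be listed with a short sketch. It assumes all the class intervals have the same width, so that the polygon is closed one interval-width before the first mid-point and one after the last, as in the example. The function name `polygon_points` is made up for illustration; the intervals and frequencies are those from the table.

```python
def polygon_points(intervals, frequencies):
    """Return the (mid-point, frequency) points of a frequency polygon,
    including the two zero-frequency points that close it off.

    Assumes every class interval has the same width.
    """
    midpoints = [(lower + upper) / 2 for lower, upper in intervals]
    width = intervals[0][1] - intervals[0][0]
    points = list(zip(midpoints, frequencies))
    start = (midpoints[0] - width, 0)     # one interval before the first mid-point
    end = (midpoints[-1] + width, 0)      # one interval after the last mid-point
    return [start] + points + [end]


# Class intervals (cm) and frequencies from the table.
intervals = [(150, 160), (160, 170), (170, 180), (180, 190), (190, 200), (200, 210)]
frequencies = [4, 7, 15, 47, 6, 1]

points = polygon_points(intervals, frequencies)
for height, frequency in points:
    print(f"({height:.0f}, {frequency})")
```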
Page 30
Example
The mock examinations results in Mathematics for two GCSE groups in two successive years are recorded on the table below.
Mark                    1-20   21-40   41-60   61-80   81-100
Group 1 (% frequency)   5      12      35      28      20
Group 2 (% frequency)   7      26      48      9       10
i) Draw the frequency polygon for each group.
ii) Comment on the mock examination papers, assuming that the ability of the pupils was the same in each group.
Solution
(i) In this example, in order to draw the frequency polygon, the percentage frequencies could be plotted against the class mid-points, which are: 10.5, 30.5, 50.5, 70.5 and 90.5. It is not really necessary to plot the points to such a high degree of accuracy so the values 10, 30, 50, 70 and 90 can be used.
(ii) Group 2 appear to have been given a more difficult examination paper than Group 1 because a smaller percentage of people in Group 2 obtained high marks in the examination. [The average mark in Group 2 was lower than in Group 1.]
[Figure: frequency polygons for Group 1 and Group 2, with Frequency (0 to 60) on the vertical axis and Mark (0 to 100) on the horizontal axis]
Now try these 16
2. A teacher noted the absence rates of her maths class on Mondays and Fridays. The results are given on the table below.
No absent from class   0   1   2   3   4   5   6   7   8   9   10
Monday Frequency       3   6   6   7   4   4   3   0   0   0   0
Friday Frequency       0   2   2   3   4   6   3   0   7   3   2
(i) Draw the frequency polygon for each day, using the same axes.
(ii) Comment on the absence rates of the two days.
Page 31
Two-way tables
Completing two-way tables
This is a typical (incomplete) two-way table.
[Table: incomplete two-way table of tea type (tea bags, packet tea, instant tea) against pack size (50 g, 100 g, 200 g), with row and column totals]
How many people buy 100 g tea bags? How many buy 100 g packet tea?
The table is part of a GCSE question. It reads:
Bob carried out a survey of 100 people who buy tea. He asked them about the tea they buy most. The two-way table gives some information about his results. Complete the two-way table.
To complete the table we have to look at rows and columns that only have one missing figure.
Look at the first column. How many people in total have tea bags? 2 + 35 + 15 = 52
Look at the second column. How many have 200 g packet tea? 25 (in total) - (20 + 0) = 5
We can now complete some of the table: the tea bags total is 52 and the 200 g packet tea figure is 5.
Looking at the first row we can now find out how many in total use 50 g tea: 2 + 0 + 5 = 7
Looking at the second row we can find out how many have 100 g instant tea: 60 - (35 + 20) = 60 - 55 = 5
Adding these we get a table that also shows the 50 g total of 7 and the 100 g instant tea figure of 5.
Page 32
We can now find the missing total in the end column: 100 - (60 + 7) = 33
And the missing total in the bottom row: 100 - (52 + 25) = 23
This only leaves the 200 g instant tea figure to find. We can do this in two ways, which is a good way to check the figures:
From the column: 23 - (5 + 5) = 13
From the row: 33 - (15 + 5) = 13
The answers agree. The completed table will then look like:
         Tea bags   Packet tea   Instant tea   Total
50 g     2          0            5             7
100 g    35         20           5             60
200 g    15         5            13            33
Total    52         25           23            100
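The rule "look for a row or column with only one missing figure" can be repeated automatically until the table is full. The sketch below is one way to do that, not part of the GCSE question itself. The starting grid is a reconstruction from the worked solution above (an assumption, since the original table is a figure), with `None` marking each missing figure and the last row and column holding the totals. The function name `complete_two_way` is made up for this example.

```python
def complete_two_way(table):
    """Fill in the None cells of a two-way table.

    The last entry of every row and every column is the total of the others.
    Repeatedly finds a row or column with exactly one missing figure and fills it.
    """
    grid = [row[:] for row in table]           # work on a copy
    n_rows, n_cols = len(grid), len(grid[0])

    def fill(cells):
        """cells is a list of (row, col) positions; the last one is the total."""
        missing = [(r, c) for r, c in cells if grid[r][c] is None]
        if len(missing) != 1:
            return False
        r, c = missing[0]
        known_parts = [grid[i][j] for i, j in cells[:-1] if (i, j) != (r, c)]
        if (r, c) == cells[-1]:
            grid[r][c] = sum(known_parts)                       # missing total
        else:
            total_r, total_c = cells[-1]
            grid[r][c] = grid[total_r][total_c] - sum(known_parts)  # missing part
        return True

    changed = True
    while changed:
        changed = False
        for r in range(n_rows):
            changed |= fill([(r, c) for c in range(n_cols)])
        for c in range(n_cols):
            changed |= fill([(r, c) for r in range(n_rows)])
    return grid


# Columns: tea bags, packet tea, instant tea, total. Rows: 50 g, 100 g, 200 g, total.
# Reconstructed starting table (assumption based on the worked solution).
incomplete = [
    [2, 0, 5, None],
    [35, 20, None, 60],
    [15, None, None, None],
    [None, 25, None, 100],
]

completed = complete_two_way(incomplete)
for row in completed:
    print(row)
```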
Page 33
How many males went to France? How many males went to Spain? How many males were there in total? Complete the two-way table.
2.
Page 34
2.
Page 35
3.
Page 36
4.
5.
Page 37
6.
Page 38
7.
8.
Page 39
Page 40
Page 41
1.
2.
3.
Page 42
Page 43
Page 44