
Data Handling

Collecting Data
Learning Outcomes

Understand terms: sample, population, discrete, continuous and variable

Understand the need for different sampling techniques including
random and stratified sampling and be able to generate random
numbers with a calculator or computer to obtain a sample

Be able to design a questionnaire (taking bias into account)

Understand the need for grouping data and the importance of
class limits and class boundaries when doing so

DH - Collecting Data


Sample:
A sample is a subset of the population. For example, class 11A is a subset
of each of the following populations: Year 11, senior pupils, pupils of
St Mary's.
Population:
The complete set of individuals or objects being analysed; the population
is defined by the user. E.g. pupils in a school, people in a town, people in
a postal code.
Discrete:
A discrete variable is often associated with a count. It can only take
certain values, usually whole numbers.
E.g. number of children in a family, number of cars in a street, number of
people in a class.


Continuous:
A continuous variable is often associated with a measurement. It can
take any value in a given range.
E.g. height, weight, time.
Variable:
A variable is the quantity being counted or measured; it is either discrete
or continuous (see above).


Random Sampling:
In simple random sampling every member of the population is given a
number. If the population has 1000 members, they will each be given a
number between 000 and 999 (inclusive); 3-digit random numbers are then
used to select the sample (ignoring repeats).
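
The numbering procedure above can be sketched in Python. This is only an illustration: the function name `simple_random_sample`, the population size of 1000, the sample size of 10 and the seed are assumptions chosen for the example, not part of the notes.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw random labels one at a time, ignoring repeats, until the sample is full."""
    if sample_size > population_size:
        raise ValueError("sample cannot be larger than the population")
    rng = random.Random(seed)          # seed only makes the run repeatable
    chosen = []
    while len(chosen) < sample_size:
        number = rng.randrange(population_size)  # 0 .. population_size - 1
        if number not in chosen:                 # ignore repeats
            chosen.append(number)
    # Show each label as a 3-digit number, e.g. 7 -> "007"
    return [f"{n:03d}" for n in chosen]

sample = simple_random_sample(1000, 10, seed=1)
print(sample)
```

A calculator's random number key does the same job by hand: read off 3-digit numbers and skip any that have already appeared.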

Stratified Sample:
Often data is collected in sections (strata), e.g. the number of pupils in
each year of a school. In selecting such a sample, each stratum is sampled
in proportion to its share of the total population. Here we should sample
twice as many pupils from Year 10 as from Year 8.

Year     No. of Pupils
8        100
9        50
10       200
11       200
12       150
Total    700


Stratified Sample:

To obtain a sample of 70 pupils out of the 700, we construct the
following table.

Year     No. of Pupils   Proportion of total   No. of pupils to be sampled
8        100             100/700 = 1/7         1/7 × 70 = 10
9        50              50/700 = 1/14         1/14 × 70 = 5
10       200             200/700 = 2/7         2/7 × 70 = 20
11       200             200/700 = 2/7         2/7 × 70 = 20
12       150             150/700 = 3/14        3/14 × 70 = 15
Total    700                                   70
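
The proportional allocation can be checked with a short Python sketch. The function name `stratified_allocation` is illustrative, and the year labels 8 and 9 are assumed from the sequence of year groups; the pupil counts and the sample size of 70 come from the notes.

```python
from fractions import Fraction

def stratified_allocation(strata_sizes, sample_size):
    """Give each stratum a share of the sample equal to its share of the population."""
    total = sum(strata_sizes.values())
    return {
        name: Fraction(size, total) * sample_size
        for name, size in strata_sizes.items()
    }

# Year labels 8 and 9 are an assumption; the counts are from the table.
pupils = {8: 100, 9: 50, 10: 200, 11: 200, 12: 150}
allocation = stratified_allocation(pupils, 70)

for year, count in allocation.items():
    print(f"Year {year}: {count} pupils")
```

Exact fractions are used so that an allocation which is not a whole number shows up clearly and can be rounded by hand.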


Questionnaires

1. The sample should represent the population.
2. The sample must be of a reasonable size to represent the population
(at least 30), so that the sample mean is close to the population mean.
3. Questions should:
i) be as short as possible
ii) use tick boxes
iii) avoid bias
iv) avoid leading questions

Additional Notes

Data Handling
Collecting Data
Learning Outcomes:
At the end of the topic I will be able to:          Can Do    Revise Further

Understand terms: sample, population, discrete, continuous and variable

Understand the need for different sampling techniques
including random and stratified sampling and be able to
generate random numbers with a calculator or computer
to obtain a sample

Be able to design a questionnaire (taking bias into account)

Understand the need for grouping data and the
importance of class limits and class boundaries

Data Handling
Analysing Data
Learning Outcomes

Understand that in order to gain a mental picture of a collection
of data it is necessary to obtain a measure of average and range

Be able to determine the mean, median and mode for a set of
raw scores and an ungrouped frequency table

Be able to obtain the median and interquartile range for grouped
data from a cumulative frequency graph

Understand the advantages and disadvantages of each average
and measure of spread

DH - Analysing Data

Measures of
Central Tendency

Mean
Sum of all measures divided by the total number of measures.

$\bar{x} = \frac{\sum x}{n}$

every value is included
affected by extreme values

Mode
The most popular / most frequently occurring value.
not every value is included
not affected by extreme values

Median
Arrange the data in ascending order; the median is the middle
measure. Position = $\frac{1}{2}(n + 1)$
not every value is included
not affected by extreme values

Examples
Calculate the Mean, Median and Mode for:
a) 3, 4, 5, 6, 6,

b) 2.4, 2.4, 2.5, 2.6

* In a normal distribution the mean, median and mode are close together,
e.g. example b).
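
As a check on example b), the three averages can be computed with Python's standard `statistics` module. This is a sketch only; the helper name `summarise` is illustrative, and the median position uses the ½(n + 1) rule.

```python
from statistics import mean, median, mode

def summarise(values):
    """Return the mean, median, mode and median position for raw scores."""
    ordered = sorted(values)            # the median needs the data in ascending order
    n = len(ordered)
    return {
        "mean": mean(ordered),
        "median": median(ordered),
        "mode": mode(ordered),
        "median_position": (n + 1) / 2,  # e.g. 2.5 means halfway between 2nd and 3rd
    }

example_b = [2.4, 2.4, 2.5, 2.6]
summary = summarise(example_b)
print(summary)
```

For example b) the mean is about 2.475, the median about 2.45 and the mode 2.4, which illustrates the three averages lying close together.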


Frequency Distribution

The number of children in each of 30 families was surveyed.
The results are given below.
Calculate
a) The mean number of children per family
b) The median

No. of children (x)    No. of families (f)
(table entries missing from the source; only the value 10 survives)
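
For an ungrouped frequency table the mean is Σfx / Σf, and the median is the value at position ½(n + 1). The sketch below shows the method only: the table `hypothetical_table` is invented for illustration (its frequencies add up to 30 families) and is not the data from the exercise.

```python
from statistics import median

def frequency_table_mean(table):
    """Mean = (sum of f * x) / (sum of f) for a {value: frequency} table."""
    total_f = sum(table.values())
    total_fx = sum(x * f for x, f in table.items())
    return total_fx / total_f

def frequency_table_median(table):
    """Write every value out f times in order, then take the middle one."""
    expanded = [x for x, f in sorted(table.items()) for _ in range(f)]
    return median(expanded)

# HYPOTHETICAL data: number of children (x) -> number of families (f)
hypothetical_table = {0: 3, 1: 7, 2: 10, 3: 6, 4: 4}

table_mean = frequency_table_mean(hypothetical_table)
table_median = frequency_table_median(hypothetical_table)
print(round(table_mean, 2), table_median)
```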

Grouped Frequency
Distribution


Often data is grouped so that patterns and the shape of the distribution can be
seen. Class widths are often equal, although there is no fixed rule.

Find the mean of:

Mark        Frequency (f)   Midpoint (x)   fx
30 – 34     7
40 – 49     14
50 – 59     21
60 – 69     9
            Σf = 51
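
The grouped mean is estimated by using each class midpoint as x and computing Σfx / Σf. The sketch below takes the class limits as printed in the notes; the first frequency (7) is not printed and is assumed here from Σf = 51. The function name `grouped_mean` is illustrative.

```python
def grouped_mean(classes):
    """Estimate the mean from (lower limit, upper limit, frequency) triples."""
    total_f = sum(f for _, _, f in classes)
    # The midpoint of each class stands in for every value in that class.
    total_fx = sum((lower + upper) / 2 * f for lower, upper, f in classes)
    return total_fx / total_f

# Class limits as printed; the frequency 7 is inferred from the total of 51.
marks = [(30, 34, 7), (40, 49, 14), (50, 59, 21), (60, 69, 9)]

estimate = grouped_mean(marks)
print(round(estimate, 2))
```

The result is only an estimate, because the individual marks inside each class are not known.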

Cumulative
Frequency Curves


Find the median of the following grouped frequency distribution.


Length      Frequency   Cumulative Frequency   Upper Limit
21 – 24
25 – 28
29 – 32     12
33 – 36
37 – 40


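Median = measure of central location

$Q_1 = \tfrac{1}{4}(n + 1) = 8.25$th value, read from the curve as 26
$Q_2 = \tfrac{1}{2}(n + 1) = 16.5$th value, read from the curve as 30
$Q_3 = \tfrac{3}{4}(n + 1) = 24.75$th value, read from the curve as 33

Interquartile range = measure of spread
$Q_3 - Q_1 = 33 - 26 = 7$

$Q_1$ = 25th percentile
$Q_3$ = 75th percentile

[Figure: cumulative frequency curve; vertical axis: cumulative frequency, horizontal axis: upper limit; Q1, Q2 and Q3 marked]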
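
The quartile positions can be checked in a few lines of Python. The value n = 32 is an assumption inferred from the positions 8.25, 16.5 and 24.75; the quartile readings 26 and 33 are the ones given in the notes, and the function name `quartile_positions` is illustrative.

```python
def quartile_positions(n):
    """Positions of Q1, Q2 and Q3 on the cumulative frequency axis, using (n + 1)."""
    return {
        "Q1": (n + 1) / 4,
        "Q2": (n + 1) / 2,
        "Q3": 3 * (n + 1) / 4,
    }

positions = quartile_positions(32)   # n = 32 is inferred, not stated in the notes
print(positions)

# Quartile values as read from the curve in the notes
q1, q3 = 26, 33
iqr = q3 - q1
print("Interquartile range:", iqr)
```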


Additional Notes

Data Handling
Analysing Data
Learning Outcomes:
At the end of the topic I will be able to:          Can Do    Revise Further

Understand that in order to gain a mental picture of a
collection of data it is necessary to obtain a measure
of average and range

Be able to determine the mean, median and mode
for a set of raw scores and an ungrouped frequency
table

Be able to obtain the median and interquartile range
for grouped data from a cumulative frequency graph

Understand the advantages and disadvantages of
each average and measure of spread

Data Handling
Presenting Data
Learning Outcomes

Revise drawing of pie charts, line graphs and bar charts

Be able to present data using a stem and leaf diagram, determine
mean, median and quartiles

Be able to draw a boxplot for a set of values and compare two or more
box and whisker plots with reference to their average, spread and
skewness

Be able to draw a histogram to represent groups with unequal widths

Know which diagram to use to represent data, and the advantages and
disadvantages of each type.

DH - Presenting Data

Box & Whisker Plots

A box & whisker plot illustrates:
a) The range of the data
b) The median of the data
c) The quartiles and interquartile range of the data
d) Any indication of skew within the data

[Figure: box and whisker plot drawn against a scale, with Q1, Q2 and Q3 marked]
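
The five values needed to draw a box and whisker plot can be computed as in the sketch below. It reuses the 17 values listed in the stem and leaf section of these notes; the function name `five_number_summary` is illustrative. The `"exclusive"` method of `statistics.quantiles` places the quartiles at positions ¼(n + 1), ½(n + 1) and ¾(n + 1), which is assumed here to match the convention used in these notes.

```python
from statistics import quantiles

def five_number_summary(values):
    """Minimum, Q1, median, Q3 and maximum: the values marked on a box plot."""
    ordered = sorted(values)
    q1, q2, q3 = quantiles(ordered, n=4, method="exclusive")
    return {"min": ordered[0], "Q1": q1, "median": q2, "Q3": q3, "max": ordered[-1]}

data = [10, 11, 12, 15, 23, 26, 29, 32, 33, 34, 35, 36, 42, 43, 44, 56, 57]
summary = five_number_summary(data)

data_range = summary["max"] - summary["min"]
iqr = summary["Q3"] - summary["Q1"]
print(summary, "range:", data_range, "IQR:", iqr)
```

Comparing the distance from Q1 to the median with the distance from the median to Q3 gives a first indication of skew.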

Scatter Diagrams

Negative Correlation
As x increases, y decreases.

Positive Correlation
As x increases, y increases.

No Correlation
x & y are independent.

* The closer the points lie to a straight line, the stronger the correlation.

Histograms


32 packages were brought to the local post office. The masses of the packages
were recorded as follows.

Mass (g)           0 < m ≤ 30    30 < m ≤ 40    40 < m ≤ 50    50 < m ≤ 90
No. of packages                  10             12

With unequal class widths we draw a histogram.
There are 2 important differences between a bar chart and a histogram:
1. In a bar chart the height of the bar represents the frequency; in a
histogram the area of the bar represents the frequency.
2. In a histogram the x axis is a continuous scale.


When the classes are of unequal width we calculate and plot frequency
density:
$\text{Frequency Density} = \frac{\text{Frequency}}{\text{Class Width}}$

Group           Frequency   Class Width   Frequency Density
0 < m ≤ 30                  30
30 < m ≤ 40     10          10
40 < m ≤ 50     12          10
50 < m ≤ 90                 40
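
The frequency density calculation can be sketched as below. Only the two classes whose frequencies are printed in the notes (10 and 12) are used; the other two frequencies are not given, so they are left out rather than guessed. The function name `frequency_density` is illustrative.

```python
def frequency_density(frequency, lower, upper):
    """Frequency density = frequency / class width."""
    width = upper - lower
    if width <= 0:
        raise ValueError("upper boundary must be greater than lower boundary")
    return frequency / width

# (lower boundary, upper boundary, frequency) for the classes with known frequencies
known_classes = [(30, 40, 10), (40, 50, 12)]

densities = {
    (lower, upper): frequency_density(f, lower, upper)
    for lower, upper, f in known_classes
}
print(densities)
```

On the histogram each bar is drawn with its height equal to the frequency density, so that width × height gives back the frequency.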

Stem & Leaf Diagram


When data are grouped to draw a histogram or a cumulative frequency
distribution, individual results are lost. The advantage of grouping is that
patterns (distribution) can be seen. In a stem and leaf diagram individual
results are retained and the spread / distribution of the data can be seen.
Draw a stem and leaf diagram for the data:
10, 11, 12, 15, 23, 26, 29, 32, 33, 34, 35, 36, 42, 43, 44, 56, 57
Stem   Leaf
1
2
3
4
5
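
One way to check a hand-drawn diagram is the short Python sketch below. It assumes two-digit data with the tens digit as the stem and the units digit as the leaf; the function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group each value's units digit (leaf) under its tens digit (stem)."""
    stems = defaultdict(list)
    for value in sorted(values):         # sorting keeps the leaves in order
        stems[value // 10].append(value % 10)
    return dict(stems)

data = [10, 11, 12, 15, 23, 26, 29, 32, 33, 34, 35, 36, 42, 43, 44, 56, 57]
diagram = stem_and_leaf(data)

for stem, leaves in diagram.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

A key such as "1 | 0 means 10" should accompany the finished diagram so that the stems and leaves can be read back as data values.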


Additional Notes

Data Handling
Presenting Data

Learning Outcomes:                                   Can Do    Revise Further

Revise drawing of pie charts, line graphs and bar charts

Be able to present data using a stem and leaf diagram,
determine mean, median and quartiles

Be able to draw a boxplot for a set of values and
compare two or more box and whisker plots with
reference to their average, spread and skewness

Be able to draw a histogram to represent groups with
unequal widths

Know which diagram to use to represent data, and the
advantages and disadvantages of each type.
