Chapter 5

Uses and Abuses of Statistics


Learning Objectives

17.1 recognise different techniques in survey sampling and the basic principles of questionnaire
design

Remarks
The concepts of populations and samples should be introduced.
Probability sampling and non-probability sampling should be introduced.
Students should recognise that, in constructing questionnaires, factors such as the types, wording and
ordering of questions and response options influence their validity and reliability.

17.2 discuss and recognise the uses and abuses of statistical methods in various daily-life activities or
investigations

17.3 assess statistical investigations presented in different sources such as news media, research
reports, etc.

SECTION A
5.1 Simple Sampling and Data Collection
Q5.1 A student union wants to carry out a survey to find out whether the students prefer a party or a
variety show on Christmas Eve.

Answers written in the margins will not be marked.

(a) What is the population of the survey?

(b) Describe how simple random sampling can be used to select the sample.

Q5.2 State one advantage and one disadvantage of each of the following sampling methods.
(a) Stratified random sampling

(b) Systematic sampling

Math(CP)-2016HKDSE-Book6A-Ch5

Q5.3 The student union of a university wants to study the level of satisfaction with the activities held by
the halls. Suppose 150 resident students are to be selected to form the sample. The following are two
suggested sampling methods.
Method 1
Select resident students randomly from each hall. The number of resident students selected from each hall
is in proportion to the number of resident students in that hall.
Method 2
Obtain the name list of the resident students in the university. Draw the required number of names
randomly from the list.

(a) Identify the above sampling methods.

(b) Which method is more likely to give a representative sample? Why?

(c) State one disadvantage of using the method chosen in (b).

Q5.4 A quality controller of a factory wants to ensure the glass bottles produced have standard weights.
(a) On a certain day, she selected the first 5 glass bottles produced in each hour to form the sample.
Identify the sampling method.

(b) It is given that $\frac{3}{10}$ of the glass bottles in the sample are overweight and the numbers of standard
weight and underweight glass bottles are 19 and 9 respectively.
(i) Find the sample size.

(ii) Find the number of operating hours on that day.

Q5.5 A doctor wants to conduct a study on the relapse rate of lung cancer. The population of the survey is
861 patients who completed treatment 5 years ago. The patients in the population are grouped into
the following categories.

              Male   Female
Smoker        294    231
Non-smoker    189    147

The doctor decides to select 205 patients for his study using stratified random sampling. Find the number
of patients in the sample who are
(a) male smokers,

(b) females,

(c) non-smokers.

Q5.6 A company has 264 male staff and 286 female staff. 150 staff are selected randomly in proportion to
the numbers of male and female staff to complete a questionnaire.
(a) Identify the sampling method applied.

(b) Find the number of male staff selected.

(c) From the responses to the questionnaires, it is found that 25% of the selected male staff and 50% of
the selected female staff have master's degrees. Find the percentage of selected staff who have
master's degrees.

Q5.7 A market researcher wants to find out the sales volume of VCDs in a store. The VCDs in the store
are grouped into the following categories.

Category            Documentary   Romance   Action   Horror   Cartoon   Comedy
Number of shelves

It is given that each shelf contains 100 VCDs. The following are two suggested methods for selecting a
sample of 200 VCDs.
Method 1
Assign a unique code to each VCD, say 1, 2, ..., 1800. Choose a random number from 1 to 9, say n, and
pick the VCDs with codes n, n + 9, n + 18, ... until 200 VCDs are obtained.
Method 2
Select 8 shelves randomly and then select 25 VCDs randomly from each selected shelf.

(a) Identify the above sampling methods.

(b) If Method 1 is used and the random number selected is 7, find the codes of the 25th and the last
VCD selected.

(c) If Method 2 is used to select the sample, suggest two other pairs of reasonable choices for the
number of shelves selected and the number of VCDs to be selected from each shelf.

Q5.8 A quality control manager wants to check the quality of products in a textile factory. On a certain day,
the number of products produced is recorded as follows.

Category            Ties   Suits   Trousers   Shirts   Sweaters
Number of batches   2      3       4          5        2

It is given that each batch contains 150 items and 10% of the population will be selected as the sample.
The following are two suggested methods for selecting the sample.
Method 1
First, select several batches randomly. Then, choose several items randomly from each selected batch.
Method 2
Select items randomly from each category of clothing. The number of items selected from each category
is in proportion to the number of items in that category.

(a) Find the number of items in the sample.

(b) Identify the above sampling methods.

(c) If Method 1 is used to select the sample, suggest a reasonable number of batches to be selected and
the corresponding number of items to be selected from each batch.

(d) If Method 2 is used to select the sample, find the total number of trousers and shirts in the sample.

(e) Which method is more likely to give a representative sample? Explain your answer.

Q5.9 An inspector wants to investigate the amount of bacteria in 100 boxes of facial tissues. He decides to
select 5000 pieces of facial tissues to form the sample.
(a) Which data collection method should be used to collect the data?

(b) Suggest how the following methods can be used to select the sample.
(i) Simple random sampling

(ii) Systematic sampling

Q5.10 A senior executive officer wants to conduct a survey on overtime work by civil service workers. He
decides to take a sample of 315 out of 2520 staff. Suppose the staff are assigned staff numbers from
1 to 2520 respectively.
(a) Suggest how the following methods can be used to select the sample.
(i) Stratified random sampling

(ii) Systematic sampling

(b) If the required sample size is doubled, suggest how the method in (a) can be modified.

(c) State one advantage of using each of the methods in (a) in this situation.

Q5.11 An education researcher wants to carry out a survey on the learning difficulties of S1 students in city
H. The S1 students in the population are grouped into the following categories.

                           Male   Female
Studying in CMI schools    4816   6192
Studying in EMI schools    2064   4128

(a) Find the probability that a selected student studies in an EMI school.

(b) Find the probability that a selected student is a male.

(c) There are 86 secondary schools in city H and the education researcher decides to interview 1000
S1 students. Describe how the following methods can be used to select the sample.
(i) Simple random sampling

(ii) Stratified random sampling

Q5.12 Read the following questions extracted from some questionnaires. State the flaws in the questions
and suggest how they can be improved.
(a) Do you like Michael Jackson or are you a fan of him?
Yes
No

(b) Don't you think there should be more food choices in the tuck shop of our
school?

Q5.13 Compare Q1 and Q2 below.

Q1 Many experts have warned that nuclear leakage would cause great damage to
the environment. Do you think that nuclear energy should be introduced to our
city?
Yes
No
No comments
Q2 It is known that nuclear energy will not cause any air pollution under normal
circumstances. Do you think that nuclear energy should be introduced to our
city?
Yes
No
No comments

What is wrong with these two questions? What do you expect to be the difference in the responses to
these questions?

5.2 Misuse of Methods in Statistical Surveys

Q5.14 Determine whether there are misuses of sampling and data collection methods in the following
cases. Make comments and suggestions if there are any misuses.
(a) The Head of the Space Museum wants to make a rough estimation of the daily number of visitors.
He decides to count the number of visitors on Sundays.

(b) The manager of a company wants to conduct a staff health survey. He randomly selects some staff
for a face-to-face interview.

Q5.15 Determine whether there are misuses of sampling and data collection methods in the following
cases. Make comments and suggestions if there are any misuses.
(a) The principal of a kindergarten wants to study the group interaction of her students. She randomly
selects some students from different classes to join a playgroup. She then observes the students'
behaviour in the playgroup.

(b) Households in Sham Shui Po are randomly selected and interviewed for a study on household
density in Hong Kong.

Q5.16 A university canteen owner wants to find out the reasons why there is a significant drop in the
number of student customers recently. Therefore, he conducts a survey on student customers'
opinions on the canteen.
Plan A: Hand out questionnaires to the student customers in the canteen at randomly selected times.
Plan B: Distribute questionnaires to the students walking past the canteen entrance at randomly
selected times.

Which of the above plans is more appropriate? Discuss briefly.

Q5.17 A District Councillor conducted a survey in his district on citizens' opinions on the increase in
property tax. The results are as follows:

Strongly agree       5%
Agree                12%
No comments          48%
Disagree             20%
Strongly disagree    15%

The District Councillor draws a conclusion for his survey.

"83% of Hong Kong citizens disagree with the increase in property tax."

Based on the statistics and the sampling method, discuss the validity of this conclusion.

END OF SECTION A
SECTION B
*Q5.1 A quality control manager wants to check the quality of the products in a leather factory. On a
certain day, the numbers of batches of different products made are recorded as follows:

Product             Pairs of shoes   Belt   Purse   Wallet   Handbag   Suitcase   Card holder
Number of batches   5                1      4       2        4         3          1

It is given that all batches contain the same number of products. Stratified random sampling is adopted to
choose a sample for quality check and 15% of all the products are selected to form the sample. If the
sample contains 108 suitcases, find
(a) the number of wallets in the sample,

(b) the difference between the number of pairs of shoes and the number of card holders in the sample,

(c) the number of purses in the population,

(d) the number of products in each batch,

(e) the population size.

*Q5.2 All final year students in a college are required to take an oral examination. The examinations can be
taken in either the first or the second semester and there are two teams of examiners, team A and
team B. The distribution of the candidates taking the oral examination this year is as follows:

                      1st semester   2nd semester
Examined by team A    144            196
Examined by team B    96             284

The Head of the examination unit wants to read the examination reports of some candidates. Stratified
random sampling is used to select a sample from the 4 groups described in the table above. It is given that
24 candidates in the sample were examined by team B in the 1st semester.
(a) Find the sample size.

(b) In each semester, if the passing rate in the sample is less than 60%, scoring adjustments will be
made to all the candidates. Refer to the following table and state, for each semester, whether scoring
adjustment is needed and explain your answers.

                                                 1st semester   2nd semester
Number of candidates in the sample who failed    23             51

(c) In order to maintain consistency in scoring, the difference in the passing rates of the candidates
examined by the two teams should not exceed 5%. Refer to the following table and state, with
justifications, whether the scoring of the two teams is consistent.

                                                 Examined by team A   Examined by team B
Number of candidates in the sample who failed    33                   41

*Q5.3 Read the following questions extracted from some questionnaires. State the flaws in the questions
and suggest how they can be improved.
(a) "Food is Heaven" is a Chinese idiom. Which basic necessity of life do you think
the Chinese are most concerned with?
Clothing
Food
Housing
Transport

(b) Don't you disagree that drug testing should be introduced in primary schools or
secondary schools?
Yes
No

*Q5.4 A survey aims to find out the popular leisure activities for students in Hong Kong. 250 students
selected from a school are interviewed face-to-face by an interviewer. From the results of the survey,
the two most popular leisure activities are "Reading" and "Learning".
(a) Determine whether there are misuses of sampling and data collection methods in this case.

(b) Suggest three possible ways to improve the accuracy of the above survey.

*Q5.5 A company decides to test their new product, Slim yogurt, by distributing it to members of different
fitness centres every day for a month. After a month, it designs the following questionnaire and uses
it in a survey on the relationship between eating the product and weight loss.

1. Did you eat Slim yogurt last month?
Yes
No (go to question 3)

2. On average, how often did you eat Slim yogurt?
1–2 times per week
3–4 times per week
5–6 times per week
7 times or above

3. Have you lost weight when compared with a month ago?
Yes
No

Comment on the following extract from the report of the survey.

In the survey, 1000 questionnaires were sent to the members of different fitness
centres. 382 completed questionnaires were collected.
Among the respondents, 95% of those who had Slim yogurt 5 times or more a
week lost weight, while only 5% of those who did not have Slim yogurt lost
weight. Therefore, eating Slim yogurt helps people lose weight.

END OF SECTION B
Marking Scheme
This document was prepared for markers' reference. It should not be regarded as a set of model
answers. Candidates and teachers who were not involved in the marking process are advised to interpret
its contents with care.
(6A05D001)
(a) The population of the survey is all the students in the school.
(b) Randomly select the required number of student ID numbers.
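
The selection described in (b) can be sketched in a few lines of Python. This is only an illustration: the ID range 1 to 1200 and the sample size of 50 are hypothetical figures, not part of the question, and `simple_random_sample` is a helper name introduced here.

```python
import random

def simple_random_sample(student_ids, sample_size, seed=None):
    """Draw sample_size distinct IDs so that every student is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(list(student_ids), sample_size)

# Hypothetical school: student ID numbers 1 to 1200 (an assumption for illustration).
student_ids = range(1, 1201)
sample = simple_random_sample(student_ids, 50, seed=2016)

print(len(sample))       # number of students drawn
print(len(set(sample)))  # equals the line above, since no ID is drawn twice
```

Drawing without replacement matters here: each student can appear in the sample at most once.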
(6A05D002)
(a) Stratified random sampling is more likely to give a representative sample when the population
consists of several sub-groups with different characteristics. However, taking a sample from each
stratum is more time consuming and more expensive.
(b) Systematic sampling ensures that the sample is selected across the population. However, periodicity
in the population may lead to an unrepresentative sample.

(6A05D003)
(a) Method 1: stratified random sampling
Method 2: simple random sampling
(b) Method 1. It ensures that resident students from each hall are included in the sample.
(c) It is more time consuming. / The follow-up analysis is more complicated.
(6A05D004)
(a) systematic sampling
(b) (i) Let x be the sample size.
$\frac{19 + 9}{x} = 1 - \frac{3}{10}$
$x = 40$
The sample size is 40.
(ii) Number of operating hours on that day
$= \frac{40}{5}$
$= 8$
(6A05D005)
(a) Number of patients in the sample who are male smokers
$= 205 \times \frac{294}{861}$
$= 70$

(b) Number of patients in the sample who are females
$= 205 \times \frac{231 + 147}{861}$
$= 90$

(c) Number of patients in the sample who are non-smokers
$= 205 \times \frac{189 + 147}{861}$
$= 80$
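
The proportional allocation used above can be checked with a short Python sketch. The helper name `proportional_allocation` and the dictionary layout are choices made for this illustration; exact fractions are used so that no rounding is hidden.

```python
from fractions import Fraction

def proportional_allocation(strata_sizes, sample_size):
    """Give each stratum the share sample_size * (stratum size / population size)."""
    total = sum(strata_sizes.values())
    return {name: Fraction(sample_size * size, total)
            for name, size in strata_sizes.items()}

# Figures from the table in Q5.5.
patients = {
    ("male", "smoker"): 294,
    ("female", "smoker"): 231,
    ("male", "non-smoker"): 189,
    ("female", "non-smoker"): 147,
}
allocation = proportional_allocation(patients, 205)

male_smokers = allocation[("male", "smoker")]
females = sum(n for (sex, _), n in allocation.items() if sex == "female")
non_smokers = sum(n for (_, habit), n in allocation.items() if habit == "non-smoker")
print(male_smokers, females, non_smokers)  # 70 90 80
```

In this question every share is a whole number. In general a proportional share may be fractional, and some rounding rule would then have to be agreed on.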

(6A05D006)
(a) stratified random sampling
(b) Number of male staff selected
$= 150 \times \frac{264}{264 + 286}$
$= 72$

(c) Number of selected staff with master's degrees
$= 72 \times 25\% + (150 - 72) \times 50\%$
$= 57$
The required percentage
$= \frac{57}{150} \times 100\%$
$= 38\%$

(6A05D007)
(a) Method 1: systematic sampling
Method 2: simple random sampling
(b) Code of the 25th VCD selected
$= 7 + 9(25 - 1)$
$= 223$
Code of the last VCD selected
$= 7 + 9(200 - 1)$
$= 1798$
(c) Select 5 shelves randomly and then select 40 VCDs randomly from each selected shelf. / Select 10
shelves randomly and then select 20 VCDs randomly from each selected shelf.
(or any other reasonable answers)
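
The codes in (b) follow the pattern start + interval × (position − 1). A minimal Python sketch of Method 1, with `systematic_codes` as a helper name introduced only for this illustration:

```python
def systematic_codes(start, interval, count):
    """List the codes start, start + interval, start + 2*interval, ... (count codes in total)."""
    return [start + interval * i for i in range(count)]

# Method 1 of Q5.7 with random number 7, interval 9 and 200 VCDs.
codes = systematic_codes(start=7, interval=9, count=200)

print(codes[24])  # code of the 25th VCD: 223
print(codes[-1])  # code of the last VCD: 1798
```

The last code, 1798, does not exceed 1800, so all 200 codes correspond to VCDs in the store.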

(6A05D008)
(a) Number of items in the sample
$= (2 + 3 + 4 + 5 + 2) \times 150 \times 10\%$
$= 240$
(b) Method 1: simple random sampling
Method 2: stratified random sampling
(c) First, select 6 batches randomly. Then, choose 40 items randomly from each selected batch.
(or any other reasonable answers)
(d) Total number of trousers and shirts in the sample
$= 150 \times (4 + 5) \times 10\%$
$= 135$
(e) Method 2. It ensures that products from each category are included in the sample.
(6A05D009)
(a) experiment
(b) (i) Select 500 boxes of facial tissues randomly, and randomly inspect 10 pieces in each of the
selected boxes. (or any other reasonable answers)
(ii) Inspect the first 5 pieces of facial tissues in each of the boxes.
(or any other reasonable answers)
(6A05D010)
(a) (i) Randomly select $\frac{1}{8}$ of the staff from each department.
(ii) Select a random integer between 1 and 8, say n, and select staff with staff numbers
n, n + 8, n + 16, ..., n + 2512.
(b) Stratified random sampling: Randomly select $\frac{1}{4}$ of the staff from each department.
Systematic sampling: Select a random integer between 1 and 4, say n, and select staff with staff
numbers n, n + 4, n + 8, ..., n + 2516.
(c) Stratified random sampling: More likely to give a representative sample / ensure staff from each
department are included in the sample.
Systematic sampling: Less time consuming / less expensive
(or any other reasonable answers)
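
The systematic selections in (a)(ii) and (b) differ only in the interval, which is the population size divided by the sample size. A rough Python sketch, assuming the population size is an exact multiple of the sample size (true for 2520 with both 315 and 630); `systematic_sample` is a name introduced here:

```python
import random

def systematic_sample(population_size, sample_size, rng=random):
    """Pick a random start in 1..interval, then take every interval-th staff number."""
    interval = population_size // sample_size  # assumes exact divisibility
    start = rng.randint(1, interval)
    return [start + interval * i for i in range(sample_size)]

rng = random.Random(5)
original = systematic_sample(2520, 315, rng)  # interval 8
doubled = systematic_sample(2520, 630, rng)   # interval 4 when the sample size is doubled

print(original[1] - original[0], len(original))  # 8 315
print(doubled[1] - doubled[0], len(doubled))     # 4 630
```

Whatever start is drawn, the largest staff number selected never exceeds 2520.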

(6A05D011)
Total number of S1 students in the population
$= 4816 + 6192 + 2064 + 4128$
$= 17200$
(a) The required probability
$= \frac{2064 + 4128}{17200}$
$= \frac{9}{25}$

(b) The required probability
$= \frac{4816 + 2064}{17200}$
$= \frac{2}{5}$

(c) (i) Select 50 secondary schools in city H randomly and choose 20 S1 students randomly from each
of the selected schools for interview.

(ii) Randomly select the following numbers of students from the various categories.

                           Male                                      Female
Studying in CMI schools    $1000 \times \frac{4816}{17200} = 280$    $1000 \times \frac{6192}{17200} = 360$
Studying in EMI schools    $1000 \times \frac{2064}{17200} = 120$    $1000 \times \frac{4128}{17200} = 240$
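
The probabilities in (a) and (b) and the quotas in (c)(ii) can be reproduced with exact fractions. The sketch below is illustrative only; the dictionary keys are labels chosen here, and the figures are those in the table of Q5.11.

```python
from fractions import Fraction

# Numbers of S1 students from the table in Q5.11.
students = {
    ("CMI", "male"): 4816,
    ("CMI", "female"): 6192,
    ("EMI", "male"): 2064,
    ("EMI", "female"): 4128,
}
total = sum(students.values())

# Probability = number of students in the group / total number of students.
p_emi = Fraction(sum(n for (school, _), n in students.items() if school == "EMI"), total)
p_male = Fraction(sum(n for (_, sex), n in students.items() if sex == "male"), total)

# Stratified quotas for a sample of 1000 students.
quota = {group: Fraction(1000 * n, total) for group, n in students.items()}

print(total, p_emi, p_male)  # 17200 9/25 2/5
print(quota[("CMI", "male")], quota[("CMI", "female")],
      quota[("EMI", "male")], quota[("EMI", "female")])  # 280 360 120 240
```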

(6A05D012)
(a) There are two parts to this question. The respondents may be confused if they want to answer yes
to one part but no to the other. The question is also difficult to analyse because we cannot be sure
for which part of the question the answers are. It can be rewritten into two questions as follows:
Do you like Michael Jackson?
Yes
No
Are you a fan of Michael Jackson?
Yes
No
(b) The question is loaded. It assumes that the respondent supports having more food choices in the
tuck shop. It can be modified as follows:
In which aspect do you think the tuck shop of our school should improve?
Service
Food quality
Hygiene
Cost
Food choice
Others (Please specify: ____________)

(6A05D013)
Both questions are leading questions. A lower percentage of people will support the introduction of
nuclear energy if Q1 is asked.
(6A05D017)
(a) The number of visitors on Sundays is usually higher than on other days. This periodicity will lead to
an unrepresentative sample. Different days in a week should be selected to form the sample for the
estimation. For example, count the number of visitors every 2 days.
(b) Health condition is sensitive and personal information. Staff may refuse to answer or may not
tell the truth to the manager. A questionnaire should be used instead of a face-to-face interview. In
addition, the reason for collecting the data must be explained clearly to the selected staff beforehand
and a declaration of secrecy must be made on the cover page of the questionnaire.

(6A05D018)
(a) With the presence of the principal, the group behaviour can change and thus give a biased result.
The principal should observe without being seen by the students.
(b) The sampling method is biased because the household density in Sham Shui Po is likely to be
different from other districts. Samples should be taken in other districts as well.
(6A05D019)
Plan B. In plan A, the target interviewees are the students patronizing the canteen. They cannot help the
canteen owner find out the reasons why other students do not eat at the canteen.
(6A05D020)
The conclusion is not valid. The District Councillor regards all the interviewees who did not give a
positive response as disagreeing with the increase in property tax. This is misleading. In addition, as the
interviewees were selected from his district, they may not be representative of all the citizens in Hong
Kong. Moreover, the sampling method and the sample size are unknown. A non-probability sampling
method or a small sample size might have been used, and these would affect the validity of the conclusion.
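
The gap between the survey figures and the stated conclusion can be made explicit with a little arithmetic. A minimal Python sketch using the percentages from Q5.17 (the variable names are chosen here for illustration):

```python
# Percentages reported in Q5.17.
responses = {
    "Strongly agree": 5,
    "Agree": 12,
    "No comments": 48,
    "Disagree": 20,
    "Strongly disagree": 15,
}

agree = responses["Strongly agree"] + responses["Agree"]
disagree = responses["Disagree"] + responses["Strongly disagree"]
not_agree = 100 - agree  # also counts the 48% who gave no comments

print(disagree)   # 35: respondents who actually disagreed
print(not_agree)  # 83: the figure quoted in the conclusion
```

The quoted 83% is reached only by treating the 48% who gave no comments as disagreeing.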

(6A05E001)
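(a) Number of wallets in the sample
$= 108 \times \frac{2}{3}$
$= 72$
(b) The required difference
$= 108 \times \frac{5}{3} - 108 \times \frac{1}{3}$
$= 144$
(c) Number of purses in the population
$= 108 \times \frac{4}{3} \div 15\%$
$= 960$
(d) Number of products in each batch
$= \frac{960}{4}$
$= 240$
(e) Population size
$= 240 \times (5 + 1 + 4 + 2 + 4 + 3 + 1)$
$= 4800$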
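
The working above scales everything from the 108 suitcases in the sample. A short Python sketch of the same chain of reasoning follows; the batch counts are the ones used in the marking scheme's own calculations (5, 1, 4, 2, 4, 3, 1), and the variable names are introduced only for this illustration.

```python
from fractions import Fraction

# Batch counts as used in the marking scheme's calculations.
batches = {
    "Pairs of shoes": 5, "Belt": 1, "Purse": 4, "Wallet": 2,
    "Handbag": 4, "Suitcase": 3, "Card holder": 1,
}
sampling_fraction = Fraction(15, 100)  # 15% of all products are sampled
suitcases_in_sample = 108

# Every batch has the same size, so each batch contributes equally to the sample.
sampled_per_batch = Fraction(suitcases_in_sample, batches["Suitcase"])

wallets_in_sample = sampled_per_batch * batches["Wallet"]
difference = sampled_per_batch * (batches["Pairs of shoes"] - batches["Card holder"])
items_per_batch = sampled_per_batch / sampling_fraction
purses_in_population = items_per_batch * batches["Purse"]
population_size = items_per_batch * sum(batches.values())

print(wallets_in_sample, difference, purses_in_population,
      items_per_batch, population_size)  # 72 144 960 240 4800
```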
(6A05E002)
Proportion of candidates selected to form the sample
$= \frac{24}{96}$
$= \frac{1}{4}$

(a) Sample size
$= \frac{1}{4}(144 + 96 + 196 + 284)$
$= 180$
(b) For the 1st semester:
Number of candidates in the sample
$= \frac{1}{4}(144 + 96)$
$= 60$
Passing rate in the sample
$= \left(1 - \frac{23}{60}\right) \times 100\%$
$= 61\frac{2}{3}\%$
$> 60\%$
For the 1st semester, scoring adjustment is not needed.

For the 2nd semester:
Number of candidates in the sample
$= \frac{1}{4}(196 + 284)$
$= 120$
Passing rate in the sample
$= \left(1 - \frac{51}{120}\right) \times 100\%$
$= 57.5\%$
$< 60\%$
For the 2nd semester, scoring adjustment is needed.
(c) Number of candidates in the sample who are examined by team A
$= \frac{1}{4}(144 + 196)$
$= 85$
The corresponding passing rate
$= \left(1 - \frac{33}{85}\right) \times 100\%$
$= 61\frac{3}{17}\%$
Number of candidates in the sample who are examined by team B
$= \frac{1}{4}(96 + 284)$
$= 95$
The corresponding passing rate
$= \left(1 - \frac{41}{95}\right) \times 100\%$
$= 56\frac{16}{19}\%$
The required difference
$= 61\frac{3}{17}\% - 56\frac{16}{19}\%$
$= 4\frac{108}{323}\%$
$< 5\%$
The scoring of the two teams is consistent.
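
The comparisons in (b) and (c) can be verified with exact fractions, which avoids rounding near the 60% and 5% thresholds. This is a sketch only; `pass_rate` and the dictionary layout are assumptions made for illustration, while the figures come from the tables in *Q5.2.

```python
from fractions import Fraction

# Candidates by (team, semester), from the table in *Q5.2.
candidates = {("A", 1): 144, ("A", 2): 196, ("B", 1): 96, ("B", 2): 284}
proportion = Fraction(24, candidates[("B", 1)])  # 24 of the 96 were sampled

def sampled(group_filter):
    """Number of sampled candidates among the groups selected by group_filter."""
    return proportion * sum(n for key, n in candidates.items() if group_filter(key))

def pass_rate(sample_count, failed):
    """Passing rate = 1 - (number failed / number in the sample)."""
    return 1 - Fraction(failed, sample_count)

sample_size = sampled(lambda key: True)
rate_sem1 = pass_rate(sampled(lambda key: key[1] == 1), 23)
rate_sem2 = pass_rate(sampled(lambda key: key[1] == 2), 51)
rate_a = pass_rate(sampled(lambda key: key[0] == "A"), 33)
rate_b = pass_rate(sampled(lambda key: key[0] == "B"), 41)

threshold = Fraction(60, 100)
print(sample_size)                               # 180
print(rate_sem1 < threshold)                     # False: no adjustment for the 1st semester
print(rate_sem2 < threshold)                     # True: adjustment for the 2nd semester
print(abs(rate_a - rate_b) <= Fraction(5, 100))  # True: the two teams are consistent
```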

(6A05E003)
(a) It is a leading question. Respondents are likely to answer "Food". The question can be rewritten as
follows:
Which basic necessity of life do you think the Chinese are most concerned with?
Clothing
Food
Housing
Transport

(b) There are three flaws in the question.
Firstly, it is a question with a double negative. It is difficult to understand the actual meaning of the
question.
Secondly, the question is loaded. It starts with "Don't you", which assumes that the respondents
should agree with the statement that follows.
Lastly, there are two parts to this question: introducing drug testing in primary schools and
introducing drug testing in secondary schools. The respondents may be confused if they want to
answer yes to one part but no to the other. It is also difficult to analyse because we cannot be sure
for which part of the question the answers are.
It can be rewritten into two questions as follows:
Do you agree that drug testing should be introduced to schools?
Yes
No
If yes, in which of the following schools do you think drug testing should be
introduced?
Primary schools
Secondary schools
Universities
No comments

(6A05E004)
(a) The sampling method may be biased because the characteristics of the students selected may be
different from those in other schools.
Interview is not a suitable method to collect the data because some students may not be willing to
give honest responses.
(b) As the population of the survey is all students in Hong Kong, the sample size should be increased
significantly.
Students from other schools should be selected to ensure that students with different characteristics
are involved in the survey.
Collect the data using anonymous questionnaires.
(or any other reasonable answers)

(6A05E005)
1. The sample includes the members of fitness centres only, which leads to an unrepresentative
sample.
2. Members who had Slim yogurt 5 times or more a week are those who visit fitness centres
frequently. Since regular exercise may cause weight loss, we cannot be sure whether their weight
loss is a result of eating Slim yogurt.
3. Nothing is said about people who ate Slim yogurt less than 5 days a week.
4. The background of the respondents (such as diet habits, psychological conditions and health
conditions) is unknown. This background information may also have an effect on weight loss.
5. The sampling method of the survey is not mentioned. Non-probability sampling might have been
used and this would affect the validity of the conclusion.
6. The response rate of the survey is quite low (only 38.2%), resulting in an unrepresentative sample.
(or any other reasonable answers)
