(iii) P(Plaintiff and Defendant are of opposite gender AND Judge and Scribe are both girls)
\( = \frac{11}{20} \times \frac{10}{19} \times \frac{9}{18} \times \frac{9}{17} \times 2 = \frac{99}{646} \)
Some students did not multiply the probability by \( \frac{4!}{2!\,2!} \).
Parts (ii) and (iii) were poorly done. Many used the P & C method and ended up with many mistakes. Perhaps these students should do the question using the fraction method.
or
\( \left( \frac{{}^{9}C_{1} \times {}^{11}C_{3}}{{}^{20}C_{4}} \right) \times \frac{3! \times 2}{4!} = \frac{99}{646} \)
P(Plaintiff and Defendant are of opposite gender | Judge and Scribe are both girls)
\( = \frac{99}{646} \div \frac{11}{38} = \frac{9}{17} \)
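The fractions above can be cross-checked with exact rational arithmetic. The sketch below assumes, as the working implies, a group of 20 students made up of 11 girls and 9 boys; the variable names are illustrative only.

```python
from fractions import Fraction
from math import comb, factorial

# Assumption read off the working above: 11 girls and 9 boys, 20 students in all.
girls, boys, total = 11, 9, 20

# Fraction method: Judge and Scribe are girls, then Plaintiff and Defendant
# are one girl and one boy in either order (hence the factor 2).
p_both = (Fraction(girls, total) * Fraction(girls - 1, total - 1)
          * Fraction(girls - 2, total - 2) * Fraction(boys, total - 3) * 2)

# P & C method: choose 1 boy and 3 girls, then count the valid role assignments
# (3! * 2) out of the 4! possible assignments.
p_pc = (Fraction(comb(boys, 1) * comb(girls, 3), comb(total, 4))
        * Fraction(factorial(3) * 2, factorial(4)))

# Conditional probability given that Judge and Scribe are both girls.
p_judge_scribe_girls = Fraction(girls, total) * Fraction(girls - 1, total - 1)
p_conditional = p_both / p_judge_scribe_girls

print(p_both, p_pc, p_conditional)
```

Both routes give the same fraction, which is a quick way to catch a missing arrangement factor.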
4 The volume of milk, in millilitres, in bottles is normally distributed with mean \( \mu \) and standard deviation \( \sigma \). Measurements were taken of the volume in 900 of these bottles and it was found that 225 of them contained more than 1002 millilitres and 400 of them contained less than 998 millilitres. Show that \( \mu = 999 \) and \( \sigma = 4.91 \) when corrected to 3 significant figures. [3]
A bottle of milk is considered short-filled if it contains less than 995 millilitres. The bottles of
milk are packed into boxes of 20 for distribution to the supermarkets. A box is considered to be
accepted if it contains less than 2 short-filled bottles of milk. Calculate the probability that in a
randomly chosen batch of 15 boxes, more than 2 boxes are accepted.
[4]
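4 Let X be the volume of milk in a bottle, in ml.
\( X \sim N(\mu, \sigma^2) \)
\( P(X > 1002) = \frac{225}{900} \)
\( P(X \le 1002) = \frac{675}{900} \)
\( P\left(Z \le \frac{1002 - \mu}{\sigma}\right) = \frac{675}{900} \)
\( \frac{1002 - \mu}{\sigma} = 0.67449 \)  (1)
\( P(X < 998) = \frac{400}{900} \)
\( P\left(Z < \frac{998 - \mu}{\sigma}\right) = \frac{4}{9} \)
\( \frac{998 - \mu}{\sigma} = -0.13971 \)  (2)
Solving (1) and (2):
\( \mu = 998.69, \ \sigma = 4.9128 \)
\( \mu = 999, \ \sigma = 4.91 \) (3 s.f.) (Shown)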
Students are expected to define random variables clearly and state the relevant distributions.
A few forgot to change the > sign in \( P(X > 1002) \) to < before using GC (invNorm).
For all intermediate workings, students should leave all answers in 5 s.f. or more.
Clear working to solve the simultaneous equations is expected to be shown, as the results are given in the question.
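As a check on the simultaneous equations, here is a minimal sketch that uses Python's standard-library `statistics.NormalDist` in place of the GC's invNorm. The variable names are illustrative and not part of the original solution.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution

# P(X <= 1002) = 675/900 and P(X < 998) = 400/900 give two z-values.
z_upper = z.inv_cdf(675 / 900)  # about 0.67449, equation (1)
z_lower = z.inv_cdf(400 / 900)  # about -0.13971, equation (2)

# (1002 - mu)/sigma = z_upper and (998 - mu)/sigma = z_lower.
# Subtracting the second equation from the first eliminates mu.
sigma = (1002 - 998) / (z_upper - z_lower)
mu = 998 - z_lower * sigma

print(round(mu, 2), round(sigma, 4))
```

Note that the left-tail probability 675/900 is passed to `inv_cdf`, mirroring the change from > to the cumulative form before using invNorm.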
Using the answer above, we have \( X \sim N(999, 4.91^2) \).
\( P(X < 995) = 0.20763 \)
Let Y be the random variable denoting the number of bottles that are short-filled in a box of 20.
\( Y \sim B(20, 0.20763) \)
\( P(Y < 2) = P(Y \le 1) = 0.059401 \) = P(a box is accepted)
Let W be the random variable denoting the number of boxes that are accepted out of 15.
\( W \sim B(15, 0.059401) \)
\( P(W > 2) = 1 - P(W \le 2) = 0.0557 \)
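The chain of three distributions above (normal, then binomial per box, then binomial per batch) can be reproduced with the standard library alone. This is a sketch, not the GC procedure; `binom_cdf` is a hypothetical helper written for this example.

```python
from math import comb
from statistics import NormalDist

def binom_cdf(k, n, p):
    """P(T <= k) for T ~ B(n, p), summed term by term."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(k + 1))

# Step 1: a bottle is short-filled if X < 995, with X ~ N(999, 4.91^2).
p_short = NormalDist(999, 4.91).cdf(995)

# Step 2: a box of 20 is accepted if it has fewer than 2 short-filled bottles.
p_accept = binom_cdf(1, 20, p_short)

# Step 3: more than 2 accepted boxes in a batch of 15.
p_more_than_2 = 1 - binom_cdf(2, 15, p_accept)

print(round(p_short, 5), round(p_accept, 6), round(p_more_than_2, 4))
```

Each probability is carried forward unrounded, which matches the advice to keep intermediate values to 5 s.f. or more.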
5 The mass of a randomly chosen watermelon follows a normal distribution with mean 1.20 kg and standard deviation 0.5 kg.
Let W be the total mass of 12 randomly chosen watermelons.
(i) Find the probability that the total mass of 12 watermelons exceeds 10 kg. [3]
(ii) Find the probability that the mean mass of 12 randomly chosen watermelons is between 1.1 kg and 1.2 kg. [2]
John wants to ship 12 watermelons to his girlfriend who is working in Japan. The cost of shipping is calculated as follows:
$108.80 for the first 10 kg and $8.60 per kg for any additional weight.
Assuming that the total weight of 12 watermelons is more than 10 kg, the cost of shipping is \( C = 108.80 + 8.6(W - 10) \).
(iii) Find the mean and variance of C. [2]
(iv) Calculate the probability that the cost of shipping exceeds $180.90. [2]
5
(i)
Let X be the random variable denoting the mass of a watermelon in kg.
\( X \sim N(1.20, 0.5^2) \)
\( W = X_1 + X_2 + \cdots + X_{12} \)
\( W \sim N(14.4, 12 \times 0.5^2) = N(14.4, 3) \)
\( P(W > 10) = 0.994 \)
Students are expected to define random variables clearly and state the relevant distributions.
(ii) Let \( \bar{X} = \frac{X_1 + X_2 + \cdots + X_{12}}{12} \).
\( \bar{X} \sim N\left(1.20, \frac{0.5^2}{12}\right) \)
\( P(1.1 < \bar{X} < 1.2) = 0.256 \)
Alternative method
\( P\left(1.1 < \frac{X_1 + X_2 + \cdots + X_{12}}{12} < 1.2\right) \)
\( = P(1.1 \times 12 < X_1 + X_2 + \cdots + X_{12} < 1.2 \times 12) \)
\( = P(1.1 \times 12 < W < 1.2 \times 12) = 0.256 \)
Common mistake: \( W \sim N\left(14.4, \frac{3}{12}\right) \).
The mean mass of 12 watermelons is denoted by \( \frac{X_1 + X_2 + \cdots + X_{12}}{12} \), not W.
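Parts (i) and (ii) rest on the distributions of a sum and of a mean of independent normal variables. The sketch below assumes the 12 masses are independent (the question says the watermelons are randomly chosen) and uses `statistics.NormalDist`, which takes a standard deviation rather than a variance.

```python
from math import sqrt
from statistics import NormalDist

mu, sd, n = 1.20, 0.5, 12

# Total of n independent masses: mean n*mu, variance n*sd^2.
W = NormalDist(n * mu, sqrt(n) * sd)
# Mean of n independent masses: mean mu, variance sd^2 / n.
X_bar = NormalDist(mu, sd / sqrt(n))

p_total_over_10 = 1 - W.cdf(10)                   # part (i)
p_mean_between = X_bar.cdf(1.2) - X_bar.cdf(1.1)  # part (ii)
# Alternative method: the same event written in terms of the total W.
p_via_total = W.cdf(1.2 * n) - W.cdf(1.1 * n)

print(round(p_total_over_10, 3), round(p_mean_between, 3), round(p_via_total, 3))
```

The variance is multiplied by 12 for the total and divided by 12 for the mean; mixing the two is the common mistake noted above.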
(iii)
\( E(C) = E(108.80 + 8.6(W - 10)) \)
\( = E(108.80) + 8.6E(W) - 8.6E(10) \)
\( = 108.80 + 8.6(14.4) - 86 \)
\( = 146.64 \)
\( Var(C) = Var(108.80 + 8.6(W - 10)) \)
\( = Var(108.80) + 8.6^2 Var(W) + 8.6^2 Var(10) \)
\( = 8.6^2 (12)(0.5^2) \)
\( = 221.88 \)
Some students did not know how to use the properties of mean and variance.
(iv)
\( C \sim N(146.64, 221.88) \)
\( P(C > 180.90) = 0.0107 \)
Students are expected to state the distribution for C.
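Parts (iii) and (iv) apply the rules E(aW + b) = aE(W) + b and Var(aW + b) = a²Var(W). A minimal sketch of the arithmetic, with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

mean_W, var_W = 14.4, 3.0  # W ~ N(14.4, 3) from part (i)

# C = 108.80 + 8.6(W - 10) is linear in W: C = 8.6 W + (108.80 - 86).
slope = 8.6
intercept = 108.80 - 8.6 * 10

mean_C = slope * mean_W + intercept  # constants shift the mean
var_C = slope**2 * var_W             # constants do not affect the variance

# A linear function of a normal variable is normal.
C = NormalDist(mean_C, sqrt(var_C))
p_cost_over = 1 - C.cdf(180.90)

print(round(mean_C, 2), round(var_C, 2), round(p_cost_over, 4))
```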
6 Major avalanches can be regarded as randomly occurring events. They occur at a uniform average
rate of 8 per year.
(i) Find the probability that more than 3 and at most 7 major avalanches occur in a 3-month
period. [2]
(ii) Find the probability that the total number of major avalanches in two separate periods of 1 month and 4 months is 7. [2]
(iii) Given that the probability of at least one major avalanche occurring in a period of n months is greater than 0.995, find the least possible integer value of n. [4]
(iv) Taking a decade to consist of forty 3-month periods, using a suitable approximation, find the
probability that, in a particular decade, there are at least 10 of these 3-month periods during
which more than 3 major avalanches occur. [4]
6(i) Let X be the random variable denoting the number of major avalanches in a 3-month period.
\( X \sim Po(2) \)
\( P(3 < X \le 7) = P(X \le 7) - P(X \le 3) \approx 0.142 \) (3 s.f.)
Students are expected to define random variables clearly and state the relevant distributions.
Some students did not even know how to compute \( P(3 < X \le 7) \) by expressing it as \( P(X \le 7) - P(X \le 3) \).
(ii) Let R be the random variable denoting the total number of major avalanches in two separate periods of 1 month and 4 months.
\( R \sim Po\left(\frac{10}{3}\right) \)
\( P(R = 7) \approx 0.0324 \) (3 s.f.)
Students are expected to define a new r.v. and state its distribution as the mean has changed.
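Parts (i) and (ii) can be checked directly from the Poisson formula. The sketch below scales the rate of 8 per year to each period; `poisson_pmf` and `poisson_cdf` are hypothetical helpers written for this example, not GC functions.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(T = k) for T ~ Po(lam)."""
    return exp(-lam) * lam**k / factorial(k)

def poisson_cdf(k, lam):
    """P(T <= k) for T ~ Po(lam)."""
    return sum(poisson_pmf(r, lam) for r in range(k + 1))

rate_per_month = 8 / 12

# (i) 3-month period: X ~ Po(2), and P(3 < X <= 7) = P(X <= 7) - P(X <= 3).
lam_i = rate_per_month * 3
p_i = poisson_cdf(7, lam_i) - poisson_cdf(3, lam_i)

# (ii) 1 month plus 4 months: independent Poisson counts add, so R ~ Po(10/3).
lam_ii = rate_per_month * (1 + 4)
p_ii = poisson_pmf(7, lam_ii)

print(round(p_i, 3), round(p_ii, 4))
```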
(iii) Let W be the random variable denoting the number of major avalanches in an n-month period.
\( W \sim Po\left(\frac{2}{3}n\right) \)
\( P(W \ge 1) > 0.995 \)
\( 1 - P(W = 0) > 0.995 \)
\( P(W = 0) < 0.005 \)
\( e^{-\frac{2}{3}n} < 0.005 \)
\( -\frac{2}{3}n < \ln 0.005 \)
\( n > 7.947 \)
Least n = 8.
Alternative method (using GC: press Y=, enter Y1 = poissonpdf((2/3)X, 0), press 2ND, GRAPH; X represents n)
\( P(W = 0) < 0.005 \)
From GC,
when n = 7, \( P(W = 0) = 0.0094 > 0.005 \)
when n = 8, \( P(W = 0) = 0.00483 < 0.005 \)
Hence least n = 8.
Define new r.v.
The formula for \( P(W = 0) \) can be found in MF15.
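Both routes in part (iii), the algebraic bound and the table search, can be sketched in a few lines. The loop stands in for scrolling the GC table and is only an illustration.

```python
from math import exp, log

rate_per_month = 8 / 12  # W ~ Po(rate_per_month * n) for an n-month period

# Algebraic route: exp(-rate * n) < 0.005 rearranges to n > -ln(0.005) / rate.
bound = -log(0.005) / rate_per_month

# Table route: step through integer n until P(W = 0) first drops below 0.005.
n = 1
while exp(-rate_per_month * n) >= 0.005:
    n += 1

print(round(bound, 3), n)
print(round(exp(-rate_per_month * 7), 4), round(exp(-rate_per_month * 8), 5))
```

Dividing by the negative coefficient is where the inequality sign flips, which is why the bound is a lower bound on n.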
(iv) Let Y be the random variable denoting the number of 3-month periods in a decade during which more than 3 major avalanches occur.
\( Y \sim B(40, 0.14288) \)
Since n is large, np = 5.7152 > 5 and nq = 34.2848 > 5,
\( Y \sim N(5.7152, 4.8986) \) approximately.
\( P(Y \ge 10) \xrightarrow{\text{c.c.}} P(Y > 9.5) \approx 0.0436 \) (3 s.f.)
A few students defined the r.v. wrongly: they defined Y as the number of major avalanches occurring in 40 3-month periods.
Students are expected to justify why a Normal approximation can be used.
Some answers are inaccurate due to premature approximation.
Many students did not do continuity correction.
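The normal approximation with continuity correction in part (iv) can be sketched as follows. The exact binomial tail is computed alongside purely for comparison; it is not part of the expected answer.

```python
from math import comb, exp, factorial, sqrt
from statistics import NormalDist

# p = P(X > 3) for X ~ Po(2), kept unrounded to avoid premature approximation.
p = 1 - sum(exp(-2) * 2**r / factorial(r) for r in range(4))
n = 40

# Y ~ B(n, p) is approximated by N(np, npq) since np > 5 and nq > 5.
approx = NormalDist(n * p, sqrt(n * p * (1 - p)))

# Continuity correction: P(Y >= 10) becomes P(Y > 9.5).
p_approx = 1 - approx.cdf(9.5)

# Exact binomial tail, for comparison only.
p_exact = sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(10, n + 1))

print(round(p, 5), round(p_approx, 4), round(p_exact, 4))
```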
7
The time, x hours, that a random sample of 50 adults of a certain age group spent surfing the internet on a particular day was recorded. The results are summarized by
\( \sum x = 337, \quad \sum x^2 = 2397. \)
(i) Find unbiased estimates of the population mean and variance. [2]
(ii) Given that the population mean is equal to the unbiased estimate of the population mean
obtained in part (i), find the value of a such that there is a probability of 0.05 that a sample
mean of 50 adults differs from the population mean by more than a.
[4]
State, giving a reason, whether any assumptions about the population are needed in order
for the calculations to be valid. [1]
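7(i)
Unbiased estimate of the population mean \( \mu \): \( \bar{x} = \frac{\sum x}{50} = 6.74 \)
Unbiased estimate of the population variance \( \sigma^2 \):
\( s^2 = \frac{1}{n-1}\left[\sum x^2 - \frac{1}{n}\left(\sum x\right)^2\right] = \frac{1}{49}\left[2397 - \frac{337^2}{50}\right] = 2.5637 \approx 2.56 \)
The formula for the unbiased estimate of the population variance can be found in MF15.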
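The two estimates follow directly from the summary statistics. A minimal check, with illustrative variable names:

```python
n, sum_x, sum_x2 = 50, 337, 2397

# Unbiased estimate of the population mean.
x_bar = sum_x / n

# Unbiased estimate of the population variance: divide by n - 1, not n.
s2 = (sum_x2 - sum_x**2 / n) / (n - 1)

print(x_bar, round(s2, 4))
```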
(ii)
Let \( \bar{X} \) be the random variable denoting the sample mean of 50 adults.
\( \bar{X} \sim N\left(6.74, \frac{2.5637}{50}\right) \) approximately by CLT.
\( P(|\bar{X} - 6.74| > a) = 0.05 \)
\( P(\bar{X} - 6.74 < -a) + P(\bar{X} - 6.74 > a) = 0.05 \)
\( P(\bar{X} < 6.74 - a) + P(\bar{X} > 6.74 + a) = 0.05 \)
\( 2P(\bar{X} < 6.74 - a) = 0.05 \) (by symmetry)
\( P(\bar{X} < 6.74 - a) = \frac{0.05}{2} = 0.025 \)
\( 6.74 - a = 6.2962 \)  [invNorm(0.025, 6.74, \( \sqrt{2.5637/50} \))]
\( a = 0.4438 \)
Hence, a = 0.444 (3 s.f.)
Students are expected to use the CLT to find the distribution of \( \bar{X} \) in this case, as the population is not stated to be normally distributed in the question and n is large.
The phrase "sample mean differs from population mean" in the question tells us that either \( \bar{X} \) exceeds \( \mu \) or \( \mu \) exceeds \( \bar{X} \), hence we are to use the modulus function.
Take note of the following rule for inequality:
\( |Y| > a \Rightarrow Y > a \) or \( Y < -a \)
Possible answers:
- The times spent by the 50 adults are independent of one another.
- No assumptions are required. This is because we can apply the Central Limit Theorem to establish the distribution of \( \bar{X} \) since n = 50 is large.
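The value of a in part (ii) can be verified with `statistics.NormalDist`, whose `inv_cdf` plays the role of invNorm here. The sketch assumes the CLT approximation stated above and uses the part (i) estimates as given.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, s2 = 50, 6.74, 2.5637

# Sample mean: approximately N(6.74, 2.5637/50) by the CLT.
X_bar = NormalDist(x_bar, sqrt(s2 / n))

# By symmetry, each tail carries probability 0.05 / 2 = 0.025.
lower = X_bar.inv_cdf(0.025)
a = x_bar - lower

# Check: the two tails beyond 6.74 - a and 6.74 + a total 0.05.
two_tail = X_bar.cdf(x_bar - a) + (1 - X_bar.cdf(x_bar + a))

print(round(lower, 4), round(a, 4), round(two_tail, 4))
```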
8 The table below shows the total population (P), the resale price index of flats (I) and the number of non-Singaporeans (F) in Singapore from the year 2002 to the year 2009.
Year (Y)                                 2002     2003     2004     2005     2006     2007     2008     2009
Total population (millions) (P)          4.176    4.115    4.166    4.265    4.401    4.589    4.839    4.988
Resale price index of flats (I)          96.7     103.9    106.6    101.6    103.6    121.7    139.4    150.8
No. of non-Singaporeans (millions) (F)   0.7931   0.7479   0.7534   0.7980   0.8755   1.0055   1.1967   1.2537
Data extracted from: http://www.singstat.gov.sg/stats/themes/people/hist/pop.xls and
http://www.hdb.gov.sg/fi10/fi10321p.nsf/w/BuyResaleFlatResaleIndex?OpenDocument
Sociologist A would like to investigate whether there is a linear relationship between the total
population(millions)(P) and the resale price index of flats(I).
(i) Sketch a scatter diagram for the variables P and I. [1]
(ii) Calculate the equation of the regression line of I on P. [1]
(iii) Calculate the product moment correlation coefficient between P and I. [1]
(iv) Estimate the resale price index of flats when the total population is 4.5 million, giving your
answer to 1 decimal place. Comment on the suitability of your estimate.
[2]
Sociologist B would like to find the relationship between the variables year (Y) and the number of non-Singaporeans (millions) (F).
(v) Sketch a scatter diagram for the variables Y and F. State, with a reason, which of the following models fits the data for the variables Y and F.
\( F = bY + a \)
\( F = b(Y - 2003)^2 + a \)
[2]
(vi) Calculate the values of a and b for the model you have chosen in (v). [2]
8 (i)
[Scatter diagram of I against P]
Students are expected to label the axes of the graph clearly.
(ii)
Using the GC,
\( I = 57.409P - 139.49 \) (5 s.f.)
\( I = 57.4P - 139 \) (3 s.f.)
(iii) \( r = 0.956 \)
(iv) When P = 4.5,
\( I = 57.409(4.5) - 139.49 = 118.9 \) (1 d.p.)
The estimate is suitable since \( r = 0.956 \approx 1 \), which shows that there is a strong positive linear relationship between P and I, and P = 4.5 is within the range of the given data.
Students are expected to use the answer (eqn) in at least 5 s.f. to compute I. If not, there would be a loss of accuracy.
Students are expected to justify with both reasons:
- \( r \approx 1 \)
- P = 4.5 within given data range
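Parts (ii) to (iv) can be reproduced from the table with a hand-written least-squares fit. This is a sketch of what the GC's linear regression computes; `fit_line` is a hypothetical helper, and rounding the coefficients to 5 s.f. via format strings is an assumption made to mirror the working above.

```python
from math import sqrt

P = [4.176, 4.115, 4.166, 4.265, 4.401, 4.589, 4.839, 4.988]
I = [96.7, 103.9, 106.6, 101.6, 103.6, 121.7, 139.4, 150.8]

def fit_line(x, y):
    """Least-squares line of y on x; returns (slope, intercept, r)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / sqrt(sxx * syy)

slope, intercept, r = fit_line(P, I)

# Coefficients rounded to 5 s.f., as used in the working for part (iv).
slope_5sf = float(f"{slope:.5g}")
intercept_5sf = float(f"{intercept:.5g}")
estimate_5sf = slope_5sf * 4.5 + intercept_5sf

# Estimate from the unrounded coefficients, for comparison.
estimate_full = slope * 4.5 + intercept

print(slope_5sf, intercept_5sf, round(r, 3), estimate_5sf, estimate_full)
```

The estimate sits very close to the 118.85 rounding boundary, so the first decimal place appears sensitive to how the coefficients are rounded; the sketch prints both versions for that reason.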
(v)
[Scatter diagram of F against Y]
From the scatter diagram, there seems to be a quadratic relationship between F and Y, hence \( F = b(Y - 2003)^2 + a \) is a better model for the given data.
Alternative method
r between F and Y = 0.911
r between F and \( (Y - 2003)^2 \) = 0.987
Since r between F and \( (Y - 2003)^2 \) is closer to 1 when compared to r between F and Y, \( F = b(Y - 2003)^2 + a \) is a better model for the given data.
For this question, it is obvious from the graph that the quadratic model is a better fit.
Note: For other similar questions in which it is not obvious which of the given models fits the data, you are expected to find the r value for all the models (as in the alternative method shown).
(vi)
Using the GC,
\( a = 0.754 \), \( b = 0.0151 \)
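The model comparison in (v) and the values in (vi) can be checked by regressing F on Y and on the transformed variable (Y - 2003)². As before, this is a sketch of the GC computation; `fit_line` is a hypothetical helper and the data are taken from the table in the question.

```python
from math import sqrt

Y = list(range(2002, 2010))
F = [0.7931, 0.7479, 0.7534, 0.7980, 0.8755, 1.0055, 1.1967, 1.2537]

def fit_line(x, y):
    """Least-squares line of y on x; returns (slope, intercept, r)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / sqrt(sxx * syy)

# Model 1: F = bY + a.
_, _, r_linear = fit_line(Y, F)

# Model 2: F = b(Y - 2003)^2 + a, linear in the transformed variable T.
T = [(y - 2003) ** 2 for y in Y]
b, a, r_quadratic = fit_line(T, F)

print(round(r_linear, 3), round(r_quadratic, 3), round(a, 3), round(b, 4))
```

The transformed model is still fitted by ordinary linear regression; only the explanatory variable changes.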
The End