$\bar{x} = \sum_{i=1}^{n} x_i / n = 2.55$, Mode = 2
Median = average of the 10th and 11th smallest observations = (2+2)/2 = 2
b) The distribution is right-skewed because the right-hand tail is longer.
c) $\sum_{i=1}^{n} x_i^2 = 1^2(5) + 2^2(6) + 3^2(4) + 4^2(3) + 5^2(2) = 163$,
$s = \sqrt{\left(\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right)/(n-1)} = \sqrt{[163 - 20(2.55)^2]/19} = 1.317$
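The calculation in part c) can be checked with a short script. This is a minimal sketch, assuming the data consist of the values 1 to 5 with frequencies 5, 6, 4, 3 and 2 (as read off the sum of squares in part c); the function name `summary_from_frequencies` is illustrative only.

```python
import math

def summary_from_frequencies(freq):
    """Return (n, mean, sum of squares, sample sd) for a {value: count} table."""
    n = sum(freq.values())
    total = sum(value * count for value, count in freq.items())
    sum_sq = sum(value ** 2 * count for value, count in freq.items())
    mean = total / n
    # Shortcut formula: s^2 = (sum of x_i^2 - n * mean^2) / (n - 1)
    variance = (sum_sq - n * mean ** 2) / (n - 1)
    return n, mean, sum_sq, math.sqrt(variance)

# Assumed frequency table: value -> number of observations
freq = {1: 5, 2: 6, 3: 4, 4: 3, 5: 2}
n, mean, sum_sq, sd = summary_from_frequencies(freq)
print(n, mean, sum_sq, round(sd, 3))  # 20 2.55 163 1.317
```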
Problem 9: (d) Both the median and the IQR are insensitive to outliers. The mean and the
standard deviation, however, are very sensitive to outlying values, because extreme
observations enter the averaging directly and distort the result.
Problem 10: (a) IQR = Q3 - Q1 = 245 - 145 = 100, hence the thresholds to define the outliers are
Q3 + 1.5 IQR = 245 + 150 = 395 and Q1 - 1.5 IQR = 145 - 150 = -5.
Since all the data are within the two thresholds, there is no outlier in the data.
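Problem 11: a) $\bar{x} = \left(\sum_{i=1}^{10} x_i\right)/10 = 100/10 = 10$, $s = \sqrt{\left(\sum_{i=1}^{10} x_i^2 - 10\bar{x}^2\right)/(10-1)} = \sqrt{[1294 - 10(10)^2]/9} = 5.715$
b) $\bar{x} = \left(\sum_{i=1}^{11} x_i\right)/11 = (100+10)/11 = 10$, $s = \sqrt{\left(\sum_{i=1}^{11} x_i^2 - 11\bar{x}^2\right)/(11-1)} = \sqrt{[(1294+100) - 11(10)^2]/10} = 5.422$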
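The effect of adding an observation equal to the mean can be reproduced from the running totals alone. The following is a minimal sketch, assuming only the totals quoted in the solution (n = 10, sum of the data 100, sum of squares 1294); the helper name `sd_from_sums` is hypothetical.

```python
import math

def sd_from_sums(n, sum_x, sum_x_sq):
    """Sample standard deviation from n, the sum of x_i and the sum of x_i^2."""
    mean = sum_x / n
    return math.sqrt((sum_x_sq - n * mean ** 2) / (n - 1))

# Part a): totals taken from the solution
n, sum_x, sum_x_sq = 10, 100, 1294
sd_a = sd_from_sums(n, sum_x, sum_x_sq)

# Part b): add one observation equal to the mean (x = 10)
new_x = 10
sd_b = sd_from_sums(n + 1, sum_x + new_x, sum_x_sq + new_x ** 2)

print(round(sd_a, 3), round(sd_b, 3))  # 5.715 5.422
```

The numerator stays at 294 in both parts, since a point at the mean adds nothing to the squared deviations, while the divisor grows from 9 to 10. That is why s decreases.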
Problem 12:
Q1: np/100 = (11)(25)/100 = 2.75, which rounds up to k = 3.
=> Q1 = 3rd smallest observation = 11.
Q3: np/100 = (11)(75)/100 = 8.25, which rounds up to k = 9.
=> Q3 = 9th smallest observation = 15.
Median = (11+1)/2 th smallest observation = 6th smallest observation = 13.
IQR = 15 - 11 = 4. The thresholds for the outliers are therefore
Q3 + 1.5 IQR = 15 + 1.5(4) = 21 and Q1 - 1.5 IQR = 11 - 1.5(4) = 5.
Based on the thresholds, the data values 2 and 25 are outliers, and the smallest and
largest non-outlying values are 9 and 18, respectively.
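The quartile rule and the outlier thresholds used in this problem can be sketched in code. This is an illustration under an assumed rule inferred from these solutions (round np/100 up when it is not an integer, and average two neighbouring observations when it is); the function names are hypothetical, and p is taken to be a whole-number percentage.

```python
def quartile_positions(n, p):
    """Ranks (1-based) of the sorted observations used for the p-th percentile.

    Assumed rule: if n*p/100 is an integer k, average the k-th and (k+1)-th
    smallest values; otherwise round up and use that single rank.
    """
    k, remainder = divmod(n * p, 100)
    if remainder == 0:
        return (k, k + 1)
    return (k + 1,)

def outlier_fences(q1, q3):
    """Lower and upper thresholds Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

print(quartile_positions(11, 25))  # (3,)
print(quartile_positions(11, 75))  # (9,)
print(outlier_fences(11, 15))      # (5.0, 21.0)
```

Working with the integer product n*p avoids floating-point rounding when testing whether np/100 is a whole number.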
Problem 13: a) is true because s is a square root and therefore cannot be negative. b) is
false: s can actually be zero, when all the data points have the same value. c) is false: in
the presence of outliers, s is not a good measure of spread. The interquartile range (IQR) is
a better measure of spread in that case, as it depends only on the middle half of the data points.
Problem 14: a) Range = 6-0= 6
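b) $\bar{x} = (4+6)/8 = 1.25$, $\sum_{i=1}^{n} x_i^2 = 4^2 + 6^2 = 52$, $s = \sqrt{\left(\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right)/(n-1)} = \sqrt{(52 - 8(1.25)^2)/7} = 2.375$
c) Range = 60, $\bar{x} = (4+60)/8 = 8$, $\sum_{i=1}^{n} x_i^2 = 4^2 + 60^2 = 3616$,
$s = \sqrt{\left(\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right)/(n-1)} = \sqrt{(3616 - 8(8)^2)/7} = 21.058$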
Both the range and standard deviation are sensitive to outliers, with the presence of the
data point 60 increasing the values for both substantially.
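A quick numerical check of this sensitivity, using Python's standard `statistics` module. The data set itself is an assumption: six zeros together with 4 and 6 is inferred from n = 8, the minimum of 0, and the sums in part b), since the full data are not listed here.

```python
import statistics

def spread(data):
    """Return (range, sample standard deviation) of a list of numbers."""
    return max(data) - min(data), statistics.stdev(data)

# Assumed data: six zeros plus the two non-zero observations
original = [0, 0, 0, 0, 0, 0, 4, 6]
with_outlier = [0, 0, 0, 0, 0, 0, 4, 60]  # 6 replaced by 60

range_b, sd_b = spread(original)
range_c, sd_c = spread(with_outlier)
print(range_b, round(sd_b, 3))  # 6 2.375
print(range_c, round(sd_c, 3))  # 60 21.058
```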
Problem 15: Note that the mean and mode are affected by both translation and rescaling,
while the standard deviation and range are affected by rescaling only. Hence, for y = 10x + 2,
$\bar{y} = 10\bar{x} + 2 = 10(10) + 2 = 102$, $\text{mode}_y = 10\,\text{mode}_x + 2 = 10(12) + 2 = 122$,
$s_y = 10 s_x = 10(4) = 40$, $\text{range}_y = 10\,\text{range}_x = 10(13) = 130$.
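The same rules can be written as a small helper for a linear transformation y = a*x + b. This is a sketch only; the function name `transform_summary` and its parameter names are illustrative, and the input values are the ones used in this problem (mean 10, mode 12, s = 4, range 13, with a = 10 and b = 2).

```python
def transform_summary(a, b, mean, mode, sd, data_range):
    """Summary statistics of y = a*x + b given those of x."""
    return {
        "mean": a * mean + b,          # location: scaled and shifted
        "mode": a * mode + b,          # location: scaled and shifted
        "sd": abs(a) * sd,             # spread: scaled only
        "range": abs(a) * data_range,  # spread: scaled only
    }

# Values of x taken from the solution: mean 10, mode 12, s 4, range 13
result = transform_summary(a=10, b=2, mean=10, mode=12, sd=4, data_range=13)
print(result)  # {'mean': 102, 'mode': 122, 'sd': 40, 'range': 130}
```

The absolute value matters when a is negative: a multiplier of -2, for example, doubles the range and the standard deviation rather than making them negative.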
Problem 16: The sample mean = 1. In this case, the alternative formula for variance is easier
to calculate, given by $s^2 = \left(\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right)/(n-1) = (3300 - 10[1]^2)/9 = 365.56$. The original formula
for variance, $s^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2/(n-1)$, is much more tedious to calculate in this problem.
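The two variance formulas give the same value because the cross term collapses once $\sum x_i = n\bar{x}$ is substituted. A short derivation, using only the definition of the sample mean:

```latex
\begin{aligned}
\sum_{i=1}^{n} (x_i - \bar{x})^2
  &= \sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 .
\end{aligned}
```

Dividing both sides by n - 1 gives the alternative formula for the variance.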
Problem 17: $s_y^2 = (-2)^2 s_x^2 = 4(21) = 84$, $\text{Range}_y = |-2|\,\text{Range}_x = 2(10) = 20$.
Problem 18: (b) From the boxplots, IQR is roughly equal to 80-60=20. Hence the length of
the vertical bars cannot be longer than 1.5IQR=30, with (ii) violating the criterion. (i) is a
valid boxplot, with the absence of a vertical line suggesting that the largest 25% of the data
points are of the same value (=80!). (iii) is a valid boxplot with two outliers in the data.
Problem 19: a) Mean = 4.6375, Median = (1.2+1.8)/2=1.5
b) Q1 = average of the 2nd and 3rd smallest values = (0.7+1.1)/2=0.9
Q3 = average of the 6th and 7th smallest values = (2.3+9.8)/2 = 6.05
IQR=Q3-Q1 = 5.15, so 1.5(IQR)=7.725 => Lower Threshold = Q1-1.5(IQR)=0.9-7.725=-6.825
Upper Threshold = Q3 + 1.5(IQR)=6.05+7.725=13.775
Since 20 is the only data point outside the two thresholds, there is one outlier
(20) in the data set.
Problem 20: a) See the stem-and-leaf plot on the right-hand side.
b) Q1: np/100=20(25)/100=5, which is an integer.
Q1 = average of the 5th and 6th smallest obs. = 2800+(38+41)/2=2839.5
Q3: np/100=20(75)/100=15, which is an integer.
Q3 = average of the 15th and 16th smallest obs. = (3323+3484)/2=3403.5
[Stem-and-leaf plot for Problem 20]