
Jim Lambers

MAT 460/560
Fall Semester 2009-10
Homework Assignment 2 Solution
Section 1.2
3. Suppose $p^*$ must approximate $p$ with relative error at most $10^{-3}$. Find the largest interval in which $p^*$ must lie for each value of $p$.


(a) 150
Solution We must have $|p^* - 150|/150 \le 10^{-3}$, or $|p^* - 150| \le 0.15$, which yields the interval [149.85, 150.15].
(b) 900
Solution We must have $|p^* - 900| \le 0.9$, which yields [899.1, 900.9].
(c) 1500
Solution We must have $|p^* - 1500| \le 1.5$, which yields [1498.5, 1501.5].
(d) 90
Solution We must have $|p^* - 90| \le 0.09$, which yields [89.91, 90.09].
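The four intervals follow the same pattern, $[p - 10^{-3}|p|,\ p + 10^{-3}|p|]$. A minimal Python sketch of that computation is given here; the function name `relative_error_interval` is hypothetical, and exact rational arithmetic is used only so that the endpoints are not blurred by binary rounding.

```python
from fractions import Fraction

def relative_error_interval(p, rel_tol=Fraction(1, 1000)):
    """Return (low, high) with |p_star - p| / |p| <= rel_tol for p_star inside."""
    p = Fraction(p)
    half_width = rel_tol * abs(p)   # largest allowed absolute error
    return p - half_width, p + half_width

for p in (150, 900, 1500, 90):
    low, high = relative_error_interval(p)
    print(p, float(low), float(high))
```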


5. Use three-digit rounding arithmetic to perform the following calculations. Compute the absolute error and relative error with the exact value determined to at least five digits.
(a) $133 + 0.921$
Solution $p = 133.921$ and $p^* = 134$, so the absolute error is 0.079 and the relative error is $5.90 \times 10^{-4}$.
(b) $133 - 0.499$
Solution $p = 132.501$ and $p^* = 133$, so the absolute error is 0.499 and the relative error is $3.77 \times 10^{-3}$.
(c) $(121 - 0.327) - 119$
Solution $p = 1.673$ and $p^* = 121 - 119 = 2$, so the absolute error is 0.327 and the relative error is 0.195.
(d) $(121 - 119) - 0.327$
Solution $p = 1.673$ and $p^* = 1.67$, so the absolute error is 0.003 and the relative error is $1.79 \times 10^{-3}$.
(e) $\dfrac{\frac{13}{14} - \frac{6}{7}}{2e - 5.4}$
Solution $p = 1.95354$ and $p^* = (0.929 - 0.857)/(5.44 - 5.4) = 1.80$, so the absolute error is 0.154 and the relative error is 0.0786.
(f) $-10\pi + 6e - \dfrac{3}{62}$
Solution $p = -15.1546$ and $p^* = -31.4 + 16.3 - 0.0484 = -15.1$, so the absolute error is 0.0546 and the relative error is $3.6 \times 10^{-3}$.
(g) $\left(\dfrac{2}{9}\right) \cdot \left(\dfrac{9}{7}\right)$
Solution $p = 2/7 = 0.285714$, and $p^* = (0.222)(1.29) = 0.286$, so the absolute error is $2.86 \times 10^{-4}$, and the relative error is $10^{-3}$.
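(h) $\dfrac{\pi - \frac{22}{7}}{\frac{1}{17}}$
Solution $p = -0.0214963$. In three-digit arithmetic both $\pi$ and $22/7$ round to 3.14, so $p^* = (3.14 - 3.14)/(1/17) = 0$. The absolute error is therefore 0.0215 and the relative error is 1.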
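Parts (c) and (d) use the same three numbers in a different order, which makes them a convenient check on the rounding rule. The following Python sketch simulates three-digit rounding arithmetic with the standard `decimal` module; the names `fl3` and `errors` are hypothetical, and round-half-up is assumed as the tie-breaking rule.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

# Every operation performed through fl3 is rounded to 3 significant digits.
fl3 = Context(prec=3, rounding=ROUND_HALF_UP)

def errors(exact, approx):
    """Return (absolute error, relative error) of approx with respect to exact."""
    abs_err = abs(exact - approx)
    return abs_err, abs_err / abs(exact)

a, b, c = Decimal("121"), Decimal("0.327"), Decimal("119")
exact = a - b - c   # 1.673, computed at the default 28-digit precision

approx_c = fl3.subtract(fl3.subtract(a, b), c)   # part (c): (121 - 0.327) - 119
approx_d = fl3.subtract(fl3.subtract(a, c), b)   # part (d): (121 - 119) - 0.327

for label, approx in (("(c)", approx_c), ("(d)", approx_d)):
    abs_err, rel_err = errors(exact, approx)
    print(label, approx, abs_err, f"{float(rel_err):.3g}")
```

Subtracting the nearly equal numbers first, as in part (d), keeps the small term 0.327 from being absorbed by rounding.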
9. The first three nonzero terms of the Maclaurin series for the arctangent function are $x - (1/3)x^3 + (1/5)x^5$. Compute the absolute error and relative error in the following approximations of $\pi$ using the polynomial in place of the arctangent:
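(a) $4\left[\arctan\left(\frac{1}{2}\right) + \arctan\left(\frac{1}{3}\right)\right]$
Solution We have
$$\begin{aligned}
\pi &\approx 4\left[\frac{1}{2} - \frac{1}{3}\left(\frac{1}{2}\right)^3 + \frac{1}{5}\left(\frac{1}{2}\right)^5 + \frac{1}{3} - \frac{1}{3}\left(\frac{1}{3}\right)^3 + \frac{1}{5}\left(\frac{1}{3}\right)^5\right] \\
&= 4\left[\frac{1}{2} - \frac{1}{24} + \frac{1}{160} + \frac{1}{3} - \frac{1}{81} + \frac{1}{1215}\right] \\
&\approx 3.14557613168724.
\end{aligned}$$
Since the exact value of $\pi$, to 15 significant digits, is 3.14159265358979, it follows that the absolute error is $3.983 \times 10^{-3}$ and the relative error is $(3.983 \times 10^{-3})/\pi \approx 1.268 \times 10^{-3}$.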
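(b) $16\arctan\left(\frac{1}{5}\right) - 4\arctan\left(\frac{1}{239}\right)$
Solution We have
$$\begin{aligned}
\pi &\approx 16\left[\frac{1}{5} - \frac{1}{3}\left(\frac{1}{5}\right)^3 + \frac{1}{5}\left(\frac{1}{5}\right)^5\right] - 4\left[\frac{1}{239} - \frac{1}{3}\left(\frac{1}{239}\right)^3 + \frac{1}{5}\left(\frac{1}{239}\right)^5\right] \\
&= 16\left[\frac{1}{5} - \frac{1}{375} + \frac{1}{15625}\right] - 4\left[\frac{1}{239} - \frac{1}{40955757} + \frac{1}{3899056325995}\right] \\
&\approx 3.14162102932503.
\end{aligned}$$
Since the exact value of $\pi$, to 15 significant digits, is 3.14159265358979, it follows that the absolute error is $2.838 \times 10^{-5}$ and the relative error is $(2.838 \times 10^{-5})/\pi \approx 9.032 \times 10^{-6}$.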
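Both approximations of $\pi$ can be reproduced in a few lines. This is a minimal Python sketch in ordinary double precision; `arctan_poly` is a hypothetical name for the three-term Maclaurin polynomial.

```python
import math

def arctan_poly(x):
    """Three-term Maclaurin polynomial x - x^3/3 + x^5/5 for arctan(x)."""
    return x - x**3 / 3 + x**5 / 5

approx_a = 4 * (arctan_poly(1 / 2) + arctan_poly(1 / 3))
approx_b = 16 * arctan_poly(1 / 5) - 4 * arctan_poly(1 / 239)

for label, approx in (("(a)", approx_a), ("(b)", approx_b)):
    abs_err = abs(approx - math.pi)
    print(label, approx, abs_err, abs_err / math.pi)
```

The second formula is far more accurate because its arguments, 1/5 and 1/239, are much smaller, so the neglected terms of the series are much smaller as well.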
11. Let
$$f(x) = \frac{x\cos x - \sin x}{x - \sin x}.$$
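(a) Find $\lim_{x \to 0} f(x)$.
Solution If we substitute $x = 0$, we obtain $0/0$, which is an indeterminate form. Using l'Hôpital's Rule three times, we obtain
$$\begin{aligned}
\lim_{x \to 0} f(x) &= \lim_{x \to 0} \frac{x\cos x - \sin x}{x - \sin x} \\
&= \lim_{x \to 0} \frac{\cos x - x\sin x - \cos x}{1 - \cos x} \\
&= \lim_{x \to 0} \frac{-x\sin x}{1 - \cos x} \\
&= \lim_{x \to 0} \frac{-\sin x - x\cos x}{\sin x} \\
&= \lim_{x \to 0} \frac{-\cos x - \cos x + x\sin x}{\cos x} \\
&= -2.
\end{aligned}$$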
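(b) Use four-digit rounding arithmetic to evaluate $f(0.1)$.
Solution We have
$$\begin{aligned}
f(0.1) &= \frac{(0.1)\cos 0.1 - \sin 0.1}{0.1 - \sin 0.1} \\
&\approx \frac{(0.1)(0.995) - 0.09983}{0.1 - 0.09983} \\
&= \frac{0.0995 - 0.09983}{0.00017} \\
&= \frac{-0.00033}{0.00017} \\
&\approx -1.941.
\end{aligned}$$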
(c) Replace each trigonometric function with its third Maclaurin polynomial, and repeat part (b).
Solution The third Maclaurin polynomial for $\cos x$ is $1 - \frac{1}{2}x^2$, and the third Maclaurin polynomial for $\sin x$ is $x - \frac{1}{6}x^3$. Substituting these polynomials for $\cos x$ and $\sin x$ in $f(x)$, we obtain the function
$$f_3(x) = \frac{x\left[1 - \frac{1}{2}x^2\right] - \left[x - \frac{1}{6}x^3\right]}{x - \left[x - \frac{1}{6}x^3\right]} = \frac{x - \frac{1}{2}x^3 - x + \frac{1}{6}x^3}{\frac{1}{6}x^3} = \frac{-\frac{1}{3}x^3}{\frac{1}{6}x^3} = -2.$$
(d) The actual value is $f(0.1) = -1.99899998$. Find the relative error for the values obtained in parts (b) and (c).
Solution The relative error for the value obtained in part (b) is
$$\left|\frac{-1.941 - (-1.99899998)}{-1.99899998}\right| = 0.029,$$
while the relative error for the value obtained in part (c) is
$$\left|\frac{-2 - (-1.99899998)}{-1.99899998}\right| = 0.0005.$$
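The four-digit computation of part (b) and the relative errors of part (d) can be checked with a short Python sketch. It assumes round-half-up four-digit arithmetic, simulated with the `decimal` module; the names `fl4`, `approx_b`, `rel_b` and `rel_c` are hypothetical, and the double-precision value of $f(0.1)$ stands in for the exact one.

```python
import math
from decimal import Context, Decimal, ROUND_HALF_UP

fl4 = Context(prec=4, rounding=ROUND_HALF_UP)   # four significant digits

def f(x):
    """f(x) = (x cos x - sin x) / (x - sin x) in double precision."""
    return (x * math.cos(x) - math.sin(x)) / (x - math.sin(x))

x = Decimal("0.1")
cos_x = fl4.create_decimal(repr(math.cos(0.1)))   # cos(0.1) rounded to 4 digits
sin_x = fl4.create_decimal(repr(math.sin(0.1)))   # sin(0.1) rounded to 4 digits

numerator = fl4.subtract(fl4.multiply(x, cos_x), sin_x)
denominator = fl4.subtract(x, sin_x)
approx_b = fl4.divide(numerator, denominator)     # part (b)

reference = f(0.1)
rel_b = abs((float(approx_b) - reference) / reference)
rel_c = abs((-2.0 - reference) / reference)       # part (c) gives exactly -2
print(approx_b, reference, rel_b, rel_c)
```

Most of the error in part (b) comes from the two subtractions of nearly equal four-digit numbers, which leave only two significant digits in the numerator and denominator.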
15. Use the 64-bit long real format to find the decimal equivalent of the following floating-point machine numbers.
(a) 0 10000001010 1001001100000000000000000000000000000000000000000000
Solution The sign bit $s$ is 0, the exponent $c$ is represented by 10000001010 in binary, which is $2^{10} + 2^3 + 2^1 = 1024 + 8 + 2 = 1034$ in decimal, and the mantissa $f$ is
$$f = 2^{-1} + 2^{-4} + 2^{-7} + 2^{-8} = \frac{1}{2} + \frac{1}{16} + \frac{1}{128} + \frac{1}{256} = \frac{147}{256}.$$
Therefore, the value of the floating-point number, denoted by $x$, is
$$\begin{aligned}
x &= (-1)^s\, 2^{c - 1023} (1 + f) \\
&= (-1)^0\, 2^{1034 - 1023} \left(1 + \frac{147}{256}\right) \\
&= 2^{11} \cdot \frac{403}{256} \\
&= 2^{11} \cdot \frac{403}{2^8} \\
&= 8 \cdot 403 \\
&= 3224.
\end{aligned}$$
(b) 1 10000001010 1001001100000000000000000000000000000000000000000000
Solution This number is identical to the number in part (a), except that the sign bit $s$ is 1 instead of 0, so the value is $-3224$.
(c) 0 01111111111 0101001100000000000000000000000000000000000000000000
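Solution The sign bit $s$ is 0, the exponent $c$ is given by
$$\sum_{i=0}^{9} 2^i = \frac{2^{10} - 1}{2 - 1} = 1023,$$
and the mantissa $f$ is
$$f = 2^{-2} + 2^{-4} + 2^{-7} + 2^{-8} = \frac{1}{4} + \frac{1}{16} + \frac{1}{128} + \frac{1}{256} = \frac{83}{256}.$$
Therefore, the value of the floating-point number, denoted by $x$, is
$$\begin{aligned}
x &= (-1)^s\, 2^{c - 1023} (1 + f) \\
&= (-1)^0\, 2^{1023 - 1023} \left(1 + \frac{83}{256}\right) \\
&= \frac{339}{256} \\
&= 1.32421875.
\end{aligned}$$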
(d) 0 01111111111 0101001100000000000000000000000000000000000000000001
Solution This number is identical to the one in part (c), except that there is an additional digit in the mantissa corresponding to $2^{-52}$. It follows that the value of this number, denoted by $x$, is
$$x = 1.32421875 + 2^{-52} \approx 1.3242187500000002220446049250313.$$
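The decoding rule $x = (-1)^s 2^{c-1023}(1+f)$ used in all four parts can be sketched in Python as follows. The function names are hypothetical, the formula is applied only to normalized numbers (exponent field neither all zeros nor all ones), and `struct.unpack` is included as an independent cross-check against the machine's own interpretation of the bits.

```python
import struct

def decode_double(bits):
    """Decode a 64-bit pattern 's c f' using (-1)^s * 2^(c-1023) * (1+f)."""
    bits = bits.replace(" ", "")
    if len(bits) != 64:
        raise ValueError("expected 64 bits")
    s = int(bits[0], 2)             # sign bit
    c = int(bits[1:12], 2)          # 11-bit exponent (characteristic)
    f = int(bits[12:], 2) / 2**52   # 52-bit mantissa as a fraction in [0, 1)
    return (-1) ** s * 2.0 ** (c - 1023) * (1 + f)

def decode_with_struct(bits):
    """Reinterpret the same 64 bits as a big-endian IEEE double."""
    raw = int(bits.replace(" ", ""), 2).to_bytes(8, "big")
    return struct.unpack(">d", raw)[0]

part_a = "0 10000001010 10010011" + "0" * 44
part_c = "0 01111111111 01010011" + "0" * 44
part_d = "0 01111111111 01010011" + "0" * 43 + "1"

for pattern in (part_a, part_c, part_d):
    print(decode_double(pattern), decode_with_struct(pattern))
```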
17. Suppose two points $(x_0, y_0)$ and $(x_1, y_1)$ are on a straight line with $y_1 \ne y_0$. Two formulas are available to find the $x$-intercept of the line:
$$x = \frac{x_0 y_1 - x_1 y_0}{y_1 - y_0} \quad \text{and} \quad x = x_0 - \frac{(x_1 - x_0) y_0}{y_1 - y_0}.$$
(a) Show that both formulas are algebraically correct.
Solution The equation of the line is
$$y = \frac{y_1 - y_0}{x_1 - x_0}(x - x_0) + y_0.$$
Setting $y = 0$ and solving for $x$, we obtain
$$-y_0\,\frac{x_1 - x_0}{y_1 - y_0} = x - x_0$$
or
$$x = x_0 - \frac{(x_1 - x_0) y_0}{y_1 - y_0},$$
which is precisely the second formula. If we use a common denominator, then we obtain
$$\begin{aligned}
x &= x_0\,\frac{y_1 - y_0}{y_1 - y_0} - \frac{(x_1 - x_0) y_0}{y_1 - y_0} \\
&= \frac{x_0 (y_1 - y_0) - (x_1 - x_0) y_0}{y_1 - y_0} \\
&= \frac{(x_0 y_1 - x_0 y_0) - (x_1 y_0 - x_0 y_0)}{y_1 - y_0} \\
&= \frac{x_0 y_1 - x_1 y_0}{y_1 - y_0},
\end{aligned}$$
which is precisely the first formula.
(b) Use the data $(x_0, y_0) = (1.31, 3.24)$ and $(x_1, y_1) = (1.93, 4.76)$ and three-digit rounding arithmetic to compute the $x$-intercept both ways. Which method is better and why?
Solution Using the first formula, we obtain
$$x = \frac{1.31 \cdot 4.76 - 1.93 \cdot 3.24}{4.76 - 3.24} = \frac{6.24 - 6.25}{1.52} = -0.00658.$$
Using the second formula, we obtain
$$\begin{aligned}
x &= 1.31 - \frac{(1.93 - 1.31)\,3.24}{4.76 - 3.24} \\
&= 1.31 - \frac{0.62 \cdot 3.24}{1.52} \\
&= 1.31 - \frac{2.01}{1.52} \\
&= 1.31 - 1.32 \\
&= -0.01.
\end{aligned}$$
The exact value, to three significant digits, is $-0.0116$, so clearly the second formula is better. The first formula suffers from catastrophic cancellation in the numerator.
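The two computations in part (b) can be replayed step by step. The sketch below is a Python illustration, assuming round-half-up three-digit arithmetic simulated with the `decimal` module; `intercept_first` and `intercept_second` are hypothetical names for the two formulas.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

fl3 = Context(prec=3, rounding=ROUND_HALF_UP)   # three significant digits

x0, y0 = Decimal("1.31"), Decimal("3.24")
x1, y1 = Decimal("1.93"), Decimal("4.76")

def intercept_first(x0, y0, x1, y1, ctx):
    """x = (x0*y1 - x1*y0) / (y1 - y0), rounding after every operation."""
    numerator = ctx.subtract(ctx.multiply(x0, y1), ctx.multiply(x1, y0))
    return ctx.divide(numerator, ctx.subtract(y1, y0))

def intercept_second(x0, y0, x1, y1, ctx):
    """x = x0 - (x1 - x0)*y0 / (y1 - y0), rounding after every operation."""
    product = ctx.multiply(ctx.subtract(x1, x0), y0)
    return ctx.subtract(x0, ctx.divide(product, ctx.subtract(y1, y0)))

# Reference value at the default 28-digit precision.
exact = (x0 * y1 - x1 * y0) / (y1 - y0)

first = intercept_first(x0, y0, x1, y1, fl3)
second = intercept_second(x0, y0, x1, y1, fl3)
print(first, abs(first - exact))
print(second, abs(second - exact))
```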
19. The two-by-two linear system
$$\begin{aligned}
ax + by &= e, \\
cx + dy &= f,
\end{aligned}$$
where $a$, $b$, $c$, $d$, $e$, $f$ are given, can be solved for $x$ and $y$ as follows:
set $m = c/a$, provided $a \ne 0$;
$d_1 = d - mb$;
$f_1 = f - me$;
$y = f_1/d_1$;
$x = (e - by)/a$.
Implement this algorithm using MATLAB and solve the following linear systems.
(a)
$$\begin{aligned}
1.130x - 6.990y &= 14.20 \\
8.110x + 12.20y &= -0.1370
\end{aligned}$$
(b)
$$\begin{aligned}
1.013x - 6.099y &= 14.22 \\
-18.11x + 112.2y &= -0.1376
\end{aligned}$$
Solution The following function solves the general two-by-two linear system, given the values of $a$, $b$, $c$, $d$, $e$ and $f$.
function [x,y]=hw1prob1219(a,b,c,d,e,f)
% display error message if a is zero
if a==0,
error('a must be nonzero')
end
m=c/a;
d1=d-m*b;
f1=f-m*e;
y=f1/d1;
x=(e-b*y)/a;
In the following MATLAB session, this function is used to solve the systems in parts (a) and
(b).
>> [x,y]=hw1prob1219(1.130,-6.990,8.110,12.20,14.20,-0.1370)
x =
2.44459190435176
y =
-1.63628199543384
>> [x,y]=hw1prob1219(1.013,-6.099,-18.11,112.2,14.22,-0.1376)
x =
4.974388755065191e+002
y =
80.28948694672958
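As an independent cross-check of the session above, the same elimination steps can be transcribed into Python and the residuals of both equations inspected. This is only a sketch; `solve2x2` is a hypothetical name, and the printed digits may differ from the MATLAB output in the last places.

```python
def solve2x2(a, b, c, d, e, f):
    """Solve a*x + b*y = e, c*x + d*y = f by eliminating x (requires a != 0)."""
    if a == 0:
        raise ValueError("a must be nonzero")
    m = c / a            # multiplier for the first row
    d1 = d - m * b       # coefficient of y after elimination
    f1 = f - m * e       # right-hand side after elimination
    y = f1 / d1
    x = (e - b * y) / a  # back substitution
    return x, y

systems = {
    "(a)": (1.130, -6.990, 8.110, 12.20, 14.20, -0.1370),
    "(b)": (1.013, -6.099, -18.11, 112.2, 14.22, -0.1376),
}
for label, (a, b, c, d, e, f) in systems.items():
    x, y = solve2x2(a, b, c, d, e, f)
    # Residuals should be close to zero if (x, y) solves the system.
    print(label, x, y, a * x + b * y - e, c * x + d * y - f)
```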
25. The binomial coefficient
$$\binom{m}{k} = \frac{m!}{k!\,(m-k)!}$$
describes the number of ways of choosing a subset of $k$ objects from a set of $m$ elements.
(a) Suppose decimal machine numbers are of the form
$$0.d_1 d_2 d_3 d_4 \times 10^n, \quad 1 \le d_1 \le 9, \quad 0 \le d_i \le 9, \quad i = 2, 3, 4, \quad |n| \le 15.$$
What is the largest value of $m$ for which the binomial coefficient $\binom{m}{k}$ can be computed for all $k$ by the definition without causing overflow?
Solution The largest number that can be represented in this floating-point system is $0.9999 \times 10^{15} = 999{,}900{,}000{,}000{,}000$. Using the definition of the binomial coefficient, overflow will occur if $m!$ is larger than this number. This is the case if $m \ge 18$, since $18! = 6{,}402{,}373{,}705{,}728{,}000$ and $17! = 355{,}687{,}428{,}096{,}000$. Therefore the largest value of $m$ for which the binomial coefficient can be computed without causing overflow is 17.
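The threshold can be confirmed with exact integer arithmetic. In this Python sketch, `LARGEST` and `largest_m_for_factorial` are hypothetical names, and overflow is modeled simply as $m!$ exceeding $0.9999 \times 10^{15}$.

```python
import math

LARGEST = 9999 * 10**11   # 0.9999e15 written as an exact integer

def largest_m_for_factorial(limit):
    """Largest m with m! <= limit, found by stepping m upward."""
    m = 0
    while math.factorial(m + 1) <= limit:
        m += 1
    return m

print(largest_m_for_factorial(LARGEST))
print(math.factorial(17), math.factorial(18))
```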
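(b) Show that $\binom{m}{k}$ can also be computed by
$$\binom{m}{k} = \left(\frac{m}{k}\right)\left(\frac{m-1}{k-1}\right) \cdots \left(\frac{m-k+1}{1}\right).$$
Solution We have
$$\begin{aligned}
\binom{m}{k} &= \frac{m!}{k!\,(m-k)!} \\
&= \frac{1 \cdot 2 \cdot 3 \cdots (m-1) \cdot m}{\bigl(1 \cdot 2 \cdot 3 \cdots (k-1) \cdot k\bigr)\bigl(1 \cdot 2 \cdot 3 \cdots (m-k-1) \cdot (m-k)\bigr)} \\
&= \frac{\bigl(1 \cdot 2 \cdot 3 \cdots (m-k-1) \cdot (m-k)\bigr)\bigl((m-k+1) \cdots (m-1) \cdot m\bigr)}{\bigl(1 \cdot 2 \cdot 3 \cdots (k-1) \cdot k\bigr)\bigl(1 \cdot 2 \cdot 3 \cdots (m-k-1) \cdot (m-k)\bigr)} \\
&= \frac{(m-k+1) \cdots (m-1) \cdot m}{1 \cdot 2 \cdot 3 \cdots (k-1) \cdot k} \\
&= \left(\frac{m}{k}\right)\left(\frac{m-1}{k-1}\right) \cdots \left(\frac{m-k+1}{1}\right).
\end{aligned}$$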
(c) What is the largest value of $m$ for which the binomial coefficient $\binom{m}{3}$ can be computed by the formula in part (b) without causing overflow?
Solution We have
$$\binom{m}{3} = \frac{m}{3} \cdot \frac{m-1}{2} \cdot \frac{m-2}{1} = \frac{m(m-1)(m-2)}{6}.$$
To avoid overflow, this coefficient must not exceed $0.9999 \times 10^{15}$, which implies that $m$ must satisfy
$$m(m-1)(m-2) \le 6(0.9999) \times 10^{15} = 5.9994 \times 10^{15}.$$
Since $m(m-1)(m-2) \le m^3$ for any nonnegative integer $m$, it follows that the largest value of $m$ for which the binomial coefficient can be computed is not less than the largest value of $m$ for which $m^3 \le 5.9994 \times 10^{15}$. This value is $(5.9994 \times 10^{15})^{1/3} \approx 1.81706 \times 10^5$, that is, 181,706.
If we let $m = 181{,}706$, we obtain $m(m-1)(m-2) \approx 5.9993 \times 10^{15}$. To see if $m$ can be any larger, we try $m = 181{,}707$ and obtain $m(m-1)(m-2) \approx 5.9993998 \times 10^{15}$, so this value is acceptable as well. However, if we try $m = 181{,}708$, we obtain $m(m-1)(m-2) \approx 5.9994988 \times 10^{15}$, which is too large, so we conclude that the largest value of $m$ is 181,707.
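The search around the cube-root estimate can be carried out with exact integers. A minimal Python sketch follows; `LIMIT`, `falling3` and `largest_m` are hypothetical names, and the floating-point cube root serves only as a starting guess that the integer loops then correct.

```python
LIMIT = 6 * 9999 * 10**11   # 6 * 0.9999e15 = 5.9994e15 as an exact integer

def falling3(m):
    """The product m(m-1)(m-2), computed exactly."""
    return m * (m - 1) * (m - 2)

def largest_m(limit):
    """Largest integer m with m(m-1)(m-2) <= limit."""
    m = round(limit ** (1 / 3))          # starting guess near the cube root
    while falling3(m + 1) <= limit:      # step up while the next value fits
        m += 1
    while falling3(m) > limit:           # step down if the guess was too big
        m -= 1
    return m

print(largest_m(LIMIT))
print(falling3(181707), falling3(181708))
```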
(d) Use the equation in part (b) and four-digit chopping arithmetic to compute the number
of possible 5-card hands in a 52-card deck. Compute the actual and relative errors.
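Solution The number of possible 5-card hands in a 52-card deck is
$$\binom{52}{5} = \frac{52}{5} \cdot \frac{51}{4} \cdot \frac{50}{3} \cdot \frac{49}{2} \cdot \frac{48}{1}.$$
Using four-digit chopping arithmetic, we obtain
$$\begin{aligned}
\binom{52}{5} &\approx (10.4)(12.75)(16.66)(24.5)(48) \\
&\approx (132.6)(16.66)(24.5)(48) \\
&\approx (2209)(24.5)(48) \\
&\approx (54{,}120)(48) \\
&\approx 2{,}597{,}000.
\end{aligned}$$
The actual value is 2,598,960, so the absolute error is 1,960 and the relative error is $7.541 \times 10^{-4}$.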
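The chopped product can be reproduced with the `decimal` module, where `ROUND_DOWN` truncates toward zero. This Python sketch is an illustration under that assumption; `chop4` and `binomial_chopped` are hypothetical names, and the factors are multiplied left to right, as in the solution.

```python
import math
from decimal import Context, Decimal, ROUND_DOWN

chop4 = Context(prec=4, rounding=ROUND_DOWN)   # four-digit chopping

def binomial_chopped(m, k, ctx):
    """Product (m/k)((m-1)/(k-1))...((m-k+1)/1), chopping after each operation."""
    result = Decimal(1)
    for i in range(k):
        factor = ctx.divide(Decimal(m - i), Decimal(k - i))
        result = ctx.multiply(result, factor)
    return result

approx = int(binomial_chopped(52, 5, chop4))
exact = math.comb(52, 5)
abs_err = abs(exact - approx)
print(approx, exact, abs_err, abs_err / exact)
```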