MBE2036 Engineering Computing, part 3, Version 8
City University of Hong Kong
MBE 2036
Quantifying Error
Part 3
Dr. Yajing Shen

Office: AC-1 G6617
Email: yajishen@cityu.edu.hk
Department of Mechanical and Biomedical Engineering
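Short review
Newton's law (mathematical model):
$$\frac{dv}{dt} = g - \frac{c}{m}v \qquad \text{(Eq. 1.20)}$$
Analytical methods: exact solution
$$v(t) = \frac{gm}{c}\left(1 - e^{-(c/m)t}\right) \qquad \text{(Eq. 1.29)}$$
Numerical methods: approximate solution
$$v(t_{i+1}) = v(t_i) + \left[g - \frac{c}{m}v(t_i)\right](t_{i+1} - t_i) \qquad \text{(Eq. 1.33)}$$
[Figure: forces on a falling body. The drag force Fu = cv acts in the negative direction; the gravitational force Fd = mg acts in the positive direction.]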
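A minimal MATLAB sketch comparing the exact solution (Eq. 1.29) with the numerical update (Eq. 1.33). The values of m, c, g and the time step are assumptions chosen for illustration; they are not given in these slides.

```matlab
% Assumed illustrative parameters (not from the slides)
g  = 9.81;       % gravitational acceleration, m/s^2
m  = 68.1;       % mass, kg
c  = 12.5;       % drag coefficient, kg/s
dt = 2;          % time step, s
t  = 0:dt:12;    % time grid, s

% Exact solution, Eq. 1.29
v_exact = (g*m/c) * (1 - exp(-(c/m)*t));

% Numerical solution, Eq. 1.33, starting from rest
v_num = zeros(size(t));
for i = 1:numel(t)-1
    v_num(i+1) = v_num(i) + (g - (c/m)*v_num(i)) * (t(i+1) - t(i));
end

% Columns: time, exact velocity, numerical velocity
disp([t' v_exact' v_num'])
```

With these assumed values the numerical velocity stays above the exact one at every step, and the gap shrinks if dt is reduced.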
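Short review
Conservation law (mathematical model): Change = Increase - Decrease
$$\frac{dH}{dt} = U - kH \qquad \text{(Eq. 2.1)}$$
[Figure: water tank with inflow U, water level H, and outflow V = kH.]
Analytical methods: exact solution
$$H = \frac{U}{k}\left(1 - e^{-kt}\right) + H_0 e^{-kt} \qquad \text{(Eq. 2.14)}$$
Numerical methods: approximate solution
$$H(t_{i+1}) = H(t_i) + \left[U - kH(t_i)\right]\Delta t \qquad \text{(Eq. 2.16)}$$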
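[Figure: water tank with inflow U, water level H, and outflow V = kH. Example values: U = 15 m/s, H0 = 40 m, k = 0.5.]
Assign: $Z = U - kH$, so that
$$dH = -\frac{1}{k}\,dZ \qquad \text{(Eq. 2.5)}$$
Substituting into Eq. 2.1 and separating the variables:
$$-\frac{1}{k}\frac{dZ}{dt} = Z \quad\Rightarrow\quad \left(-\frac{1}{k}\right)\frac{dZ}{Z} = dt$$
Integrating both sides:
$$-\frac{1}{k}\ln Z = t + C \qquad \text{(Eq. 2.7)}$$
At t = 0:
$$C = -\frac{1}{k}\ln Z_0, \quad \text{where } Z_0 = U - kH_0 \qquad \text{(Eq. 2.8)}$$
Mathematically, Z may be larger or smaller than zero. However, for this practical problem, Z must have the same sign as Z0 (here Z0 = 15 - 0.5 x 40 < 0, and Z = 15 - 0.5 x H < 0 before the level becomes stable). Combining Eq. 2.7 and Eq. 2.8 gives $-\frac{1}{k}\ln(Z/Z_0) = t$, which is well defined because Z/Z0 > 0.
In an engineering problem, we know there must be a result. So we usually simplify the mathematical equations during the solving process.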
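The water tank model can be checked numerically with a short MATLAB sketch that evaluates Eq. 2.14 and Eq. 2.16 side by side. U, H0 and k are the slide values; the time step and the time span are assumptions.

```matlab
U = 15; H0 = 40; k = 0.5;   % values from the slide
dt = 0.5;                   % assumed time step
t  = 0:dt:10;               % assumed time span

% Exact solution, Eq. 2.14
H_exact = (U/k)*(1 - exp(-k*t)) + H0*exp(-k*t);

% Numerical solution, Eq. 2.16
H_num = zeros(size(t));
H_num(1) = H0;
for i = 1:numel(t)-1
    H_num(i+1) = H_num(i) + (U - k*H_num(i))*dt;
end

% True percent relative error of the numerical result at each step
err_t = abs((H_exact - H_num) ./ H_exact) * 100;
fprintf('largest true relative error = %.3f%%\n', max(err_t));
```

Both curves settle towards the steady level U/k = 30, where inflow and outflow balance.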
Short review
Numerical methods provide an approximation of the exact solutions of engineering problems.
Errors in Numerical Methods
Error caused by computer approximation: round-off error
e.g., floating-point numbers approximated by binary numbers
Error caused by mathematical approximation: truncation error
e.g., $\frac{dv}{dt} \approx \frac{\Delta v}{\Delta t} = \frac{v(t_{i+1}) - v(t_i)}{t_{i+1} - t_i}$ (Eq. 1.31)
Some basic concepts of errors
Accuracy and Precision
[Figure: an archer's targets. Which one is good?]
Accuracy consists of trueness (proximity of measurement results to the true value) and precision (repeatability or reproducibility of the measurement).
Accuracy and Precision
[Figure: targets arranged by increasing accuracy and increasing precision.]
Accuracy refers to how closely a computed or measured value agrees with the true value.
Precision refers to how closely individual computed or measured values agree with each other.
Significant Figures
The significant figures of a number are those digits that carry meaning
contributing to its precision. They are critical when reporting scientific
data because they give the reader an idea of how well you could actually
measure/report your data.
[Figure: ruler in centimeters, marked from 1 to 5 cm. The measurement A falls between 2.5 and 2.6.]
From the above ruler, I would estimate that the measurement A would be around 2.52 cm.
However, some of you may say A is 2.51 cm, 2.53 cm or even 2.50 cm.
Basically, all of us would agree that A is at least 2.5 cm. So the first two digits are certain, and only the third digit has to be estimated.
A digit is said to be a significant figure if it is either known with certainty or if it is the first estimated digit.
So we could say that the above measurement has 3 significant figures.
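Significant Figures
[Figure: an automobile speedometer and odometer.]
Speedometer: the needle lies between 48 and 49. The digits 48 are certain and the next digit is an approximation (48.3? 48.7? 48.8? 48.9?). We say it has 3 significant figures.
Odometer: the reading lies between 873244 and 873245. The digits 873244 are certain and the next digit is an approximation (873244.3? 873244.4? 873244.6?). We say it has 7 significant figures.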
Significant Figures
We say the odometer has 7 significant figures: the certain digits 873244 plus the first estimated digit (873244.3? 873244.4? 873244.6?).
In a reading such as 873244.6233582343903443, the digits after the first estimated digit are meaningless.
Note: In measurement, 1 ≠ 1.0 ≠ 1.00, because they mean different precision.
4.8 has 2 significant figures: the 8 is approximate.
4.80 has 3 significant figures: the 8 is exact.
True error
True Error = True value - Approximation
The true error does not take into account the order of magnitude of the value under examination.
                   Bridge                     Screw
Measured length    9,999 cm                   9 cm
True length        10,000 cm                  10 cm
True error         10,000 - 9,999 = 1 cm      10 - 9 = 1 cm
Both true errors are 1 cm. However, the true error of the screw is more significant than the true error of the bridge.
[Figure: three situations: travelling from CityU to the airport, a hundred yards race, extracting a tooth.]
If the true error is 10 cm in each case, its significance is very different.
Relative error
One way to overcome this problem is to normalize the
error to the true value, as in:
$$\varepsilon_t = \frac{\text{True value} - \text{Approximation}}{\text{True value}} \times 100\% \qquad \text{(Eq. 3.1)}$$
where $\varepsilon_t$ is the true percent relative error.
                   Bridge                                  Screw
Measured length    9,999 cm                                9 cm
True length        10,000 cm                               10 cm
$\varepsilon_t$    (10,000 - 9,999)/10,000 = 0.01%         (10 - 9)/10 = 10%
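The bridge and screw comparison can be reproduced with a few lines of MATLAB. This is only an illustrative sketch of Eq. 3.1; the variable names are made up for the example.

```matlab
true_val = [10000 10];     % true lengths in cm: bridge, screw
measured = [9999  9];      % measured lengths in cm

true_err = true_val - measured;            % true error, cm
rel_err  = true_err ./ true_val * 100;     % true percent relative error, Eq. 3.1

fprintf('bridge: true error = %g cm, relative error = %g%%\n', true_err(1), rel_err(1));
fprintf('screw:  true error = %g cm, relative error = %g%%\n', true_err(2), rel_err(2));
```

The true errors are identical, while the relative errors differ by a factor of 1000.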
Relative error
However, in many practical applications, it is not possible
to find the true value in advance.
One alternative is to normalize the error using the best
available approximation of the true value, i.e:
$$\varepsilon_a = \frac{\text{approximate error}}{\text{approximation}} \times 100\% \qquad \text{(Eq. 3.2)}$$
where $\varepsilon_a$ is the approximate percent relative error.
Question: If we don't know the true value, how do we decide the approximate error?
Relative error in numerical computing
Certain numerical methods use an iterative approach to compute answers. In such cases, a present approximation is made on the basis of a previous approximation.
This process is performed repeatedly to successively compute (we hope) better and better approximations.
[Figure: a sequence of approximations 0, 1, ..., i, i+1 approaching the true value. Approximation i is the previous approximation and approximation i+1 is the present approximation.]
For such cases, the percent relative error can be determined according to:
$$\varepsilon_a = \frac{\text{present approximation} - \text{previous approximation}}{\text{present approximation}} \times 100\% \qquad \text{(Eq. 3.3)}$$
where $\varepsilon_a$ is the approximate percent relative error.
Question: when should the computer stop calculating? Answer: when the error is small enough.

-stopping criterion-
The iteration can be repeated until:
$$|\varepsilon_a| < \varepsilon_s \qquad \text{(Eq. 3.4)}$$
where $\varepsilon_s$ is a stopping criterion.
It is often convenient to relate this error to the number of significant figures/digits in the approximation. It can be shown (Scarborough, 1966) that if the following criterion is met, we can be assured that the result is correct to at least n significant figures.
$$\varepsilon_s = (0.5 \times 10^{2-n})\% \qquad \text{(Eq. 3.5)}$$
Note: Please refer to the proof in the appendix.
Example: Stopping Criterion
Consider the following exponential function:
$$e^x = 1 + x + \frac{x^2}{2} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!}$$
Estimate $e^{0.5}$ correct to at least 3 significant figures, starting from $e^x = 1$ and adding one term at a time.
The true value is $e^{0.5} = 1.648721$.
Stopping criterion: $\varepsilon_s = (0.5 \times 10^{2-3})\% = 0.05\%$
The first estimate: $e^{0.5} = 1$
The second estimate: $e^{0.5} = 1 + x = 1 + 0.5 = 1.5$
The third estimate: $e^{0.5} = 1 + x + \frac{x^2}{2} = 1 + 0.5 + \frac{0.5^2}{2} = 1.625$

Example: Stopping Criterion (errors of the first three estimates; true value $e^{0.5} = 1.648721$)
First estimate (N = 1, approximation = 1):
True percent relative error:
$$\varepsilon_{t1} = \frac{1.648721 - 1}{1.648721} \times 100\% = 39.3469\%$$
Approximate percent relative error: the first estimate is the initial value, so there is no previous approximation and the error cannot be evaluated:
$$\varepsilon_{a1} = \frac{1 - \,?}{1} \times 100\% = \;?$$
Second estimate (N = 2, approximation = 1.5):
True percent relative error:
$$\varepsilon_{t2} = \frac{1.648721 - 1.5}{1.648721} \times 100\% = 9.02\%$$
Approximate percent relative error:
$$\varepsilon_{a2} = \frac{1.5 - 1}{1.5} \times 100\% = 33.3\%$$
Third estimate (N = 3, approximation = 1.625):
True percent relative error:
$$\varepsilon_{t3} = \frac{1.648721 - 1.625}{1.648721} \times 100\% = 1.44\%$$
Approximate percent relative error:
$$\varepsilon_{a3} = \frac{1.625 - 1.5}{1.625} \times 100\% = 7.69\%$$

The results (true value: $e^{0.5} = 1.648721$; stopping criterion: $\varepsilon_s = (0.5 \times 10^{2-3})\% = 0.05\%$):
Number of terms    Result: e^0.5    ε_t (%)     ε_a (%)
1                  1                39.3        -
2                  1.5              9.02        33.3
3                  1.625            1.44        7.69
4                  1.645833333      0.175       1.27
5                  1.648437500      0.0172      0.158
6                  1.648697917      0.00142     0.0158
Since $\varepsilon_a$ at six terms (0.0158%) is less than $\varepsilon_s$ (0.05%), the calculation stops.
The result is accurate to 5 significant figures.
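The table above can be reproduced with a loop that adds one series term at a time and stops when Eq. 3.4 is satisfied. This is a minimal sketch; the variable names are illustrative choices, not part of the lecture.

```matlab
x = 0.5;
n_sig = 3;                          % required significant figures
es = 0.5 * 10^(2 - n_sig);          % stopping criterion in percent, Eq. 3.5
true_val = exp(x);

approx  = 1;                        % first estimate: one term of the series
n_terms = 1;
ea = Inf;                           % no previous estimate exists yet
while abs(ea) >= es
    prev    = approx;
    approx  = approx + x^n_terms / factorial(n_terms);  % add the next term
    n_terms = n_terms + 1;
    ea = (approx - prev) / approx * 100;                % Eq. 3.3
    et = (true_val - approx) / true_val * 100;          % Eq. 3.1
    fprintf('%d terms: %.9f   et = %.3g%%   ea = %.3g%%\n', n_terms, approx, et, ea);
end
```

The loop stops after six terms, matching the last row of the table. Changing n_sig tightens or loosens the criterion.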
Round-off errors in numerical computing
Round-off Errors
Digital computers have range and precision limits on their ability to represent numbers.
Certain numerical manipulations are highly sensitive to round-off errors. This can result from both mathematical considerations and the way in which computers perform arithmetic operations.
Computer Representation of numbers
[Figure: the ten decimal digits 0 to 9.]
We have 10 fingers and 10 toes, so the number system that we are most familiar with is the decimal, or base-10, number system.
Our friend the computer is like a two-fingered animal who is limited to two states, either 0 or 1. Numbers on the computer are represented with a binary, or base-2, system.
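Computer Representation of numbers
Number system
Decimal: 86,409
$8 \times 10^4 + 6 \times 10^3 + 4 \times 10^2 + 0 \times 10^1 + 9 \times 10^0 = 86{,}409$
Binary: 10101101
$1 \times 2^7 + 0 \times 2^6 + 1 \times 2^5 + 0 \times 2^4 + 1 \times 2^3 + 1 \times 2^2 + 0 \times 2^1 + 1 \times 2^0 = 173$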
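The positional sum for the binary example can be sketched in MATLAB as follows. The manual sum mirrors the expansion above, and the built-in bin2dec is shown only as a cross-check.

```matlab
bits   = '10101101';               % binary string from the slide
digits = bits - '0';               % convert the characters to numeric 0/1
powers = numel(bits)-1:-1:0;       % exponents 7, 6, ..., 0

value = sum(digits .* 2.^powers);  % positional sum
disp(value)                        % 173
disp(bin2dec(bits))                % built-in conversion gives the same result
```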
Integer numbers
Representation of integers
Signed magnitude method: employs the first bit of a word to indicate the sign, with a 0 for positive and a 1 for negative.
Example: The representation of the decimal integer 173 on a 16-bit computer using the signed magnitude method.
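A hedged sketch of the signed magnitude layout for this example: one sign bit followed by a 15-bit magnitude. The variable names are illustrative, and the code only builds the bit string; it does not claim to show how any particular computer stores integers.

```matlab
n = 173;                   % integer from the slide
word_size = 16;            % bits per word

if n < 0
    sign_bit = '1';        % 1 denotes negative
else
    sign_bit = '0';        % 0 denotes positive
end
magnitude = dec2bin(abs(n), word_size - 1);   % 15-bit magnitude, zero padded
word = [sign_bit magnitude];
disp(word)                 % 0000000010101101
```

Setting n = -173 changes only the leading bit, which is the defining property of the signed magnitude method.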
Integer numbers
Range of integers
Computer software uses a fixed word size (number of bits) to represent a number.
A 16-bit integer can cover the range from -32,768 to 32,767.
A 32-bit long integer can cover the range from -2,147,483,648 to 2,147,483,647.
16-bit: 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 (the largest positive value, 32,767)
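These limits can be queried in MATLAB with intmin and intmax. Note, as an aside beyond the slides, that MATLAB's int16 and int32 types saturate at the limit instead of wrapping around, which the last line illustrates.

```matlab
lo16 = intmin('int16');     % -32768
hi16 = intmax('int16');     %  32767
lo32 = intmin('int32');     % -2147483648
hi32 = intmax('int32');     %  2147483647

fprintf('int16: %d to %d\n', lo16, hi16);
fprintf('int32: %d to %d\n', lo32, hi32);

% Going past the limit does not wrap around: the result saturates
over = hi16 + 1;
disp(over)                  % still 32767
```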
Floating point numbers
In general, numbers can be represented approximately to a fixed number of significant digits (the significand) and scaled using an exponent:
Significant digits × base^exponent
Base 10:
$1234.5 = 1.2345 \times 10^{3}$
$-0.00012345 = -1.2345 \times 10^{-4}$
Base 2:
$110.110001_2 = 1.10110001_2 \times 2^{2}$
$-0.00110110_2 = -1.10110_2 \times 2^{-3}$

MATLAB handles floating-point numbers in either single precision or double precision (default setting) format, based on IEEE (Institute of Electrical and Electronics Engineers) Standard 754.
IEEE floating point numbers have three basic components: the sign (±), the exponent (e), and the mantissa:
$$\pm\,1.f \times 2^{\,e - \text{Bias}} \qquad \text{e.g. } 1.10110_2 \times 2^{-3}$$
Sign: 0 denotes a positive number, 1 denotes a negative number.
Exponent: e is stored as a non-negative number, so the bias is subtracted from it to produce both positive and negative exponents.
Mantissa: composed of the fraction bits (f) and an implicit leading digit (1).
This follows the general form: significant digits × base^exponent.
Computer Representation of numbers

The exponent field needs to represent both positive and negative exponents. To
do this, a bias is added to the actual exponent in order to get the stored exponent.
For IEEE single-precision floats (32 bits), the exponent field is 8 bits and the bias value is 127.
e.g., a stored value of 200 indicates an exponent of (200 - 127) = 73.
For double precision (64 bits), the exponent field is 11 bits and has a bias of 1023.
e.g., a stored value of 2000 indicates an exponent of (2000 - 1023) = 977.
                              Sign      Exponent      Fraction     Bias
Single precision (32 bits)    1 [31]    8 [30-23]     23 [22-0]    127
Double precision (64 bits)    1 [63]    11 [62-52]    52 [51-0]    1023
The numbers in brackets are bit positions, e.g. from bit 63 down to bit 0 for double precision.
Stored exponent value: 0 to 255 (base 10) for single precision, 0 to 2047 (base 10) for double precision.
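The three fields of a single precision number can be inspected in MATLAB with typecast and the bit functions. The sketch below is an illustration only; the test value is an assumption, chosen because it equals -1.10110 (base 2) × 2^-3 exactly.

```matlab
x = single(-0.2109375);            % equals -1.10110 (base 2) x 2^-3
bits = typecast(x, 'uint32');      % reinterpret the 32 bits as an unsigned integer

s = bitshift(bits, -31);                          % bit 31: sign
e = bitand(bitshift(bits, -23), uint32(255));     % bits 30-23: stored exponent
f = bitand(bits, uint32(2^23 - 1));               % bits 22-0: fraction

bias = 127;
actual_exp = double(e) - bias;     % remove the bias to get the actual exponent
value = (-1)^double(s) * (1 + double(f)/2^23) * 2^actual_exp;

fprintf('sign = %d, stored exponent = %d, actual exponent = %d\n', s, e, actual_exp);
fprintf('reconstructed value = %.7f\n', value);
```

The reconstruction adds back the implicit leading 1 and removes the bias, so it returns the original value.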

Significant digits × base^exponent, e.g. $1.10110_2 \times 2^{-3}$
$$\pm\,1.f \times 2^{\,e - \text{Bias}}$$
Bias: 127 for single precision floats, 1023 for double precision floats.
f: the fraction bits. 1: the implicit leading digit.
The mantissa is composed of the fraction (f) and an implicit leading digit (1).
Example 1
Example: consider the simplified floating point number $+0.1000_2$ and try to convert it into this format:
$$\pm\,1.f \times 2^{\,e - \text{Bias}}$$
Simplified 9-bit floating point number format (Bias = 7):
Sign: 1 bit | Exponent: 4 bits | Mantissa: 4 bits
The exponent bits carry the weights $2^3$ to $2^0$ and the mantissa bits carry the weights $2^{-1}$ to $2^{-4}$.
$+0.5_{10} = 0.1000_2 = 1.0000_2 \times 2^{-1}$ = 0 0110 0000
Sign: 0. Exponent: 7 + (-1) = 6, stored as 0110. Mantissa: 0000.
In normalized format, there is an implicit leading bit, i.e. $1.0000_2$. The stored bits therefore imply $1.0000_2 \times 2^{-1}$.
In decimal notation, $0.13_{10}$ can be written as $1.3_{10} \times 10^{-1}$ without changing the original value.
Similarly, $0.1000_2$ can be written as $1.0000_2 \times 2^{-1}$ without changing the original value.
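Example 2
$+0.25_{10} = 0.0100_2 = 1.0000_2 \times 2^{-2}$ = 0 0101 0000 (exponent: 7 + (-2) = 5; implicit leading bit: $1.0000_2$)
$3_{10} = 11.0000_2 = 1.1000_2 \times 2^{1}$ = 0 1000 1000 (exponent: 7 + 1 = 8; implicit leading bit: $1.1000_2$)
$0.1875_{10} = 0.0011_2 = 1.1000_2 \times 2^{-3}$ = 0 0100 1000 (exponent: 7 + (-3) = 4; implicit leading bit: $1.1000_2$)
Note: The implicit leading bit is 1.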
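The encoding steps of Examples 1 and 2 can be sketched in MATLAB. This is a minimal illustration of the simplified 9-bit format only: it chops extra fraction bits, and it assumes the stored exponent stays within 0 to 15. The variable names are hypothetical.

```matlab
bias   = 7;                      % bias of the simplified format
n_frac = 4;                      % stored mantissa (fraction) bits
values = [0.5 0.25 3 0.1875];    % numbers from Examples 1 and 2

codes = cell(size(values));
for k = 1:numel(values)
    x = values(k);
    [m, p] = log2(abs(x));       % abs(x) = m * 2^p with 0.5 <= m < 1
    mant = 2*m;                  % normalized mantissa, 1 <= mant < 2
    expo = p - 1;                % actual exponent
    frac = floor((mant - 1) * 2^n_frac);   % keep 4 fraction bits, chop the rest
    if x < 0
        sign_bit = '1';
    else
        sign_bit = '0';
    end
    codes{k} = [sign_bit ' ' dec2bin(expo + bias, 4) ' ' dec2bin(frac, n_frac)];
    fprintf('%-7g -> %s\n', x, codes{k});
end
```

The two-output form of log2 is used because it returns the mantissa and exponent directly, which avoids rounding issues from taking a logarithm and flooring it.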

Error
Irrational numbers cannot be represented exactly, e.g. $\pi$, $e$, $\sqrt{7}$.
A base-10 number cannot always be represented exactly in a base-2 floating point system. For example:
$0.1_{10} = 1\times2^{-4} + 1\times2^{-5} + 0\times2^{-6} + 0\times2^{-7} + 1\times2^{-8} + 1\times2^{-9} + 0\times2^{-10} + 0\times2^{-11} + 1\times2^{-12} + 1\times2^{-13} + \cdots$
$= 0.0001100110011\ldots_2 = 1.100110011\ldots_2 \times 2^{-4}$ (it has a "1100" sequence continuing endlessly)
In the simplified format (1 sign bit, 4 exponent bits, 4 mantissa bits) this is stored as 0 0011 1001, with exponent -4 + 7 = 3. More bits in the mantissa can increase accuracy.
In normalised format, there is an implicit leading bit (1), so 1001 is in fact 1.1001. This implies:
$1.1001_2 \times 2^{-4} = 1 \times 2^{-4} + 1 \times 2^{-5} + 1 \times 2^{-8} = 0.09765625$, relative error = 2.34%
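The same effect is visible in MATLAB's double precision, only much smaller. The sketch below prints the value actually stored for 0.1 and recomputes the relative error of the simplified 9-bit format from the slide.

```matlab
x = 0.1;
fprintf('%.20f\n', x);              % the stored double is close to, but not exactly, 0.1
is_equal = (0.1 + 0.2 == 0.3);      % false: the round-off on each side differs
disp(is_equal)

% Value kept by the simplified format: 1.1001 (base 2) x 2^-4
stored  = (1 + 2^-1 + 2^-4) * 2^-4;
rel_err = (x - stored) / x * 100;   % true percent relative error
fprintf('stored = %.8f, relative error = %.2f%%\n', stored, rel_err);
```

Because of this, comparing floating-point results with a small tolerance is usually safer than testing them with ==.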

Single precision floats can represent about 7 base-10 digits, e.g. $\pi$ can be expressed as 3.141593.
Double precision floats can represent about 15 base-10 digits, e.g. $\pi$ can be expressed as 3.14159265358979.
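A short MATLAB sketch of the difference between the two formats, using pi as in the slide. The printed digits beyond the format's precision are round-off, not information.

```matlab
p_single = single(pi);     % pi rounded to single precision
p_double = pi;             % pi in double precision (MATLAB default)

fprintf('single: %.15f\n', p_single);
fprintf('double: %.15f\n', p_double);

err_single = abs(double(p_single) - p_double);   % round-off introduced by single
fprintf('difference = %.3g\n', err_single);
fprintf('eps(single) = %.3g, eps(double) = %.3g\n', eps('single'), eps('double'));
```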

Range
Floating point numbers also have range limits. For example, in MATLAB:
>> format long
>> realmin
ans =
    2.225073858507201e-308
>> realmax
ans =
    1.797693134862316e+308
format long formats the output display to show double precision numbers. MATLAB's default setting is double precision format.
realmin is the smallest positive normalized double precision floating point number.
realmax is the largest positive normalized double precision floating point number.
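What happens outside these limits can be sketched as follows. The behaviour just below realmin (gradual loss of precision before reaching zero) is an IEEE 754 feature that the slides do not cover; it is included here only as a hedged aside.

```matlab
big   = realmax * 2;         % above the largest double: overflows to Inf
small = realmin / 2;         % below realmin: still nonzero, but with fewer significant bits
tiny  = realmin * eps / 2;   % below the smallest representable magnitude: underflows to 0

disp(big)
disp(small)
disp(tiny)
```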
Errors Caused by Arithmetic Manipulations of Computer Number
When two floating point numbers are added together, the numbers are first expressed so that they have the same exponents.
The simplified floating point addition can be illustrated in the following flowchart:
1. Start.
2. Compare the exponents of the two numbers.
3. Does the smaller exponent match the larger exponent?
   No: shift the mantissa of the smaller number to the right one bit (which raises its exponent by one), then repeat step 3.
   Yes: continue with step 4.
4. Add the mantissas together.
5. Normalise the sum.
6. Done.
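Why shift?
Consider the following simple base-10 problem and its base-2 counterpart.
Base 10: $3 \times 10^{2} + 4 \times 10^{-1} = \;?$
$4 \times 10^{-1} = 0.4 \times 10^{0} = 0.04 \times 10^{1} = 0.004 \times 10^{2}$ (the 4 shifts to the right)
$= (3 + 0.004) \times 10^{2} = 3.004 \times 10^{2}$
Base 2: $1.1_2 \times 2^{1} + 1.0_2 \times 2^{-1} = \;?$ (that is, $3_{10} + 0.5_{10}$)
$1.0_2 \times 2^{-1} = 0.10_2 \times 2^{0} = 0.010_2 \times 2^{1}$ (the 1.0 shifts to the right)
$= (1.1_2 + 0.010_2) \times 2^{1} = 1.110_2 \times 2^{1}$ (which is $3.5_{10}$)
That's how the computer does it.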

Errors Caused by Arithmetic Manipulations of Computer Number
For example: 3 + 0.1875 = 3.1875
$3_{10} = 11.0000_2 = 1.1000_2 \times 2^{1}$ = 0 1000 1000 (exponent: 7 + 1 = 8; implicit leading bit: $1.1000_2$)
$0.1875_{10} = 0.0011_2 = 1.1000_2 \times 2^{-3}$ = 0 0100 1000 (exponent: 7 + (-3) = 4; implicit leading bit: $1.1000_2$)
Convert to the same exponent as $3_{10}$: the mantissa has to shift right 1 - (-3) = 4 times, giving $0.00011_2 \times 2^{1}$. The last bit is outside the range limit of the mantissa, so the mantissa becomes $0.0001_2$.
After expressing both numbers with the same exponent, compute the sum of the aligned mantissas:
$1.1000_2 + 0.0001_2 = 1.1001_2$
Therefore $3_{10} + 0.1875_{10}$ = 0 1000 1001
$= 2^{1} \times (1 \times 2^{0} + 1 \times 2^{-1} + 1 \times 2^{-4})$
$= 3.125$,
i.e. error = 0.0625, error% = 0.0625 × 100 / 3.1875 = 1.96%
of Computer Number (cont)
Adding a small number to a number. For
example: 110+0.0312510=1.0312510
110 = 1.00002 x 20 = 0 0111 0000
7+0=7 7+(-5)=2
0.0312510 = 1.00002 x 2-5 = 0 0010 0000
Shift mantissa right 5 times in order to
convert the exponent to match 110
This bit is outside the range of the mantissa

The mantissa is now : 0.000012
Add the mantissas together 1.00002 + 0.00002 = 1.00002
Therefore: 110+0.0312510= 0 0111 00002 = 1.0000010 The addition does not

produce the expected
result
45
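The two additions above can be imitated in MATLAB by aligning the smaller operand to the larger one and dropping the bits that fall off the 4-bit mantissa. This is a simplified sketch under stated assumptions: both operands are positive, bits are chopped rather than rounded, and the final normalisation step is omitted because neither sum needs it. The name ulp (weight of the last stored bit) is introduced here for the example.

```matlab
n_frac = 4;                                % mantissa (fraction) bits of the simplified format
pairs  = [3 0.1875; 1 0.03125];            % the two additions worked above
sums   = zeros(size(pairs, 1), 1);

for k = 1:size(pairs, 1)
    big   = max(pairs(k, :));
    small = min(pairs(k, :));
    [~, p] = log2(big);                    % big = m * 2^p with 0.5 <= m < 1
    ulp = 2^((p - 1) - n_frac);            % weight of the last stored bit of big
    aligned = floor(small / ulp) * ulp;    % shift right and drop the bits that fall off
    sums(k) = big + aligned;               % add the aligned mantissas
    fprintf('%g + %g -> %g (exact: %g)\n', big, small, sums(k), big + small);
end
```

The first sum loses part of the smaller operand and the second loses it entirely, in line with the hand calculations.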

Machine epsilon represents the finest level of resolution that is possible for floating-point arithmetic.
In MATLAB, the command eps tells you the epsilon value for MATLAB:
>> format long
>> eps
ans =
    2.220446049250313e-016
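A brief MATLAB sketch of what this resolution means in practice: adding eps to 1 gives a new number, while adding half of it is lost. The last line shows that the spacing grows with the magnitude of the number.

```matlab
a = 1 + eps;          % the next double after 1
b = 1 + eps/2;        % too small to register: rounds back to 1

disp(a > 1)           % 1 (true)
disp(b == 1)          % 1 (true)
disp(eps(1000))       % spacing of doubles near 1000, larger than eps
```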

Large numbers of arithmetic operations can make the accumulated errors become significant.
See the following MATLAB examples:
>> s=0;
>> for i=1:100
s=s+0.0001;
end
>> s
s =
   0.01000000000000
The correct value is 0.01. For a smaller number of computations, MATLAB can still compute the result correctly.
>> s=0;
>> for i=1:10000
s=s+0.0001;
end
>> s
s =
   0.99999999999991
The correct value should be 1. The accumulated error becomes significant when there is a large amount of computation!
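The accumulated error can be measured directly, and one hedged way to avoid it is to accumulate an integer count and scale once at the end. The second loop is an illustration of that idea, not a method prescribed by the lecture.

```matlab
% Repeated floating-point addition: every step rounds the running sum
s = 0;
for i = 1:10000
    s = s + 0.0001;
end
err_sum = abs(s - 1);          % accumulated round-off error
fprintf('loop sum = %.15f, error = %.3g (about %.0f eps)\n', s, err_sum, err_sum/eps);

% Counting in whole numbers is exact; scale only once at the end
n = 0;
for i = 1:10000
    n = n + 1;
end
s_int = n / 10000;
fprintf('scaled count = %.15f\n', s_int);
```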
Summary
Computers represent numbers in a fixed word size. As a result, the range and precision of the numbers that the computer/software can handle are limited.
Normally, computer/software will provide single precision and double precision formats for handling numbers.
Double precision has a bigger range and higher precision than single precision.
If the numbers are outside the range or precision limits of the single precision format, double precision should be used.
Double precision occupies more computer memory and takes longer to process. Small microprocessor systems may not be fast enough to handle too many double precision floating point calculations.
Most real numbers cannot be represented exactly as floating-point numbers in a computer. So, if possible, use integer numbers for calculation.
When adding two numbers, make sure the smaller one is not below machine epsilon relative to the larger one; otherwise it is lost in the addition.
Small errors can accumulate to form a more significant error. So, try to avoid performing large numbers of arithmetic operations on the same variables.
Learning Outcomes
After this lecture, students should be able to understand the following:
Basic concepts: accuracy, precision, significant figures, errors
How to quantify error
How error estimates can be used to terminate an iterative calculation
How round-off errors occur because digital computers have a limited ability to represent numbers
IEEE 754 floating-point number format
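Appendix
-Stopping criterion-
The following criterion assures that the result is correct to at least n significant figures.
$$\varepsilon_s = (0.5 \times 10^{2-n})\% \qquad \text{(Eq. 3.5)}$$
Proof: For the true value x and the approximation value x*, the stopping criterion is defined as:
$$|\varepsilon_a| < \varepsilon_s, \qquad |\varepsilon_a| = \frac{|x - x^*|}{|x^*|}$$
When:
$$\varepsilon_s = \frac{1}{2} \times 10^{2-n}\,\% = \frac{1}{2} \times 10^{-n}$$
We have:
$$\frac{|x - x^*|}{|x^*|} \le \frac{1}{2} \times 10^{-n} \qquad \text{(Equation 1)}$$
Write the true value and the approximation value as:
$$x = 0.a_1 a_2 a_3 \ldots \times 10^{m} \quad (m \text{ is an integer})$$
$$x^* = 0.b_1 b_2 b_3 \ldots \times 10^{m} \quad (m \text{ is an integer})$$
It is clear that:
$$|x^*| < 10^{m}$$
So, we have:
$$\frac{|x - x^*|}{10^{m}} < \frac{|x - x^*|}{|x^*|} \le \frac{1}{2} \times 10^{-n}$$
Or written as:
$$|x - x^*| < \frac{1}{2} \times 10^{m-n} \qquad \text{(Equation 2)}$$
The definition of n significant figures is:
$$|x - x^*| \le \frac{1}{2} \times 10^{m-n} \qquad \text{(Equation 3)}$$
Because Equation 2 gives $|x - x^*| < \frac{1}{2} \times 10^{m-n}$, Equation 3 is satisfied.
Therefore, this criterion can assure that the result is correct to at least n significant figures.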
Single precision floating point (32 bits) for π
The real number π, represented in binary as an infinite series of bits, is:
11.0010010000111111011010101000100010000101101000110000100011010011...
but is:
11.0010010000111111011011
when approximated by rounding to a precision of 24 bits.
In binary single-precision floating-point (32 bits), this is represented with the significand s = 1.10010010000111111011011. This has a decimal value of 3.1415927410125732421875.
Addition by number shifting
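Base-2 → Base-10
Place values: $2^{2}\;\; 2^{1}\;\; 2^{0}\;.\;2^{-1}\;\; 2^{-2}\;\; 2^{-3}$
$101.101_2$
$= 1 \times 2^{2} + 0 \times 2^{1} + 1 \times 2^{0} + 1 \times 2^{-1} + 0 \times 2^{-2} + 1 \times 2^{-3}$
$= 4 + 0 + 1 + 0.5 + 0 + 0.125$
$= 5.625$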
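Base-10 → Base-2 (Optional)
Convert $6.1875_{10}$:
Integer part: divide by 2 repeatedly and record the remainders.
6 / 2 = 3, remainder 0
3 / 2 = 1, remainder 1
1 / 2 = 0, remainder 1
Reading the remainders from last to first gives $110_2$.
Fractional part: multiply by 2 repeatedly and record the carry.
0.1875 x 2 = 0.3750, carry 0
0.3750 x 2 = 0.7500, carry 0
0.7500 x 2 = 1.5000, carry 1 (remove the carry bit, leaving 0.5000)
0.5000 x 2 = 1.0000, carry 1
Reading the carries from first to last gives $.0011_2$.
Result: $6.1875_{10} = 110.0011_2 = 1.100011_2 \times 2^{2}$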
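The two procedures above (repeated division for the integer part, repeated multiplication for the fraction) can be sketched in MATLAB as follows. The cap on the number of fraction bits is an assumption added for the example, since many decimal fractions never terminate in binary.

```matlab
x = 6.1875;
int_part  = floor(x);
frac_part = x - int_part;

% Integer part: divide by 2 and collect remainders (the last remainder is the leading bit)
int_bits = '';
while int_part > 0
    int_bits = [char('0' + mod(int_part, 2)) int_bits];
    int_part = floor(int_part / 2);
end

% Fractional part: multiply by 2 and collect carries (the first carry is the leading bit)
frac_bits = '';
max_bits  = 16;                        % assumed cap on the number of fraction bits
while frac_part > 0 && numel(frac_bits) < max_bits
    frac_part = frac_part * 2;
    carry     = floor(frac_part);
    frac_bits = [frac_bits char('0' + carry)];
    frac_part = frac_part - carry;     % remove the carry bit
end

result = [int_bits '.' frac_bits];
disp(result)                           % 110.0011
```

Trying x = 0.1 fills all 16 fraction bits without terminating, which is the repeating pattern behind the round-off error discussed earlier.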
