STATISTICS IN ENGINEERING
Semester 2, 2010
© School of Mathematical Sciences
© STATS 7053 Statistics in Engineering 2010 1-2
Line charts
[Figure: line chart; horizontal axis: Number of Fibres (0 to 5)]
Histograms
Example
Compressive strengths, X, in MPa for a random sample of 200 concrete pavers are summarised below.
[Figure: histogram; horizontal axis: Compressive Strength (40 to 80); vertical axis: Frequency (0 to 50)]
It is calculated as follows:
2. Let m = (n + 1)/2.
M = x(m).
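The position rule above can be sketched in Python as follows. This is a minimal sketch: the function name `sample_median` is hypothetical, and the handling of even n (averaging the two order statistics on either side of position m) is an assumption, chosen to match the interpolation convention these notes use for quartiles.

```python
def sample_median(data):
    """Median via the position m = (n + 1)/2 in the ordered sample."""
    x = sorted(data)              # order statistics x(1) <= ... <= x(n)
    n = len(x)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    if n % 2 == 1:
        m = (n + 1) // 2          # integer position, 1-based
        return x[m - 1]           # M = x(m)
    # Assumption: for even n, m = r + 0.5, so average x(r) and x(r+1).
    r = n // 2
    return (x[r - 1] + x[r]) / 2


print(sample_median([7, 3, 5]))        # 5
print(sample_median([7, 3, 5, 10]))    # 6.0
```

The two small samples are illustrative only and are not taken from the course data.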
The mode
[Figure: plot with horizontal axis Maximum Inflow; vertical axis ticks 0 to 10]
The sample mean is given by
$$\bar{x} = \frac{1}{25}\{1864 + 44 + \cdots + 412\} = \frac{21471}{25} = 858.84.$$
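The division can be checked directly. In the sketch below only the total 21471 and the sample size n = 25 come from the notes; the helper `sample_mean` and the short list passed to it are hypothetical illustrations, not the Hardap Dam data.

```python
def sample_mean(data):
    """Sample mean: sum of the observations divided by their number."""
    if not data:
        raise ValueError("mean of an empty sample is undefined")
    return sum(data) / len(data)


# Total and sample size quoted in the notes.
total, n = 21471, 25
print(total / n)                      # 858.84

# Hypothetical small sample, for illustration only.
print(sample_mean([2, 4, 9]))         # 5.0
```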
1.5 Measures of Dispersion
Example
Properties of s
• s ≥ 0.
• s = 0 if and only if x1 = x2 = ... = xn.
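Both properties can be checked numerically. The sketch below assumes the usual definition of the sample standard deviation with divisor n − 1, which does not appear in this extract; `sample_sd` is a hypothetical helper name and the data are illustrative.

```python
import math


def sample_sd(data):
    """Sample standard deviation, assuming the divisor n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    xbar = sum(data) / n
    ss = sum((x - xbar) ** 2 for x in data)   # sum of squared deviations
    return math.sqrt(ss / (n - 1))


print(sample_sd([4, 4, 4, 4]))       # 0.0, since all observations are equal
print(sample_sd([1, 2, 3, 4]) > 0)   # True
```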
Example
For the Hardap Dam data, x̄ = 858.84 and s = 1312.8. Shown below are the percentages of observations in various ranges.
Remarks
Example (continued)
If we take natural logs, the data are approximately normal.
[Figure: histogram; horizontal axis: Ln(Peak Inflow) (3 to 9); vertical axis: Frequency (0 to 8)]
The coefficient of variation
• If q is an integer, then
LQ = x(q);
• If q = r + 0.25 then
LQ = (3x(r) + x(r+1))/4;
• If q = r + 0.5 then
LQ = (x(r) + x(r+1))/2;
• If q = r + 0.75 then
LQ = (x(r) + 3x(r+1))/4.
The upper quartile is defined similarly to have position 3(n + 1)/4 in the ordered sample.
IQR = UQ − LQ.
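The interpolation rules for the quartiles can be sketched as below. The sketch assumes the lower quartile has position q = (n + 1)/4, by analogy with the stated upper-quartile position 3(n + 1)/4; `value_at_position` and `quartiles` are hypothetical helper names, and the data are illustrative.

```python
def value_at_position(x_sorted, q):
    """Value at 1-based position q, interpolating between x(r) and x(r+1)."""
    n = len(x_sorted)
    if not 1 <= q <= n:
        raise ValueError("position outside the ordered sample")
    r = int(q)            # integer part of the position
    frac = q - r          # 0, 0.25, 0.5 or 0.75 for quartile positions
    if frac == 0:
        return x_sorted[r - 1]
    # Weighted average: e.g. frac = 0.25 gives (3 x(r) + x(r+1)) / 4.
    return (1 - frac) * x_sorted[r - 1] + frac * x_sorted[r]


def quartiles(data):
    """Return (LQ, UQ, IQR) using positions (n+1)/4 and 3(n+1)/4."""
    x = sorted(data)
    n = len(x)
    lq = value_at_position(x, (n + 1) / 4)
    uq = value_at_position(x, 3 * (n + 1) / 4)
    return lq, uq, uq - lq


print(quartiles([1, 2, 3, 4, 5, 6, 7]))      # (2, 6, 4)
print(quartiles([10, 20, 30, 40, 50, 60]))   # (17.5, 52.5, 35.0)
```

With n = 6 the lower-quartile position is 1.75, so the second example exercises the rule LQ = (x(r) + 3x(r+1))/4 with r = 1.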
The range
Skewness
[Figure: four histograms with Frequency on the vertical axis, illustrating skewness; recoverable panel labels: hardap, x, Skewness=−1.21, Skewness=0.90]
The boxplot
[Figure: boxplots; one with vertical axis 0 to 5000, and one by Batch (1 to 5) with vertical axis 0.6 to 1.0]
 i   xi   yi      i   xi   yi
 1    3    5      9    5   13
 2   15    1     10   12   57
 3   19    8     11    6   15
 4    7    9     12   20   60
 5    5   10     13   11   73
 6    6   16     14   13   81
 7   10   39     15    5   22
 8   13   40     16   10   95
A scatter plot of the data is shown below and the summary statistics are
[Figure: scatter plot; horizontal axis: carbon monoxide (5 to 20); vertical axis: benzoa pyrene (0 to 80)]
In general:
[Figure: scatter plot of benzoa pyrene (vertical axis) against carbon monoxide (horizontal axis), with each point marked + or − and one point marked ●]
The same will be true for points in the upper left quadrant.
The sample correlation coefficient
• −1 ≤ r ≤ 1;
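As an illustration, r can be computed for the sixteen (xi, yi) pairs tabulated earlier. The formula for r does not appear in this extract, so the sketch assumes the usual Pearson form r = Sxy / sqrt(Sxx Syy); `sample_correlation` is a hypothetical helper name.

```python
import math

# Data (xi, yi) as listed in the table in the notes.
x = [3, 15, 19, 7, 5, 6, 10, 13, 5, 12, 6, 20, 11, 13, 5, 10]
y = [5, 1, 8, 9, 10, 16, 39, 40, 13, 57, 15, 60, 73, 81, 22, 95]


def sample_correlation(x, y):
    """Pearson sample correlation r = Sxy / sqrt(Sxx * Syy) (assumed formula)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = sample_correlation(x, y)
print(round(r, 2))     # 0.35
print(-1 <= r <= 1)    # True
```

For these values the sample means are 10 and 34, with Sxy = 816, Sxx = 394 and Syy = 13794.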
[Figure: twelve scatter plots illustrating sample correlations r=0, r=0.1, r=−0.2, r=0.3, r=−0.4, r=0.5, r=−0.6, r=0.7, r=−0.8, r=0.9, r=−0.95, r=0.99]
[Figure: two scatter plots, labelled r=0.032 and r=0.9]
[Figure: time series plot; horizontal axis: Time]
A very simple model for a time series is that it is composed of several components:
• A trend;
• A periodic component;
• Random noise.
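A series of this kind can be simulated by building each component separately. The sketch below is illustrative only: the additive combination, the linear trend, the sinusoid with period 12 and the standard normal noise are all assumptions, as are the variable names.

```python
import math
import random

random.seed(1)                      # fixed seed so the sketch is reproducible
n = 100
time = list(range(1, n + 1))

# All three component forms below are illustrative assumptions.
trend = [0.04 * t for t in time]                           # linear trend
periodic = [math.sin(2 * math.pi * t / 12) for t in time]  # period of 12 steps
noise = [random.gauss(0, 1) for _ in time]                 # independent N(0, 1)

# Assumed additive model: series = trend + periodic component + random noise.
series = [a + b + c for a, b, c in zip(trend, periodic, noise)]

print(len(series))
print([round(v, 2) for v in series[:3]])
```

Plotting `trend`, `periodic`, `noise` and `series` against `time` gives one panel per component plus one for the combined series.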
[Figure: four panels plotted against Time (0 to 100): Trend Term (vertical axis x), Periodic Term (vertical axis s), and two further panels with vertical axes n and y]
rk = gk/g0.
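The ratio can be computed directly once gk is available. The definition of gk does not appear in this extract, so the sketch below assumes the common sample autocovariance with divisor n; the function names and the short series are hypothetical.

```python
def autocovariance(x, k):
    """Lag-k sample autocovariance g_k, assuming the divisor n."""
    n = len(x)
    xbar = sum(x) / n
    return sum((x[t] - xbar) * (x[t + k] - xbar) for t in range(n - k)) / n


def autocorrelation(x, k):
    """Lag-k sample autocorrelation r_k = g_k / g_0."""
    g0 = autocovariance(x, 0)
    if g0 == 0:
        raise ValueError("series is constant, so r_k is undefined")
    return autocovariance(x, k) / g0


# Hypothetical short series, for illustration only.
z = [1, 2, 3, 4, 5]
print(autocorrelation(z, 0))   # 1.0
print(autocorrelation(z, 1))   # 0.4
```

By construction r0 = 1 for any non-constant series, which is why the lag-0 value in an acf plot is always 1.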
[Figure: White Noise; time series plot of z1 against Time (0 to 100), and ACF of Series z1 against Lag (0 to 20)]
Fig. 1.18: Time series plot and acf function for white noise. Note that the only non-zero correlation is at lag 0.
[Figure: time series plot against Time (0 to 100), and ACF of Series z2 against Lag (0 to 20)]
Fig. 1.19: Time series plot and acf function for serial dependence. Note the presence of positive autocorrelation at lags 1, 2, 3.