Prepared by:
Engr. Kenny B. Cantila
Traffic Engineering
(number of accidents, number of vehicles, traffic delay, routing)
Hydrology
(rainfall intensity, flood frequency analysis, return period)
Earthquake Engineering
(earthquake occurrence, ground motions, seismic risk assessment)
Material Testing
(compressive strength, tensile strength, modulus of rupture, density, unit
weight, moisture content, porosity, void ratio)
Rainfall Data
Graphical Representation
Table 1: Number of flood occurrences per year from 1939 to 1972 at the
gauging station of Calamazza on the Magra River, between Pisa
and Genoa in northwestern Italy.
Number of floods in a year    Number of occurrences
0                             0
1                             2
2                             6
3                             7
4                             9
5                             4
6                             1
7                             4
8                             1
9                             0
Total                         34
A flood occurrence is defined as a river discharge exceeding 300 m³/s.
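The flood table can be summarized numerically. The following is a minimal Python sketch, assuming the ten (floods per year, occurrences) pairs listed in Table 1; the variable names are illustrative and not part of the original slides.

```python
# Table 1: number of floods in a year -> number of years with that count
flood_counts = {0: 0, 1: 2, 2: 6, 3: 7, 4: 9, 5: 4, 6: 1, 7: 4, 8: 1, 9: 0}

# Total number of years in the record (1939 to 1972)
total_years = sum(flood_counts.values())

# Total number of floods observed over the whole record
total_floods = sum(k * f for k, f in flood_counts.items())

# Average number of floods per year
mean_floods_per_year = total_floods / total_years

# Relative frequency of each yearly flood count
relative_frequency = {k: f / total_years for k, f in flood_counts.items()}

print(total_years, total_floods, round(mean_floods_per_year, 2))
```

The total of 34 years agrees with the last row of the table.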
Dot Diagram
A dot diagram is used to present continuous data. When the data are few (say,
fewer than 25 items), it is a useful visual aid.
Table 2: The first 15 items of modulus of rupture data measuring timber
strengths, in N/mm²
29.11   29.93   32.02   32.40   33.06
34.12   35.58   39.34   40.53   41.64
45.54   48.37   48.78   50.98   65.35
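A dot diagram of the Table 2 values can be approximated in plain text. This is only a rough sketch: the function name `dot_diagram`, the axis limits of 25 and 70 N/mm², and the 60-character axis are assumptions chosen for illustration, and all values must lie within the chosen limits.

```python
timber = [29.11, 29.93, 32.02, 32.40, 33.06, 34.12, 35.58, 39.34,
          40.53, 41.64, 45.54, 48.37, 48.78, 50.98, 65.35]

def dot_diagram(values, lo, hi, width=60):
    """Return a text line with '.' per observation (':' where several overlap)."""
    cells = [0] * (width + 1)
    for v in values:
        # Scale the value to a character position along the axis
        pos = round((v - lo) / (hi - lo) * width)
        cells[pos] += 1
    return "".join(" " if c == 0 else "." if c == 1 else ":" for c in cells)

line = dot_diagram(timber, lo=25.0, hi=70.0)
print(line)
print("25" + " " * 57 + "70")  # axis end labels, in N/mm2
```

The isolated mark at the right-hand end corresponds to the largest value, 65.35 N/mm².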
Histogram
If there are at least, say, 25 observations, one of the most common graphical
forms is a block diagram called the histogram. For this purpose, the data are
divided into groups according to their magnitudes. The horizontal axis of the
graph gives the magnitudes. Blocks are drawn to represent the groups, each
of which has a distinct upper and lower limit. The area of a block is
proportional to the number of occurrences in the group. The variability of the
data is shown by the horizontal spread of the blocks, and the most common
values are found in blocks with the largest areas.
To draw a histogram, one divides the range into a number of classes or cells
nc. The number of occurrences in each class is counted and tabulated. These
are called frequencies.
$$ n_c = \frac{r\,n^{1/3}}{2\,\mathrm{iqr}}, \qquad \mathrm{iqr} = Q_3 - Q_1 $$
where:
nc  = number of classes
n   = number of data samples
iqr = interquartile range
r   = range
Q1  = first quartile (median of the lower half of the data)
Q3  = third quartile (median of the upper half of the data)
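The class-count rule can be checked numerically. Below is a minimal Python sketch, assuming the values r = 70.22, n = 165 and iqr = 11.66 that appear in the timber example of these notes; the function name `number_of_classes` is illustrative.

```python
def number_of_classes(r, n, iqr):
    """Class count from nc = r * n**(1/3) / (2 * iqr)."""
    return r * n ** (1.0 / 3.0) / (2.0 * iqr)

# Values taken from the timber strength example (assumed inputs)
nc = number_of_classes(r=70.22, n=165, iqr=11.66)
print(round(nc, 2))
```

The result is a real number; the analyst still has to choose a nearby whole number of classes that gives convenient class limits.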
Frequency Polygon
A frequency polygon is a useful diagnostic tool to determine the distribution
of a variable. It can be drawn by joining the midpoints of the tops of the
rectangles of a histogram after extending the diagram by one class on both
sides. We assume that equal class widths are used. If the ordinates of a
histogram are divided by the total number of observations, then a relative
frequency histogram is obtained. Thus, the ordinates for each class denote the
probabilities bounded by 0 and 1, by which we simply mean the chances of
occurrence. The resulting diagram is called the relative frequency polygon.
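The construction described above can be sketched in Python. The example assumes equal class widths and uses the class centers and absolute frequencies of the nine-class timber table in these notes; the function name `frequency_polygon` is illustrative.

```python
def frequency_polygon(centers, counts):
    """Return (x, relative frequency) vertices of a relative frequency polygon.

    One empty class is added on each side so the polygon closes on the axis.
    Equal class widths are assumed.
    """
    width = centers[1] - centers[0]
    total = sum(counts)
    xs = [centers[0] - width] + list(centers) + [centers[-1] + width]
    ys = [0.0] + [c / total for c in counts] + [0.0]
    return list(zip(xs, ys))

centers = [4, 12, 20, 28, 36, 44, 52, 60, 68]   # class centers, N/mm2
counts = [1, 0, 7, 27, 58, 48, 16, 4, 4]        # absolute frequencies
points = frequency_polygon(centers, counts)

for x, y in points:
    print(x, round(y, 3))
```

Because each ordinate is divided by the total number of observations, the ordinates sum to 1.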
http://stattrek.com/statistics/charts/cumulative-plot.aspx
Class Width
nc = 1 + 3.3 log n   (base-10 logarithm)
nc = 1 + 3.3 log(165)
nc = 8.32, say 9
w = r/nc
w = 70.22/9
w = 7.80, say 8
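The same arithmetic can be reproduced in a few lines of Python. This is a sketch under the assumption that "log" is the base-10 logarithm and that both the class count and the class width are rounded up to the next whole number, which is consistent with the class limits of 8, 16, ..., 72 N/mm² used in the table.

```python
import math

n = 165             # number of observations
data_range = 70.22  # range r of the data, N/mm2

nc_raw = 1 + 3.3 * math.log10(n)   # rule used in the notes
nc = math.ceil(nc_raw)             # round up to a whole number of classes

w_raw = data_range / nc            # raw class width
w = math.ceil(w_raw)               # round up to a convenient width

print(round(nc_raw, 2), nc, round(w_raw, 2), w)
```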
Class Upper   Class      Absolute    Relative    Cumulative
Limit         Center     Frequency   Frequency   Relative
(N/mm²)       (N/mm²)                            Frequency (%)
8             4          1           0.006       0.61
16            12         0           0.000       0.61
24            20         7           0.042       4.85
32            28         27          0.164       21.21
40            36         58          0.352       56.36
48            44         48          0.291       85.45
56            52         16          0.097       95.15
64            60         4           0.024       97.58
72            68         4           0.024       100.00
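The relative and cumulative columns follow directly from the absolute frequencies. The following Python sketch, with illustrative variable names, recomputes them for the nine classes; the cumulative values are expressed in percent, as in the table.

```python
from itertools import accumulate

upper_limits = [8, 16, 24, 32, 40, 48, 56, 64, 72]  # N/mm2
counts = [1, 0, 7, 27, 58, 48, 16, 4, 4]            # absolute frequencies

n = sum(counts)  # total number of observations

# Relative frequency of each class
relative = [c / n for c in counts]

# Cumulative relative frequency, in percent
cumulative_pct = [100 * c / n for c in accumulate(counts)]

for limit, c, rel, cum in zip(upper_limits, counts, relative, cumulative_pct):
    print(limit, c, round(rel, 3), round(cum, 2))
```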
[Figures: two relative frequency plots for the nine classes of width 8 N/mm²
(0-7.99 to 64.00-71.99). Vertical axis: Relative Frequency, 0.00 to 0.40.]
Q2 = 39.05
$$ Q_3 = \frac{44.54 + 44.59}{2} = 44.57 $$
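The "median of each half" definition of the quartiles can be illustrated in Python. The sketch below is applied to the 15 timber values of Table 2, not to the full 165-item data set, which is not listed in these notes. The function name `quartiles` and the choice to leave the middle value out of both halves when the sample size is odd are assumptions.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) with Q1 and Q3 as medians of the lower and upper halves.

    Needs at least two values. For an odd sample size the middle value
    is excluded from both halves (an assumed convention).
    """
    xs = sorted(data)
    half = len(xs) // 2
    lower = xs[:half]
    upper = xs[-half:]
    return median(lower), median(xs), median(upper)

timber = [29.11, 29.93, 32.02, 32.40, 33.06, 34.12, 35.58, 39.34,
          40.53, 41.64, 45.54, 48.37, 48.78, 50.98, 65.35]

q1, q2, q3 = quartiles(timber)
iqr = q3 - q1
print(q1, q2, q3, round(iqr, 2))
```

When a half contains an even number of values, its median is the average of the two middle values, which is how Q3 is obtained in the calculation above.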
$$ n_c = \frac{r\,n^{1/3}}{2\,\mathrm{iqr}} = \frac{(70.22)(165)^{1/3}}{2(11.66)} = 16.52, \text{ say } 15 $$
Class Width
w = r/nc
  = 70.22/15
  = 4.68, say 5
Class Upper   Absolute    Relative    Cumulative
Limit         Frequency   Frequency   Relative
(N/mm²)                               Frequency (%)
5             1           0.006       0.61
10            0           0.000       0.61
15            0           0.000       0.61
20            1           0.006       1.21
25            9           0.055       6.67
30            18          0.109       17.58
35            26          0.158       33.33
40            38          0.230       56.36
45            34          0.206       76.97
50            20          0.121       89.09
55            9           0.055       94.55
60            5           0.030       97.58
65            0           0.000       97.58
70            3           0.018       99.39
75            1           0.006       100.00
[Figures: two relative frequency plots for the classes of width 5 N/mm²
(0-4.99 to 70.00-74.99). Vertical axis: Relative Frequency, 0.000 to 0.250.]
Density (kg/m³)
2,411   2,415   2,425   2,427   2,427
2,428   2,429   2,433   2,434   2,435
Total: 24,264
Given the data from the concrete test above, perform the following:
Arithmetic mean:
$$ \bar{x} = \frac{\sum x}{n} = \frac{24{,}264}{10} = 2{,}426.4 \ \mathrm{kg/m^3} $$
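The mean density can be verified with a short Python sketch using the ten values from the concrete test; the variable names are illustrative.

```python
# Concrete densities from the test above, in kg/m3
densities = [2411, 2415, 2425, 2427, 2427, 2428, 2429, 2433, 2434, 2435]

total = sum(densities)                 # sum of all observations
mean_density = total / len(densities)  # arithmetic mean

print(total, mean_density)
```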
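$$ \bar{x} = \frac{\sum x}{n} = \frac{537.5}{10} = 53.75 $$
Median (average of the two middle values of the ranked data):
$$ \frac{53.4 + 54.4}{2} = 53.9 $$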
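Harmonic mean:
$$ \bar{x}_h = \frac{1}{\dfrac{1}{n}\sum_{i=1}^{n}\dfrac{1}{x_i}} $$
$$ \bar{x}_h = \frac{1}{\dfrac{1}{3}\left(\dfrac{1}{0.20} + \dfrac{1}{0.24} + \dfrac{1}{0.16}\right)} = 0.19 $$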
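The harmonic mean calculation can be reproduced in Python. This is a minimal sketch using the three values 0.20, 0.24 and 0.16 from the calculation above; the function name `harmonic_mean` is illustrative, and all values are assumed to be positive.

```python
def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals (positive values only)."""
    n = len(values)
    return n / sum(1.0 / v for v in values)

hm = harmonic_mean([0.20, 0.24, 0.16])
print(round(hm, 2))
```

The harmonic mean is never larger than the arithmetic mean of the same positive values, which here is 0.20.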
Geometric mean:
$$ \bar{x}_g = \left( \prod_{i=1}^{n} x_i \right)^{1/n} $$
Example: Suppose a storage bin for cement has dimensions 9.0 m x 2.0 m x
1.5 m. Find the mean value of the length, width and height such that the
volume of the bin remains the same.
$$ \bar{x}_g = (9 \times 2 \times 1.5)^{1/3} = 27^{1/3} = 3 \ \mathrm{m} $$
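The storage bin example can be checked in Python. The sketch below computes the geometric mean through logarithms, which is an implementation choice and not something stated in the notes; the function name `geometric_mean` is illustrative.

```python
import math

def geometric_mean(values):
    """n-th root of the product of n positive values, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

side = geometric_mean([9.0, 2.0, 1.5])  # bin dimensions in metres
print(round(side, 6), round(side ** 3, 6))
```

A cube with this side length has the same volume as the original bin, 27 m³.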
Population growth: Consider the case of populations of towns and cities that
increase geometrically, which means that a future increase is expected that is
proportional to the current population. Such information is invaluable for
planning and designing urban water supplies and sewerage systems. Suppose,
for example, that according to a census conducted in 1970 and again in 1990
the population of a city had increased from 230,000 to 310,000. An engineer
needs to verify, for purposes of design, the per capita consumption of water in
the intermediate period and hence tries to estimate the population in 1980.
The central value to use in this situation is the geometric mean of the two
numbers which is
$$ \bar{x}_g = \left( \prod_{i=1}^{n} x_i \right)^{1/n} $$
$$ \bar{x}_g = (230{,}000 \times 310{,}000)^{1/2} = 267{,}021 $$
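The population estimate can be reproduced in Python. The sketch below also shows, as an added cross-check, that the geometric mean equals the value obtained by applying a constant annual growth factor for ten years; the variable names are illustrative.

```python
import math

p1970, p1990 = 230_000, 310_000

# Geometric mean of the two census values: estimate for the midpoint year 1980
p1980 = math.sqrt(p1970 * p1990)

# Cross-check: constant annual growth factor over the 20 years, applied for 10 years
annual_growth = (p1990 / p1970) ** (1 / 20)
p1980_growth = p1970 * annual_growth ** 10

print(round(p1980), round(p1980_growth))
```

For comparison, the arithmetic mean of the two census values is 270,000, which would overestimate the 1980 population if growth is geometric.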
Measures of Dispersion
A measure of dispersion represents the degree of scatter shown by
observations or the inherent variability in a phenomenon under
observation. Dispersion also indicates the precision of the data. One
method of quantification uses order statistics, that is, statistics based
on ranked data. The simplest in this category is the range.
Mean absolute deviation (d) - measures the average absolute deviation from
the sample mean.
$$ d = \frac{1}{n} \sum_{i=1}^{n} \left| x_i - \bar{x} \right| $$
Standard deviation (s) - the root mean square deviation about the mean.
$$ s = \sqrt{ \frac{ \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2 }{ n } } $$
Example: Annual rainfall. If the annual rainfalls in a city are 50, 56, 42,
53, and 49 cm over a 5-year period, with a mean value of 50 cm, determine
the following:
a. Mean absolute deviation from the mean
b. Standard deviation
Part a:
$$ d = \frac{1}{5}\left( |50-50| + |56-50| + |42-50| + |53-50| + |49-50| \right) = \frac{18}{5} $$
d = 3.6 cm
Part b:
$$ s = \sqrt{ \frac{1}{5}\left[ (50-50)^2 + (56-50)^2 + (42-50)^2 + (53-50)^2 + (49-50)^2 \right] } = \sqrt{\frac{110}{5}} $$
s = 4.69 cm
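The rainfall example can be verified in Python. This sketch divides by n, as in the definition of s given in these notes; the variable names are illustrative.

```python
import math

rainfall = [50, 56, 42, 53, 49]  # annual rainfall, cm
n = len(rainfall)
mean = sum(rainfall) / n

# Mean absolute deviation about the mean
mad = sum(abs(x - mean) for x in rainfall) / n

# Standard deviation as the root mean square deviation (divisor n)
sd = math.sqrt(sum((x - mean) ** 2 for x in rainfall) / n)

print(mean, mad, round(sd, 2))
```

Python's `statistics.pstdev` uses the same divisor n, whereas `statistics.stdev` divides by n - 1 and would give a larger value.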
Measure of Asymmetry
Another important property of the histogram or frequency polygon is its
shape with respect to symmetry (on either side of the mode). The sample
coefficient of skewness measures the asymmetry of a set of data about its
mean. For a sample of observations, x1, x2, ..., xn, it is defined as
$$ g_1 = \frac{1}{n s^3} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^3 $$
Skewness
Value of g1    Skew                   Remarks
Positive       Positive skew          Skewed to the right
Zero           Normal distribution    Symmetrical
Negative       Negative skew          Skewed to the left
Skewness
A unitless indicator used in distribution analysis as a sign of asymmetry
and deviation from a normal distribution.
Interpretation:
Skewness > 0 - right-skewed distribution; most values are concentrated to
the left of the mean, with extreme values to the right.
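The coefficient of skewness can be computed with a short Python sketch. It assumes the form g1 = sum((xi - mean)^3) / (n s^3) with s computed using the divisor n, and it reuses the five annual rainfall values from the dispersion example purely as an illustration; the function name `sample_skewness` is hypothetical.

```python
import math

def sample_skewness(values):
    """g1 = sum((x - mean)**3) / (n * s**3), with s based on the divisor n."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return sum((x - mean) ** 3 for x in values) / (n * s ** 3)

g1 = sample_skewness([50, 56, 42, 53, 49])  # annual rainfall, cm
print(round(g1, 3))
```

The negative result indicates a skew to the left for this small sample, driven by the low value of 42 cm.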
Measure of Peakedness
The extent of the relative steepness of ascent in the vicinity and on either side
of the mode in a histogram or frequency polygon is said to be a measure of its
peakedness or tail weight. This is quantified by the dimensionless sample
coefficient of kurtosis, which is defined for a sample of observations,
x1, x2, ..., xn, by
$$ g_2 = \frac{1}{n s^4} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^4 $$
Kurtosis
Value of g2    Distribution                Peakedness
> 3            Leptokurtic distribution    Peaked
= 3            Mesokurtic distribution     Normal
< 3            Platykurtic distribution    Flat
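The coefficient of kurtosis can be sketched in Python in the same way. The example assumes the form g2 = sum((xi - mean)^4) / (n s^4) with s computed using the divisor n, and it reuses the five annual rainfall values from the dispersion example only as an illustration; the function name `sample_kurtosis` is hypothetical.

```python
import math

def sample_kurtosis(values):
    """g2 = sum((x - mean)**4) / (n * s**4), with s based on the divisor n."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return sum((x - mean) ** 4 for x in values) / (n * s ** 4)

g2 = sample_kurtosis([50, 56, 42, 53, 49])  # annual rainfall, cm
print(round(g2, 3))
```

The value is below 3, which the table classifies as platykurtic, although a sample of five values is far too small for a reliable conclusion.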
Kurtosis
A unitless indicator used in distribution analysis as a sign of flattening or
"peakedness" of a distribution.
Interpretation: