Chapter 1
1. Descriptive Statistics
1 Descriptive Statistics
1.1. Note on descriptive and inferential statistics
1.2. Types of data and some characteristics
1 LEARNING OBJECTIVES
measurements.
Calculate and interpret quartiles, deciles and percentiles.
them.
Create different types of charts that describe data sets.
C. Scales of Measurement
and Percentiles
Quartiles (cont’d)
The serial number (position) of the 3rd quartile (Q3) is computed by
$$\frac{3}{4}(N + 1)\text{th} = \frac{3 \times 21}{4} = \frac{63}{4} = 15.75$$
Thus, Q3 is located at the 15.75th position.
The 15th observation is 18, and the 16th observation is 19.
Thus Q3 lies 0.75 of the way from the 15th value (18) to the 16th value (19), so Q3 = 18 + 1 × 0.75 = 18.75.
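The (N + 1) position rule used above can be sketched in Python. This is a minimal sketch of that one convention, not the only way quartiles are defined. The function name `value_at_position` is hypothetical, and the 20-value list is an assumption: it is taken to be the sorted data set behind the example, whose 15th and 16th observations are 18 and 19.

```python
def value_at_position(sorted_values, fraction):
    """Value at the fraction * (N + 1)th position, interpolating linearly."""
    n = len(sorted_values)
    position = fraction * (n + 1)   # 1-based position, may be fractional
    whole = int(position)           # observation just below the position
    part = position - whole         # fractional distance to the next observation
    if whole < 1:                   # position falls before the first value
        return sorted_values[0]
    if whole >= n:                  # position falls at or after the last value
        return sorted_values[-1]
    below = sorted_values[whole - 1]
    above = sorted_values[whole]
    return below + part * (above - below)


# Assumed data: 20 sorted observations (15th is 18, 16th is 19).
data = [6, 9, 10, 12, 13, 14, 14, 15, 16, 16,
        16, 17, 17, 18, 18, 19, 20, 21, 22, 24]

q3 = value_at_position(data, 3 / 4)
print(q3)  # 18.75
```

The same function would give deciles and percentiles by passing K/10 or K/100 as `fraction`.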
Deciles (cont’d)
Position of the Kth decile: $\frac{K}{10}(N + 1)$th
Percentiles (cont’d)
Where,
L = lower class boundary of the Q1 class
N = number of observations in the data (total frequency)
$(\sum f)_1$ = sum of frequencies of all classes lower than the Q1 class
$f_{Q1}$ = frequency of the Q1 class
c = size of the class interval
$$Q_1 = 59.5 + \frac{\frac{1}{4}(120 + 1) - 15}{21} \times 10 \approx 66.8$$
Interpretation:
About 25% of X-factory's employees earn a monthly salary of 66.8 dollars or less.
$$P_p = L + \frac{p(N + 1) - (\sum f)_1}{f_p} \times c$$
Calculate the 60th percentile of the distribution of the employees' monthly salaries and interpret the result.
$$P_{60} = 69.5 + \frac{0.60(120 + 1) - 36}{43} \times 10 \approx 78.01$$
Interpretation:
About 60% of X-factory's employees earn a monthly salary of 78.01 dollars or less.
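The grouped-data percentile calculations for the salary distribution can be checked with a short Python sketch. The function name `grouped_percentile` and its parameter names are hypothetical. The class boundaries, cumulative frequencies and class frequencies are the ones quoted in the two worked examples, since the full frequency table is not reproduced here.

```python
def grouped_percentile(p, n, lower_boundary, cum_freq_below, class_freq, class_width):
    """Percentile of grouped data: L + (p(N + 1) - (sum f)_1) / f_p * c."""
    return lower_boundary + (p * (n + 1) - cum_freq_below) / class_freq * class_width


# First quartile (p = 0.25): L = 59.5, (sum f)_1 = 15, f = 21, c = 10, N = 120
q1 = grouped_percentile(0.25, 120, 59.5, 15, 21, 10)

# 60th percentile (p = 0.60): L = 69.5, (sum f)_1 = 36, f = 43, c = 10, N = 120
p60 = grouped_percentile(0.60, 120, 69.5, 36, 43, 10)

print(round(q1, 1), round(p60, 2))  # 66.8 78.01
```

The sketch takes the percentile class as given. Locating it means finding the first class whose cumulative frequency reaches p(N + 1).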
Median Range
Interquartile range
Mode
Variance
Mean
Standard Deviation
Coefficient of variation (CV)
Mean Average
$$\text{Median} = L_1 + \frac{\frac{N}{2} - (\sum f)_1}{f_{\text{median}}} \times c$$
Where,
$L_1$ = lower class boundary of the median class
N = number of observations in the data (total frequency)
$(\sum f)_1$ = sum of frequencies of all classes lower than the median class
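A minimal Python sketch of the grouped-data median formula follows. The function name `grouped_median` and the four-class frequency table are hypothetical illustrations, not data from these slides.

```python
def grouped_median(lower_boundaries, frequencies, class_width):
    """Median of grouped data: L1 + (N/2 - (sum f)_1) / f_median * c."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("frequency table is empty")
    half = n / 2
    cum_below = 0  # sum of frequencies of all classes below the current one
    for lower, freq in zip(lower_boundaries, frequencies):
        if cum_below + freq >= half:  # first class whose cumulative frequency reaches N/2
            return lower + (half - cum_below) / freq * class_width
        cum_below += freq
    raise ValueError("median class not found")


# Hypothetical table: lower class boundaries and frequencies, class width 10
boundaries = [9.5, 19.5, 29.5, 39.5]
freqs = [5, 10, 20, 5]

median = grouped_median(boundaries, freqs, 10)
print(median)  # 32.0
```

Here N/2 = 20 falls in the third class, so the median is 29.5 + (20 - 15)/20 × 10 = 32.0.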
Mode
$$\text{Mode} = L_1 + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times c$$
Where,
$L_1$ = lower class boundary of the modal class (class containing the mode)
$\Delta_1$ = excess of modal frequency over frequency of the next lower class
$\Delta_2$ = excess of modal frequency over frequency of the next higher class
c = size of the modal class interval
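The grouped-data mode formula can be sketched in Python as below. The function name `grouped_mode` and the example table are hypothetical. Treating a missing neighbouring class as frequency 0 is an assumption of this sketch, not something the slides state.

```python
def grouped_mode(lower_boundaries, frequencies, class_width):
    """Mode of grouped data: L1 + d1 / (d1 + d2) * c."""
    # Index of the modal class (first class with the highest frequency)
    i = max(range(len(frequencies)), key=frequencies.__getitem__)
    f_mode = frequencies[i]
    # Assumption: a missing neighbour (modal class at either end) counts as 0
    f_lower = frequencies[i - 1] if i > 0 else 0
    f_higher = frequencies[i + 1] if i < len(frequencies) - 1 else 0
    d1 = f_mode - f_lower    # excess over the next lower class
    d2 = f_mode - f_higher   # excess over the next higher class
    return lower_boundaries[i] + d1 * class_width / (d1 + d2)


# Hypothetical table: lower class boundaries and frequencies, class width 10
boundaries = [9.5, 19.5, 29.5, 39.5]
freqs = [5, 10, 20, 5]

mode = grouped_mode(boundaries, freqs, 10)
print(mode)  # 33.5
```

With these numbers the excesses are 20 - 10 = 10 and 20 - 5 = 15, so the mode is 29.5 + 10/25 × 10 = 33.5.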
Means
Geometric mean
Harmonic mean
Population mean: $$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$
Sample mean: $$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
$$\bar{x}_w = \frac{w_1 x_1 + w_2 x_2 + \dots + w_k x_k}{w_1 + w_2 + \dots + w_k} = \frac{\sum w_i x_i}{\sum w_i}$$
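A weighted mean divides the sum of the weighted values by the sum of the weights. A minimal Python sketch, in which the function name `weighted_mean` and the scores and weights are hypothetical:

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Hypothetical example: three scores with weights 2, 3 and 5
scores = [80, 90, 70]
weights = [2, 3, 5]

wm = weighted_mean(scores, weights)
print(wm)  # 78.0
```

When all weights are equal, the result reduces to the ordinary arithmetic mean.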
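Population variance and standard deviation:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \frac{\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}}{N}$$
$$\sigma = \sqrt{\sigma^2}$$
Sample variance and standard deviation:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}$$
$$s = \sqrt{s^2}$$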
x        x − x̄     (x − x̄)²     x²
6        -9.85     97.0225      36
9        -6.85     46.9225      81
10       -5.85     34.2225      100
12       -3.85     14.8225      144
13       -2.85     8.1225       169
14       -1.85     3.4225       196
14       -1.85     3.4225       196
15       -0.85     0.7225       225
16       0.15      0.0225       256
16       0.15      0.0225       256
16       0.15      0.0225       256
17       1.15      1.3225       289
17       1.15      1.3225       289
18       2.15      4.6225       324
18       2.15      4.6225       324
19       3.15      9.9225       361
20       4.15      17.2225      400
21       5.15      26.5225      441
22       6.15      37.8225      484
24       8.15      66.4225      576
Total 317   0      378.5500     5403

Using the definition:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{378.55}{20 - 1} = \frac{378.55}{19} = 19.923684$$
Using the computational form:
$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1} = \frac{5403 - \frac{317^2}{20}}{20 - 1} = \frac{5403 - \frac{100489}{20}}{19} = \frac{5403 - 5024.45}{19} = \frac{378.55}{19} = 19.923684$$
$$s = \sqrt{s^2} = \sqrt{19.923684} = 4.46$$
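The worked example can be reproduced with a few lines of Python. This is a sketch that computes the sample variance both ways (from the deviations and from the sums of x and x²) for the 20 observations in the table. The variable names are illustrative.

```python
import math

# The 20 observations from the worked example
data = [6, 9, 10, 12, 13, 14, 14, 15, 16, 16,
        16, 17, 17, 18, 18, 19, 20, 21, 22, 24]

n = len(data)
mean = sum(data) / n  # 317 / 20 = 15.85

# Definition: sum of squared deviations divided by (n - 1)
s2_definition = sum((x - mean) ** 2 for x in data) / (n - 1)

# Computational form: (sum of x^2 - (sum of x)^2 / n) divided by (n - 1)
s2_shortcut = (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)

s = math.sqrt(s2_definition)

print(round(s2_definition, 6), round(s2_shortcut, 6), round(s, 2))
# 19.923684 19.923684 4.46
```

Both forms agree up to floating-point rounding. The computational form avoids calculating each deviation but can lose precision when the values are large relative to their spread.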
Standard Deviation - Grouped frequencies
$$s = \sqrt{\frac{\sum f_i m_i^2}{\sum f_i} - \bar{x}^2}$$
Where,
$m_i$ : is the class midpoint (class mark)
$f_i$ : is the class frequency
$\bar{x}$ : is the sample mean
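The grouped-frequency formula can be sketched in Python as follows. The function name `grouped_std` is hypothetical. The illustration assumes the midpoints 50, 150, ..., 550 of the customer spending classes that appear in the frequency distribution example of this chapter. As in the formula above, the sum is divided by the total frequency rather than by n − 1.

```python
import math


def grouped_std(midpoints, frequencies):
    """Standard deviation of grouped data: sqrt(sum(f*m^2)/sum(f) - mean^2)."""
    total = sum(frequencies)
    mean = sum(f * m for m, f in zip(midpoints, frequencies)) / total
    mean_of_squares = sum(f * m * m for m, f in zip(midpoints, frequencies)) / total
    return math.sqrt(mean_of_squares - mean ** 2)


# Assumed illustration: class midpoints and frequencies of the spending classes
midpoints = [50, 150, 250, 350, 450, 550]
frequencies = [30, 38, 50, 31, 22, 13]

s_grouped = grouped_std(midpoints, frequencies)
print(round(s_grouped, 2))
```

Because every observation in a class is represented by its midpoint, the result is an approximation of the standard deviation of the raw data.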
Coefficient of variation
Line graphs:
Ogives
Time plot
Pie charts
Bar graphs
Skewness and Kurtosis
Frequency distribution
3. Determine the class interval for the 10 classes.
A common practice for determining class limits is:
lcl = SRD − 1
ucl = (lcl − 1) + Cw
Where,
SRD is the smallest number in the raw data,
lcl is the lower class limit,
ucl is the upper class limit,
Cw is the class width.
949 – 1342
The next class limits are
Table 2.1. Raw data of agricultural production (kg ha-1 yr-1) (an example)
Plot code Yield Plot code Yield Plot code Yield Plot code Yield
(kg ha-1) (kg ha-1) (kg ha-1) (kg ha-1)
Assignment 1.
Simple frequency
Cumulative frequency
x                     f(x)
Spending Class ($)    Frequency (number of customers)
0 - 100               30
100 - 200             38
200 - 300             50
300 - 400             31
400 - 500             22
500 - 600             13
Total                 184
x                     f(x)                               f(x)/n
Spending Class ($)    Frequency (number of customers)    Relative Frequency
0 - 100               30                                 0.163
100 - 200             38                                 0.207
200 - 300             50                                 0.272
300 - 400             31                                 0.168
400 - 500             22                                 0.120
500 - 600             13                                 0.070
Total                 184                                1.000
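The relative frequency column is each class frequency divided by the total number of customers. A minimal Python sketch using the table's figures, with illustrative variable names:

```python
classes = ["0 - 100", "100 - 200", "200 - 300",
           "300 - 400", "400 - 500", "500 - 600"]
frequencies = [30, 38, 50, 31, 22, 13]

n = sum(frequencies)                      # total number of customers
relative = [f / n for f in frequencies]   # f(x) / n for each class

for label, f, r in zip(classes, frequencies, relative):
    print(f"{label:<10} {f:>4} {r:.3f}")
print(f"{'Total':<10} {n:>4} {sum(relative):.3f}")
```

Note that 13/184 is about 0.0707, which rounds to 0.071. The table shows 0.070, presumably so that the rounded column adds up to exactly 1.000.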
[Figure: "less than" cumulative frequency chart. Horizontal axis: Spending ($), at < 0, < 100, < 200, < 300, < 400, < 500, < 600. Vertical axis: Customers (number), 0 to 200.]
[Figure: "more than" cumulative frequency chart. Horizontal axis: Spending ($), at > 0, > 100, > 200, > 300, > 400, > 500, > 600. Vertical axis: Customers (number), 0 to 200.]
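The values plotted in the two cumulative frequency charts can be derived from the class frequencies. The sketch below assumes the customer spending frequencies from the frequency table (30, 38, 50, 31, 22, 13). The variable names are illustrative.

```python
from itertools import accumulate

upper_limits = [0, 100, 200, 300, 400, 500, 600]
frequencies = [30, 38, 50, 31, 22, 13]   # assumed from the spending table
total = sum(frequencies)

# "Less than" counts: customers spending below each limit
less_than = [0] + list(accumulate(frequencies))

# "More than" counts: customers spending above each limit
more_than = [total - c for c in less_than]

for limit, lt, mt in zip(upper_limits, less_than, more_than):
    print(f"< {limit:<4} {lt:>4}    > {limit:<4} {mt:>4}")
```

The "less than" series rises from 0 to 184, while the "more than" series falls from 184 to 0.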
Assignment 2
Histogram
Histogram Example
Frequency Histogram
Skewness
Measure of the asymmetry (lack of symmetry) of a frequency distribution
Skewed to left
Symmetric or unskewed
Skewed to right
Kurtosis
Measure of flatness or peakedness of a frequency distribution
Platykurtic (relatively flat)
Mesokurtic (normal)
Leptokurtic (relatively peaked)
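The slides describe skewness and kurtosis qualitatively. One common way to quantify them, assumed here because no formula is given in the text, is through moment coefficients: skewness as m3 / m2^1.5 and excess kurtosis as m4 / m2² − 3, where mk is the kth central moment. The function names and the small data sets are hypothetical.

```python
def central_moment(values, k):
    """k-th central moment: mean of (x - mean)^k."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** k for x in values) / n


def skewness(values):
    """Moment coefficient of skewness: negative = left, 0 = symmetric, positive = right."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Excess kurtosis: negative = platykurtic, 0 = mesokurtic, positive = leptokurtic."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]        # hypothetical symmetric data
right_skewed = [1, 1, 1, 2, 10]    # hypothetical data with a long right tail

print(skewness(symmetric))         # 0.0
print(skewness(right_skewed) > 0)  # True
print(excess_kurtosis(symmetric) < 0)  # True (flatter than normal)
```

Statistical packages often apply sample-size corrections to these coefficients, so their output may differ slightly from this sketch.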
Skewness
Skewed to left
[Figure: frequency histogram of a distribution skewed to the left. Horizontal axis: Monthly expenses (Dollar), 0 to 700. Vertical axis: Frequency, 0 to 30.]
Skewness
Symmetric
Skewness
Skewed to right
Kurtosis