Commonly used methods are mean, median, mode, geometric mean etc.
$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$
Methods of Center Measurement
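The median is less affected by extreme values than the mean and is thus a
better measure than the mean for highly skewed distributions, e.g. family
income. For example, the mean of 20, 30, 40 and 990 is
(20 + 30 + 40 + 990)/4 = 270, yet most of the values lie between 20 and 40.
So the mean of 270 fails to give a realistic picture of the data, because it
is pulled up by the extreme value 990.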
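The effect of an extreme value on the mean can be checked with a short sketch. It assumes the data are 20, 30, 40 and 990 (an assumption that reproduces the mean of 270 and the extreme value 990 quoted above) and uses Python's standard `statistics` module.

```python
from statistics import mean, median

# Assumed data: chosen because it gives the mean of 270 quoted in the text.
values = [20, 30, 40, 990]

mean_value = mean(values)      # pulled upward by the extreme value 990
median_value = median(values)  # average of the two middle values, 30 and 40

print(f"mean = {mean_value}, median = {median_value}")
```

The median stays inside the 20-40 band where most of the values lie, while the mean does not.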
2.3. Measures of Dispersion (Variation):
The variation or dispersion in a set of values refers to how spread out the values are from
each other.
[Figure: distributions with the same center but smaller and larger variation]
Some measures of dispersion:
Range – Variance – Standard deviation
Coefficient of variation
Range:
Range is the difference between the largest (Max) and smallest (Min) values.
Range = Max − Min
Example:
Find the range for the sample values: 26, 25, 35, 27, 29, 29.
Solution:
Range = 35 − 25 = 10 (unit)
Note:
The range is a crude measure of variation, since it takes into account only two of
the values (the largest and the smallest).
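As a minimal sketch, the range example can be reproduced in a few lines of Python. The function name `value_range` is only an illustrative choice.

```python
# Sample values from the range example.
sample = [26, 25, 35, 27, 29, 29]

def value_range(values):
    """Return the difference between the largest and smallest values."""
    return max(values) - min(values)

sample_range = value_range(sample)
print(sample_range)  # 35 - 25 = 10
```

Only `max` and `min` enter the calculation, which is why the range ignores how the remaining values are spread.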
Variance
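The sample variance is
$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
Example: for the sample values 10, 21, 33, 53, 54 we have $n = 5$ and
$\bar{x} = \frac{\sum_{i=1}^{5} x_i}{5} = 34.2$, so
$S^2 = \frac{(10 - 34.2)^2 + (21 - 34.2)^2 + (33 - 34.2)^2 + (53 - 34.2)^2 + (54 - 34.2)^2}{5 - 1} = \frac{1506.8}{4} = 376.7 \ (\text{unit})^2$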
Another method:
xi      (xi − x̄) = (xi − 34.2)    (xi − x̄)² = (xi − 34.2)²
10      −24.2                     585.64
21      −13.2                     174.24
33      −1.2                      1.44
53      18.8                      353.44
54      19.8                      392.04
Total   ∑xi = 171    ∑(xi − x̄) = 0    ∑(xi − x̄)² = 1506.8
$\bar{x} = \frac{\sum_{i=1}^{5} x_i}{5} = \frac{171}{5} = 34.2$
$S^2 = \frac{1506.8}{4} = 376.7$
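The tabular calculation can be cross-checked with a short sketch, assuming the five sample values 10, 21, 33, 53 and 54 from the example. It computes the variance by hand with the n − 1 denominator and compares it with `statistics.variance`, which uses the same denominator.

```python
from statistics import mean, variance

data = [10, 21, 33, 53, 54]

x_bar = mean(data)                               # 171 / 5 = 34.2
deviations = [x - x_bar for x in data]           # x_i - x_bar, these sum to zero
squared_devs = [d ** 2 for d in deviations]      # (x_i - x_bar)^2
s2_manual = sum(squared_devs) / (len(data) - 1)  # divide by n - 1 for a sample

# The library function uses the same n - 1 denominator.
s2_library = variance(data)

print(f"S^2 (manual) = {s2_manual:.1f}, S^2 (library) = {s2_library:.1f}")
```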
$(x)^2$: $(4)^2 = 16$, $(8)^2 = 64$, $(24)^2 = 576$
Find the variance and standard deviation.
The math test scores of five students are: 92, 92, 92, 52 and 52.
1) Find the mean: (92+92+92+52+52)/5 = 76
2) Find the deviation from the mean:
92-76=16 92-76=16 92-76=16
52-76= -24 52-76= -24
3) Square the deviation from the mean:
(16)² = 256   (16)² = 256   (16)² = 256
(−24)² = 576   (−24)² = 576
4) Find the sum of the squares:
256+256+256+576+576= 1920
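The four steps above, plus the final division and square root, can be sketched as follows. One assumption is labeled in the code: dividing the sum of squares by n (rather than n − 1) is what reproduces the value 19.6 quoted later in the text, so both denominators are shown for comparison.

```python
from math import sqrt

scores = [92, 92, 92, 52, 52]                  # the five test scores
n = len(scores)

mean_score = sum(scores) / n                   # step 1: 76.0
deviations = [x - mean_score for x in scores]  # step 2: 16, 16, 16, -24, -24
squares = [d ** 2 for d in deviations]         # step 3: 256, 256, 256, 576, 576
sum_of_squares = sum(squares)                  # step 4: 1920.0

# Assumption: the n denominator matches the 19.6 quoted later in the text.
variance_n = sum_of_squares / n                # 384.0
std_dev_n = sqrt(variance_n)                   # about 19.6

# For comparison, the n - 1 (sample) denominator.
variance_n_minus_1 = sum_of_squares / (n - 1)  # 480.0
std_dev_n_minus_1 = sqrt(variance_n_minus_1)   # about 21.9
```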
Analyzing the data:
Class A: 92,88,80,68,52
Class B: 92,92,92,52,52
Class C: 77,76,76,76,75
Estimate the standard deviation for Class C.
a) Standard deviation will be less than 14.53.
b) Standard deviation will be greater than 19.6.
c) Standard deviation will be between 14.53
and 19.6
d) Cannot make an estimate of the standard
deviation.
Answer: A
The scores in class C have the same
mean of 76 as the other two classes.
However, the scores in Class C are all
much closer to the mean than the other
classes so the standard deviation will be
smaller than for the other classes.
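The comparison of the three classes can be checked numerically. This sketch assumes the population form of the standard deviation (`statistics.pstdev`, which divides by n), chosen because it reproduces the 14.53 and 19.6 in the answer options.

```python
from statistics import mean, pstdev

classes = {
    "A": [92, 88, 80, 68, 52],
    "B": [92, 92, 92, 52, 52],
    "C": [77, 76, 76, 76, 75],
}

# pstdev divides by n; this is an assumption made to match the quoted values.
summary = {name: (mean(scores), pstdev(scores)) for name, scores in classes.items()}

for name, (m, sd) in summary.items():
    print(f"Class {name}: mean = {m}, standard deviation = {sd:.2f}")
```

All three classes share the mean of 76, and Class C has by far the smallest standard deviation, in line with answer A.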
Summary:
$C.V. = \frac{S}{\bar{x}} \times 100\%$ (free of units, i.e. unitless)
· The relative variability in the 1st data set is larger than the relative variability in the 2nd
data set if $C.V._1 > C.V._2$ (and vice versa).
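A minimal sketch of the coefficient of variation follows. The helper name `coefficient_of_variation` and the second data set (the first one rescaled by a hypothetical factor of 1000, as in a change of units) are illustrative assumptions; the sample standard deviation with the n − 1 denominator is used.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Return (S / mean) * 100, using the sample (n - 1) standard deviation."""
    return stdev(values) / mean(values) * 100

first = [10, 21, 33, 53, 54]        # values from the variance example
second = [x * 1000 for x in first]  # hypothetical change of units

cv_first = coefficient_of_variation(first)
cv_second = coefficient_of_variation(second)

print(f"C.V. (first) = {cv_first:.2f}%, C.V. (second) = {cv_second:.2f}%")
```

Because S and the mean scale by the same factor, the two values agree, which illustrates why the C.V. is unit free and can be compared across data sets.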
Sample Data                   Sample mean    Sample st.dev.    Sample Variance
x1, x2, …, xn                 x̄              S                 S²
ax1, ax2, …, axn              ax̄             |a|S              a²S²
x1 ± b, x2 ± b, …, xn ± b     x̄ ± b          S                 S²
ax1 ± b, ax2 ± b, …, axn ± b  ax̄ ± b         |a|S              a²S²
Absolute value:
$|a| = \begin{cases} a & \text{if } a \ge 0 \\ -a & \text{if } a < 0 \end{cases}$
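The effect of a linear change ax + b on the mean, standard deviation and variance can be verified numerically. In this sketch the constants `a = -2` and `b = 5` are hypothetical, with a negative `a` chosen to show why the absolute value is needed for the standard deviation.

```python
from statistics import mean, stdev, variance

x = [26, 25, 35, 27, 29, 29]  # sample values reused from the range example
a, b = -2, 5                  # hypothetical constants; a < 0 on purpose

y = [a * xi + b for xi in x]  # transformed sample a*x_i + b

# Each flag compares the transformed statistic with the rule for ax + b.
mean_ok = abs(mean(y) - (a * mean(x) + b)) < 1e-9
sd_ok = abs(stdev(y) - abs(a) * stdev(x)) < 1e-9
var_ok = abs(variance(y) - a ** 2 * variance(x)) < 1e-9

print(mean_ok, sd_ok, var_ok)
```

The added constant b shifts the mean but has no effect on the standard deviation or the variance.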
Correlation
Wt. (kg)      67   69   85   83   74   81   97   92   114  85
SBP (mmHg)    120  125  140  160  130  180  150  140  200  130
[Figure: scatter diagram of SBP (mmHg) against Wt (kg)]
[Figure: scatter diagram of height in cm against age in weeks]
Negative relationship
[Figure: scatter diagram of reliability against age of car]
No relation
Correlation Coefficient
If r = 1 (or r = −1), the correlation is perfect.
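How to compute the simple correlation coefficient (r):
$r = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)\left(\sum y^2 - \frac{(\sum y)^2}{n}\right)}}$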
Example:
A sample of 6 children was selected, data about their
age in years and weight in kilograms was recorded as
shown in the following table . It is required to find the
correlation between age and weight.
Serial n.   Age (years) (x)   Weight (Kg) (y)   xy          x²          y²
1           7                 12                84          49          144
2           6                 8                 48          36          64
3           8                 12                96          64          144
4           5                 10                50          25          100
5           6                 11                66          36          121
6           9                 13                117         81          169
Total       ∑x = 41           ∑y = 66           ∑xy = 461   ∑x² = 291   ∑y² = 742
$r = \frac{461 - \frac{41 \times 66}{6}}{\sqrt{\left(291 - \frac{(41)^2}{6}\right)\left(742 - \frac{(66)^2}{6}\right)}}$
r = 0.759
strong direct correlation
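The hand calculation can be reproduced with a short sketch of the sums formula. The function name `pearson_r` is an illustrative choice, and the data are the six age and weight pairs from the table.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient from the computational (sums) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
    return numerator / denominator

age = [7, 6, 8, 5, 6, 9]          # years
weight = [12, 8, 12, 10, 11, 13]  # kg

r = pearson_r(age, weight)
print(f"r = {r:.4f}")
```

The unrounded result is about 0.7596, which agrees with the 0.759 reported in the text up to rounding.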
EXAMPLE: Relationship between Anxiety and Test
Scores
Anxiety (X)   Test score (Y)   X²          Y²          XY
10            2                100         4           20
8             3                64          9           24
2             9                4           81          18
1             7                1           49          7
5             6                25          36          30
6             5                36          25          30
∑X = 32       ∑Y = 32          ∑X² = 230   ∑Y² = 204   ∑XY = 129
Calculating Correlation Coefficient
r = - 0.94
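As a check on this value, the table totals can be substituted into the sums formula for r with n = 6. The intermediate figures below are rounded to two decimals and are worked out here, not taken from the original slide.

```latex
r = \frac{129 - \frac{32 \times 32}{6}}
         {\sqrt{\left(230 - \frac{(32)^2}{6}\right)\left(204 - \frac{(32)^2}{6}\right)}}
  = \frac{129 - 170.67}{\sqrt{59.33 \times 33.33}}
  = \frac{-41.67}{44.47}
  \approx -0.94
```

The negative sign indicates that higher anxiety goes with lower test scores in this sample.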
$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$
$\sum d_i^2 = 64$
$r_s = 1 - \frac{6 \times 64}{7(48)} = 1 - \frac{384}{336} \approx -0.1$
Comment:
There is an indirect weak correlation
between level of education and income.
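The rank correlation above can be reproduced from the two quantities the example provides, the sum of squared rank differences (64) and the number of pairs (7, inferred from the 7(48) in the denominator). The function name `spearman_from_d2` is hypothetical, and the formula assumes there are no tied ranks.

```python
def spearman_from_d2(sum_d2, n):
    """Spearman rank correlation from the sum of squared rank differences."""
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Values from the example: sum of d_i^2 = 64 and n = 7 pairs.
r_s = spearman_from_d2(64, 7)
print(f"r_s = {r_s:.2f}")
```

The unrounded value is about −0.14, which the text reports as −0.1: a weak indirect correlation.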
Regression Analyses
[Figure: scatter diagram with Wt (kg) on the horizontal axis]
By using the least squares method (a procedure
that minimizes the vertical deviations of plotted
points surrounding a straight line) we are
able to construct a best fitting straight line to the
scatter diagram points and then formulate a
regression equation in the form of:
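$\hat{y} = a + bX$
Equivalently, $\hat{y} = \bar{y} + b(x - \bar{x})$, where
$b = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}$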
Regression Equation
The regression equation describes the regression line mathematically in terms of its:
– Intercept
– Slope
[Figure: SBP (mmHg) against Wt (kg)]
Linear Equations
[Figure: graph of the line ŷ = a + bX on X and Y axes]
b = Slope = (Change in Y) / (Change in X)
a = Y-intercept
Hours studying and grades
Regressing grades on hours
[Figure: linear regression of final grade in course on hours of study]
Final grade in course = 59.95 + 3.17 * study
R-Square = 0.88
$b = \frac{461 - \frac{41 \times 66}{6}}{291 - \frac{(41)^2}{6}} = 0.92$
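The slope calculation can be sketched in code using the age and weight data of the six children. The intercept is not given in the text; here it is derived from the standard relation a = ȳ − b x̄, so treat that value as a computed assumption. The function name `least_squares_line` is illustrative.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for y_hat = a + b*x using the sums formula for b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    a = sum_y / n - b * sum_x / n  # a = y_bar - b * x_bar
    return a, b

age = [7, 6, 8, 5, 6, 9]          # years (x)
weight = [12, 8, 12, 10, 11, 13]  # kg (y)

a, b = least_squares_line(age, weight)
print(f"y_hat = {a:.2f} + {b:.2f} x")
```

The slope comes out at about 0.92, matching the hand calculation.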
Regression equation
$\sum x^2 = 41678$, $n = 20$
ŷ = 112.13 + 0.4547 x
For age 25:
B.P. = 112.13 + 0.4547 × 25 = 123.4975 ≈ 123.5 mmHg
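Once the regression equation is known, a prediction is a single evaluation of the line. The sketch below uses the intercept and slope quoted above; the function name `predict_bp` is a hypothetical choice.

```python
def predict_bp(age_years, intercept=112.13, slope=0.4547):
    """Predicted blood pressure (mmHg) from the line y_hat = a + b*x."""
    return intercept + slope * age_years

bp_at_25 = predict_bp(25)
print(f"{bp_at_25:.1f} mmHg")
```

Predictions of this kind are only meaningful for ages within the range of the data used to fit the line.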