
STATISTICAL ORGANIZATION OF SCORES

Example: Consider the raw scores of 25 students on a
50-item test in .

35 47 44 42 38
49 43 46 48 48
50 34 48 46 41
40 44 36 47 47
47 49 46 47 46
 A.

Classes f
11 – 22 3
23 – 34 5
35 – 46 11
47 – 58 19
59 – 70 14
71 – 82 6
83 – 94 2
______________
n = 60
B.

Courses          No. of Students
BSED Math        7
BSED Fil         14
BSED English     12
BSED Science     16
AB Econ.         11
_____________
n = 60
DEFINITION OF TERMS
 QUANTITATIVE FREQUENCY DISTRIBUTION
TABLE – when data are tabulated based on
numerical classes or intervals.
 QUALITATIVE FREQUENCY DISTRIBUTION
TABLE – when data are tabulated based on
description.
 CLASS LIMITS – are the lowest and the highest
values that can go in each class.
 LOWER CLASS LIMIT – is the lowest value that
can be entered in a class.
 UPPER CLASS LIMIT – is the highest value that
can be entered in a class.
 CLASS BOUNDARY – is considered the “true
limit”. It is the value midway between the upper limit of a
certain class and the lower limit of the next
class. If the class limits are whole
numbers, the class boundaries can be obtained
by adding 0.5 to the upper limit and
subtracting 0.5 from the lower limit.
 CLASS WIDTH OR CLASS SIZE – is represented
by c or i. It can be obtained in either of two ways:
a. Getting the difference between the boundaries of a
certain class.
b. Getting the difference between two successive
lower limits or two successive upper limits.
 FREQUENCY – denoted by f, is the
number of values that fall in a certain class.
 CLASS MARK OR MIDPOINT – is a value that acts
as the representative of a certain class. It can be
obtained by using the relation
X = ( U1 + L1 ) / 2
where U1 is the upper limit and L1 is the lower limit of the class.
CONSTRUCTION OF A FREQUENCY
DISTRIBUTION

1. Get the lowest and the highest value in the distribution.
2. Get the RANGE (R), the difference between the highest and the lowest value.
3. Determine the number of classes (k). In some instances, the
number of classes can be approximated by the relation
k = 1 + 3.3 log n
where n is the number of observations and log is the common (base-10) logarithm.
4. Determine the size of the class interval:
c = R / k
5. Construct the classes.
6. Determine the frequency “f” of each class.
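The six steps above can be sketched in Python using the 25 raw scores from the first example. This is only a minimal sketch under stated assumptions: rounding k to the nearest whole number, rounding the class size up, and starting the first class at the lowest score are choices made here, not rules given in the text, and the function name frequency_distribution is hypothetical.

```python
import math

# Raw scores of the 25 students from the first example.
scores = [35, 47, 44, 42, 38, 49, 43, 46, 48, 48, 50, 34, 48, 46, 41,
          40, 44, 36, 47, 47, 47, 49, 46, 47, 46]


def frequency_distribution(data):
    """Return a list of (lower limit, upper limit, frequency) tuples."""
    low, high = min(data), max(data)               # step 1
    data_range = high - low                        # step 2
    k = round(1 + 3.3 * math.log10(len(data)))     # step 3 (rounding is an assumption)
    width = math.ceil(data_range / k)              # step 4 (rounded up, an assumption)
    table = []
    lower = low
    while lower <= high:                           # step 5: build the classes
        upper = lower + width - 1
        f = sum(lower <= x <= upper for x in data)  # step 6: tally
        table.append((lower, upper, f))
        lower += width
    return table


table = frequency_distribution(scores)
for lower, upper, f in table:
    print(f"{lower} - {upper}: {f}")
```

With these choices the 25 scores fall into six classes of size 3, starting at 34 – 36.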
DERIVED FREQUENCY
DISTRIBUTION
a. Relative frequency distribution
b. Cumulative frequency distribution

RELATIVE FREQUENCY
DISTRIBUTION

 Can be obtained by dividing the class frequency
by the sample size and multiplying the result by
100%.
 Given by the relation

%f = ( f / n ) × 100
Respondents of the Study
SM
Subj n % R

A 35 23 3

B 45 29 2

C 75 48 1

Total 155 100
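As a quick check of the relation %f = ( f / n ) × 100, the percentages in the table above can be reproduced with a few lines of Python. This is an illustrative sketch; the function name relative_frequencies is hypothetical, and rounding to whole percents is assumed because the table shows whole numbers.

```python
def relative_frequencies(freqs):
    """Return each frequency as a percentage of the total."""
    n = sum(freqs)
    return [f / n * 100 for f in freqs]


# Counts taken from the "Respondents of the Study" table.
counts = {"A": 35, "B": 45, "C": 75}
percents = relative_frequencies(list(counts.values()))
rounded = [round(p) for p in percents]

for subject, pct in zip(counts, rounded):
    print(subject, pct)
```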


CUMULATIVE FREQUENCY
DISTRIBUTION
 Can be obtained by successively adding the
class frequencies.
 Gives the “partial sums” of the frequencies
when the data are grouped into classes.
 Answers questions such as:
a. Number of students who got a passing
mark.
b. Number of students who got a failing
grade.
TWO TYPES OF CUMULATIVE
FREQUENCY DISTRIBUTION
A. Less than cumulative frequency distribution (<cumf) –
for each class, the number of values that fall
below the upper class boundary of that class.
B. Greater than cumulative frequency distribution (>cumf) –
for each class, the number of values that fall
above the lower class boundary of that class.
Example:

Classes    f     X      Class boundary   %f      <cumf   >cumf
11 – 22    3     16.5   10.5 – 22.5      5.00    3       60
23 – 34    5     28.5   22.5 – 34.5      8.33    8       57
35 – 46    11    40.5   34.5 – 46.5      18.33   19      52
47 – 58    19    52.5   46.5 – 58.5      31.67   38      41
59 – 70    14    64.5   58.5 – 70.5      23.33   52      22
71 – 82    6     76.5   70.5 – 82.5      10.00   58      8
83 – 94    2     88.5   82.5 – 94.5      3.33    60      2
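The derived columns of the example table can be recomputed from the class limits and frequencies alone. The sketch below is one possible way to do it in Python; the variable names are illustrative, and the ±0.5 adjustment assumes whole-number class limits as in the example.

```python
from itertools import accumulate

classes = [(11, 22), (23, 34), (35, 46), (47, 58), (59, 70), (71, 82), (83, 94)]
freqs = [3, 5, 11, 19, 14, 6, 2]

# Class mark: midpoint of the lower and upper limits.
class_marks = [(lo + hi) / 2 for lo, hi in classes]

# Class boundaries: limits widened by 0.5 (whole-number limits assumed).
boundaries = [(lo - 0.5, hi + 0.5) for lo, hi in classes]

# Less than cumulative frequency: running total from the first class down.
less_than_cumf = list(accumulate(freqs))

# Greater than cumulative frequency: running total from the last class up.
greater_than_cumf = list(accumulate(reversed(freqs)))[::-1]

print(less_than_cumf)
print(greater_than_cumf)
```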


WORKSHOP
Example:
A researcher assumed that age is one of the
factors affecting the level of development-
orientedness of teachers. He was able to gather
the ages of 75 teachers shown below.
49,54,53,48,37,41,33,45,44,46,54,41
53,46,46,48,35,46,42,56,30,39,48,52
45,48,31,55,43,25,44,44,72,31,54,32
48,57,44,65,26,43,50,46,37,51,50,49
38,38,52,48,66,39,48,59,59,47,58,63
56,38,46,53,53,46,43,48,54,45,44,61
49,33,39
I. Construct a frequency distribution table
with 8 classes. Also include the following:
a. Class mark
b. Class boundary
c. Relative frequency distribution table
d. Greater than cumulative frequency
distribution table.
e. Less than cumulative frequency
distribution table
II. Determine
a. Class width
b. Lower class boundary of the 3rd class
c. Lowest lower limit
d. No. of teachers younger than 36.5.
e. No. of teachers older than 54.5.
f. No. of teachers younger than 66.5.
g. Upper limit of the 6th class.
h. No. of teachers younger than 72.5.
i. Upper boundary of the 6th class.
j. No. of teachers older than 60.5.
MEASURES OF CENTRAL
TENDENCY
 a single figure which is representative of the
whole set of data
 when the data are arranged according to magnitude, it tends
to lie centrally within the set
COMMON MEASURES OF CENTRAL
TENDENCY
A. MODE
 is defined as the value that appears
most frequently
 a distribution may or may not have a mode
Ex.: Set A : 15, 15, 16, 18, 21 (mode = 15)
Set B : 15, 16, 17, 20, 19 (no mode)
B. MEDIAN

 Is the value at the middle of a distribution
when the distribution is arranged according to
magnitude.
 When the total number of items is odd, the
median is the middle item. When the total
number is even, there are two middle
values.
 In that case, get the median by adding the two middle items
and dividing the sum by 2.

Ex.: A : 15, 20, 18, 20, 17, 21, 14
B : 15, 20, 18, 20, 17, 21
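The mode and median examples can be checked with a short Python sketch. The helper modes is hypothetical and follows the convention used here that a set in which every value appears only once has no mode; statistics.median is the standard-library function.

```python
from collections import Counter
from statistics import median


def modes(values):
    """Return the most frequent value(s), or [] if no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value appears once: no mode
    return [v for v, c in counts.items() if c == highest]


# Mode examples
set_a = [15, 15, 16, 18, 21]
set_b = [15, 16, 17, 20, 19]
print(modes(set_a), modes(set_b))

# Median examples: median() sorts the data before picking the middle.
median_a = median([15, 20, 18, 20, 17, 21, 14])  # odd count: middle item
median_b = median([15, 20, 18, 20, 17, 21])      # even count: mean of two middle items
print(median_a, median_b)
```

Arranged by magnitude, A is 14, 15, 17, 18, 20, 20, 21, so its median is 18; B is 15, 17, 18, 20, 20, 21, so its median is ( 18 + 20 ) / 2 = 19.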
C. MEAN
 the sum of all the items divided by the
total number of items

 Ex.: Consider the prices of certain books:
Php 110, Php 115, Php 118, Php 120, Php 124

Mean = ( 110 + 115 + 118 + 120 + 124 ) / 5
= 587 / 5
= Php 117.40
WEIGHTED MEAN
 used when some values are given more importance than others; it
takes into consideration the proper weights of
the scores according to their relative importance
Ex.: If the final examination in a class in statistics is
given the weight 2, the average of quizzes the
weight 3, and a project the weight 1, and a student got the grades
90, 88, and 87 respectively, what would be the mean
grade of the student?
 Mean = [ 2 ( 90 ) + 3 ( 88 ) + 1 ( 87 ) ] / 6 = 531 / 6 = 88.5

 If the grades are unweighted, the student’s mean
grade is
Mean = ( 90 + 88 + 87 ) / 3 = 88.33
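The three means worked out above can be reproduced in Python. This is a small illustrative sketch; the function names mean and weighted_mean are hypothetical helpers written for this example.

```python
def mean(values):
    """Sum of all the items divided by the number of items."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


book_prices = [110, 115, 118, 120, 124]
grades = [90, 88, 87]   # final exam, quizzes, project
weights = [2, 3, 1]

book_mean = mean(book_prices)
weighted = weighted_mean(grades, weights)
unweighted = mean(grades)
print(book_mean, weighted, round(unweighted, 2))
```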
COMPUTATION OF THE MEAN
FROM GROUPED DATA
 Data which are arranged in a frequency
distribution are called grouped data.
 When the number of items is too large, it is
best to compute the mean by first presenting the
data in a frequency distribution table.
A. LONG METHOD: Mean = ∑fx / n

Classes    f     X      fx
11 – 22    3     16.5   49.5
23 – 34    5     28.5   142.5
35 – 46    11    40.5   445.5
47 – 58    19    52.5   997.5
59 – 70    14    64.5   903.0
71 – 82    6     76.5   459.0
83 – 94    2     88.5   177.0
_______ _________
n = 60     ∑fx = 3,174

Mean = 3,174 / 60
= 52.9
B. CODED DEVIATION METHOD: Mean = Assumed mean + ( ∑fd / n ) i

Classes    f     d     fd
11 – 22    3     -3    -9
23 – 34    5     -2    -10
35 – 46    11    -1    -11
47 – 58    19    0     0
59 – 70    14    1     14
71 – 82    6     2     12
83 – 94    2     3     6
___________ ______________
n = 60     ∑fd = -30 + 32 = 2

Here the assumed mean is 52.5, the class mark of the class with d = 0,
and i = 12 is the class size.

Mean = 52.5 + ( 2 / 60 ) 12
= 52.5 + ( 24 / 60 )
= 52.5 + 0.4
= 52.9
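Both methods for the grouped mean can be verified side by side. The sketch below is illustrative only; the variable names are assumptions, and the coded deviations d are derived from the class marks rather than typed in by hand.

```python
classes = [(11, 22), (23, 34), (35, 46), (47, 58), (59, 70), (71, 82), (83, 94)]
freqs = [3, 5, 11, 19, 14, 6, 2]
marks = [(lo + hi) / 2 for lo, hi in classes]
n = sum(freqs)

# A. Long method: Mean = sum(f * x) / n
sum_fx = sum(f * x for f, x in zip(freqs, marks))
long_mean = sum_fx / n

# B. Coded deviation method: Mean = assumed mean + (sum(f * d) / n) * i
assumed_mean = 52.5   # class mark of the 47 - 58 class
width = 12            # class size i
deviations = [round((x - assumed_mean) / width) for x in marks]
sum_fd = sum(f * d for f, d in zip(freqs, deviations))
coded_mean = assumed_mean + sum_fd / n * width

print(sum_fx, sum_fd, long_mean, coded_mean)
```

Any class mark can serve as the assumed mean; choosing one near the center keeps the coded deviations small.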
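COMPUTATION OF THE MODE
FROM GROUPED DATA
 the mode in a frequency distribution is found
within the class with the highest frequency (the modal class)
 the computing formula is given by

Mode = LCBmo + ( Δ1 / ( Δ1 + Δ2 ) ) i

where LCBmo is the lower class boundary of the modal class,
Δ1 is the difference between the frequency of the modal class and
that of the class before it, Δ2 is the difference between the frequency
of the modal class and that of the class after it, and i is the class size.

Classes    f
11 – 22    3
23 – 34    5
35 – 46    11
47 – 58    19    (modal class; Δ1 = 19 – 11 = 8, Δ2 = 19 – 14 = 5)
59 – 70    14
71 – 82    6
83 – 94    2
________
n = 60

Mode = 46.5 + ( 8 / ( 8 + 5 ) ) 12
= 46.5 + ( 8 / 13 ) 12
= 46.5 + ( 96 / 13 )
= 46.5 + 7.384615385
= 53.88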
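The grouped-mode computation can be sketched as a small Python function. The name grouped_mode is hypothetical; the sketch assumes whole-number class limits (so the lower class boundary is the lower limit minus 0.5), equal class sizes, and a single modal class. Treating a missing neighbor class as having frequency 0 is also an assumption.

```python
def grouped_mode(lower_limits, freqs, width):
    """Mode = LCB + (d1 / (d1 + d2)) * i for the class with the highest frequency."""
    m = max(range(len(freqs)), key=lambda idx: freqs[idx])  # index of modal class
    before = freqs[m - 1] if m > 0 else 0
    after = freqs[m + 1] if m < len(freqs) - 1 else 0
    d1 = freqs[m] - before
    d2 = freqs[m] - after
    lcb = lower_limits[m] - 0.5
    return lcb + d1 / (d1 + d2) * width


lower_limits = [11, 23, 35, 47, 59, 71, 83]
freqs = [3, 5, 11, 19, 14, 6, 2]
mode = grouped_mode(lower_limits, freqs, width=12)
print(round(mode, 2))
```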
COMPUTATION OF THE MEDIAN FOR
GROUPED DATA

 to compute the median from grouped data, we have to
first determine the value which divides the distribution into
two equal parts
 we also have to consider the “less than cumulative frequency”
 the computing formula is given by

Median = LCBmd + ( ( n/2 – <cumfb ) / fmd ) i

where LCBmd is the lower class boundary of the median class,
<cumfb is the less than cumulative frequency of the class before
the median class, fmd is the frequency of the median class, and
i is the class size.

Classes    f     <cumf
11 – 22    3     3
23 – 34    5     8
35 – 46    11    19    (<cumfb)
47 – 58    19    38    (median class, fmd = 19)
59 – 70    14    52
71 – 82    6     58
83 – 94    2     60
________
n = 60

Median = 46.5 + ( ( 30 – 19 ) / 19 ) 12
= 46.5 + ( 11 / 19 ) 12
= 46.5 + ( 132 / 19 )
= 46.5 + 6.947368421
= 53.45
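The search for the median class and the interpolation step can be written out as follows. This is a minimal sketch, assuming whole-number class limits and equal class sizes; the function name grouped_median is hypothetical.

```python
def grouped_median(lower_limits, freqs, width):
    """Median = LCB + ((n/2 - cum_before) / f_md) * i."""
    n = sum(freqs)
    cum_before = 0  # less than cumulative frequency before the current class
    for lower, f in zip(lower_limits, freqs):
        if cum_before + f >= n / 2:       # first class whose <cumf reaches n/2
            lcb = lower - 0.5
            return lcb + (n / 2 - cum_before) / f * width
        cum_before += f


lower_limits = [11, 23, 35, 47, 59, 71, 83]
freqs = [3, 5, 11, 19, 14, 6, 2]
median = grouped_median(lower_limits, freqs, width=12)
print(round(median, 2))
```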
QUANTILES
 the quantiles are a natural extension of the
median concept in that they are values
which divide a set of data into equal parts
 for all the quantiles, it must be clearly
understood that the item values are
arranged according to magnitude
 the uses, limitations, and computation of the
quantiles are very similar to those of the
median
QUARTILE
 divides the distribution into four (4) equal parts
 These are the values Q1, Q2, Q3
 Computing formula is given by

Qk = LCBQk + ( ( kn/4 – <cumfb ) / fQk ) i

 For the third quartile:

Q3 = LCBQ3 + ( ( 3n/4 – <cumfb ) / fQ3 ) i

Classes    f     <cumf
11 – 22    3     3
23 – 34    5     8
35 – 46    11    19
47 – 58    19    38    (<cumfb)
59 – 70    14    52    (Q3 class, fQ3 = 14)
71 – 82    6     58
83 – 94    2     60
_______
n = 60

Q3 = 58.5 + ( ( 45 – 38 ) / 14 ) 12
= 58.5 + ( 7 / 14 ) 12
= 58.5 + ( 84 / 14 )
= 58.5 + 6
= 64.5
DECILE
 divides the distribution into ten (10) equal parts
 these are D1, D2, D3, …
 computing formula is given by

Dk = LCBDk + ( ( kn/10 – <cumfb ) / fDk ) i

 For the ninth decile:

D9 = LCBD9 + ( ( 9n/10 – <cumfb ) / fD9 ) i

Classes    f     <cumf
11 – 22    3     3
23 – 34    5     8
35 – 46    11    19
47 – 58    19    38
59 – 70    14    52    (<cumfb)
71 – 82    6     58    (D9 class, fD9 = 6)
83 – 94    2     60
______
n = 60

D9 = 70.5 + ( ( 54 – 52 ) / 6 ) 12
= 70.5 + ( 2 / 6 ) 12
= 70.5 + ( 24 / 6 )
= 70.5 + 4
= 74.5
PERCENTILE
 divides the distribution into one hundred (100)
equal parts
 these are P1, P2, P3, …
 computing formula is given by

Pk = LCBPk + ( ( kn/100 – <cumfb ) / fPk ) i

 For the 89th percentile:

P89 = LCBP89 + ( ( 89n/100 – <cumfb ) / fP89 ) i

Classes    f     <cumf
11 – 22    3     3
23 – 34    5     8
35 – 46    11    19
47 – 58    19    38
59 – 70    14    52    (<cumfb)
71 – 82    6     58    (P89 class, fP89 = 6)
83 – 94    2     60
_______
n = 60

P89 = 70.5 + ( ( 53.40 – 52 ) / 6 ) 12
= 70.5 + ( 1.4 / 6 ) 12
= 70.5 + ( 16.8 / 6 )
= 70.5 + 2.8
= 73.3
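Because the quartile, decile, and percentile formulas differ only in the divisor (4, 10, or 100), one function can cover all three. The sketch below is an illustration under the same assumptions as the worked examples (whole-number class limits, equal class sizes); the name grouped_quantile and its parameters are hypothetical.

```python
def grouped_quantile(k, parts, lower_limits, freqs, width):
    """k-th quantile when the data are divided into `parts` equal parts.

    parts = 4 gives quartiles, 10 gives deciles, 100 gives percentiles.
    """
    n = sum(freqs)
    target = k * n / parts          # kn/4, kn/10 or kn/100
    cum_before = 0                  # <cumf of the class before the current one
    for lower, f in zip(lower_limits, freqs):
        if cum_before + f >= target:
            lcb = lower - 0.5       # lower class boundary
            return lcb + (target - cum_before) / f * width
        cum_before += f
    raise ValueError("k must not exceed parts")


lower_limits = [11, 23, 35, 47, 59, 71, 83]
freqs = [3, 5, 11, 19, 14, 6, 2]

q3 = grouped_quantile(3, 4, lower_limits, freqs, 12)
d9 = grouped_quantile(9, 10, lower_limits, freqs, 12)
p89 = grouped_quantile(89, 100, lower_limits, freqs, 12)
print(round(q3, 2), round(d9, 2), round(p89, 2))
```

Calling the same function with k = 1 and parts = 2 gives the grouped median, which is why the quantiles are described as an extension of the median concept.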
Example
Classes f

25 – 30 3
31 – 36 6
37 – 42 11
43 – 48 27
49 – 54 16
55 – 60 7
61 – 66 4
67 – 72 1
_____________
N = 75
Construct the following:

a. classmark or midpoint
b. class boundaries
c. relative frequency distribution
d. less than cumulative frequency distribution
e. greater than cumulative frequency
distribution
Determine:
a. Class size
b. Class boundary of the 3rd class interval
c. Lowest lower limit
d. Highest upper boundary
e. No. of people belonging to the upper boundary 54.5
f. No. of people belonging to the lower boundary 54.5
g. Classmark of the 6th class interval
h. No. of people belonging to the lower boundary 42.5
i. No. of people belonging to the upper boundary 36.5
j. Highest upper class limit
Compute the following:
a. Mean
b. Mode
c. Median
d. 7th decile
e. 3rd quartile
f. 1st quartile
g. 45th percentile
h. 9th percentile
i. 35th percentile
j. 95th percentile
k. 55th percentile
l. 65th percentile
m. 8th decile
n. 3rd decile
o. 85th percentile
MEASURES OF VARIABILITY
 these are measures which describe the
extent of scattering of the individual items
about the average or point of central
location
 the measures of central tendency are of little
value unless the degree of spread or
variability of the items about them is
given
 the description of a set of data becomes
more meaningful if the degree of clustering
about the central point is measured
Consider the five sets of
observations:
A: 15, 15, 17, 18, 20
B: 15, 16, 16, 18, 20
C: 14, 15, 16, 19, 21
D: 11, 13, 18, 18, 25
E: 14, 15, 18, 19, 19
COMMON MEASURES OF
VARIABILITY
 Range
 Semi-Interquartile Range
 Mean Absolute Deviation
 Standard Deviation
1. RANGE

 Is the simplest of the measures of spread or
variability.
 It is the difference between the highest and
the lowest items in the distribution.
 In a frequency distribution table, the range is
the difference between the upper limit of the
highest class interval and the lower limit of
the lowest class interval.
Example
Classes    f

11 – 22    3
23 – 34    5
35 – 46    11
47 – 58    19
59 – 70    14
71 – 82    6
83 – 94    2
____________
n = 60

Range = 94 – 11 = 83
SEMI-INTERQUARTILE RANGE
 Sometimes called quartile deviation
 It is half the distance between the first quartile
and the third quartile, which approximates the
spread between the first quartile and the median, or the
median and the third quartile.
 The dispersion is measured in the middle half of the
items arranged in an array.
 The formula used to compute the quartile
deviation is
QD = ( Q3 – Q1 ) / 2

 Although the semi-interquartile range is an
improvement on the range in the
approximation of the spread of the values of
the items, it still does not reflect the variability
of the entire set of values of the items.
Example
Classes    f     <cumf
11 – 22    3     3
23 – 34    5     8     (<cumfb for Q1)
35 – 46    11    19    (Q1 class, fQ1 = 11)
47 – 58    19    38    (<cumfb for Q3)
59 – 70    14    52    (Q3 class, fQ3 = 14)
71 – 82    6     58
83 – 94    2     60
________
n = 60
Sol’n:
Q1 = LCBQ1 + ( ( n/4 – <cumfb ) / fQ1 ) i
= 34.5 + ( ( 15 – 8 ) / 11 ) 12
= 34.5 + ( 7 / 11 ) 12
= 34.5 + ( 84 / 11 )
= 34.5 + 7.63636364
= 42.14
Q3 = LCBQ3 + ( ( 3n/4 – <cumfb ) / fQ3 ) i
= 58.5 + ( ( 45 – 38 ) / 14 ) 12
= 58.5 + ( 7 / 14 ) 12
= 58.5 + 6
= 64.5

QD = ( Q3 – Q1 ) / 2
= ( 64.5 – 42.14 ) / 2
= 22.36 / 2
= 11.18
MEAN ABSOLUTE DEVIATION
 In computing the mean absolute deviation, we
consider the extent to which each individual score in a
distribution deviates from the mean of that
distribution.
 We subtract the mean from each score to determine
the deviation or the distance of each score from the
mean.
x = X – mean
where x = each score’s deviation from the mean
X = a particular score
 The formula used to compute the MAD is
MAD = ∑| X – mean | / n
Example 1: Consider the scores of five students
on a certain 20-item test: 15, 15, 17, 18, 20

Mean = ∑X / n = 85 / 5 = 17

X      X – mean        | X – mean |
15     15 – 17 = -2    2
15     15 – 17 = -2    2
17     17 – 17 = 0     0
18     18 – 17 = 1     1
20     20 – 17 = 3     3
____   ____________
∑X = 85                ∑| X – mean | = 8

MAD = ∑| X – mean | / n
= 8 / 5
= 1.6
Example 2
Classes    f     X      X – mean       | X – mean |   f| X – mean |
11 – 22    3     16.5   16.5 – 52.9    36.4           109.2
23 – 34    5     28.5   28.5 – 52.9    24.4           122.0
35 – 46    11    40.5   40.5 – 52.9    12.4           136.4
47 – 58    19    52.5   52.5 – 52.9    0.4            7.6
59 – 70    14    64.5   64.5 – 52.9    11.6           162.4
71 – 82    6     76.5   76.5 – 52.9    23.6           141.6
83 – 94    2     88.5   88.5 – 52.9    35.6           71.2
____________
∑f| X – mean | = 750.4

MAD = ∑f| X – mean | / n
= 750.4 / 60
= 12.51
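Both MAD examples can be checked with one short function. This is an illustrative sketch: the name mean_absolute_deviation is hypothetical, and for grouped data the class marks stand in for the individual scores, as in Example 2.

```python
def mean_absolute_deviation(values, freqs=None):
    """MAD = sum(f * |x - mean|) / n; with no freqs, every value counts once."""
    if freqs is None:
        freqs = [1] * len(values)
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, values)) / n
    return sum(f * abs(x - mean) for f, x in zip(freqs, values)) / n


# Example 1: ungrouped scores
mad_scores = mean_absolute_deviation([15, 15, 17, 18, 20])

# Example 2: grouped data, using class marks and frequencies
marks = [16.5, 28.5, 40.5, 52.5, 64.5, 76.5, 88.5]
freqs = [3, 5, 11, 19, 14, 6, 2]
mad_grouped = mean_absolute_deviation(marks, freqs)

print(round(mad_scores, 2), round(mad_grouped, 2))
```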
STANDARD DEVIATION
 Is a special form of average deviation from the
mean.
 All the individual values of the items in the
distribution are taken into consideration.
 Denoted by S or SD, it is the positive square root of
the arithmetic mean of the squared deviations
from the mean of the distribution.
 It is important as a measure of heterogeneity or
unevenness within a set of observations.
 If the S of the IQ of, say, 50 students is numerically
big, then we can say that there is heterogeneity in
their intelligence. If the S is small, we can say that
there is homogeneity in their intelligence.
STANDARD DEVIATION FROM
UNGROUPED DATA
Example: Consider the scores of five (5) students on a 20-item test: 15, 15, 17, 18, 20.

X      X – mean    ( X – mean )    ( X – mean )²
15     15 – 17     -2              4
15     15 – 17     -2              4
17     17 – 17     0               0
18     18 – 17     1               1
20     20 – 17     3               9
_____________
∑( X – mean )² = 18

S = √( ∑( X – mean )² / n )
= √( 18 / 5 )
= √3.6
= 1.8973666
= 1.9
Note
 There are occasions when the formula
and procedure stated
above are inconvenient to use for
computation.
 This is especially true when the deviations
from the mean are not whole
numbers, which is the case most of the
time.
 In such cases, the formula used is
 S = √( ∑X² / n – ( ∑X / n )² )
Use the same problem as in the first
computation.
X      X²
15     225
15     225
17     289
18     324
20     400
___    _______
∑X = 85    ∑X² = 1,463
S = √( 1,463 / 5 – ( 85 / 5 )² )
= √( 292.6 – ( 17 )² )
= √( 292.6 – 289 )
= √3.6
= 1.9
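The two ungrouped formulas can be compared directly in Python. This sketch follows the worked example in dividing by n; note that the grouped-data formulas later in these notes divide by n – 1 instead. Variable names are illustrative.

```python
from math import sqrt

scores = [15, 15, 17, 18, 20]
n = len(scores)
mean = sum(scores) / n

# Definition: square root of the mean of the squared deviations.
sum_sq_dev = sum((x - mean) ** 2 for x in scores)
sd_definition = sqrt(sum_sq_dev / n)

# Computational form: sqrt(sum(X^2)/n - (sum(X)/n)^2), no deviations needed.
sum_x = sum(scores)
sum_x2 = sum(x * x for x in scores)
sd_shortcut = sqrt(sum_x2 / n - (sum_x / n) ** 2)

print(sum_sq_dev, sum_x2, round(sd_definition, 1), round(sd_shortcut, 1))
```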
COMPUTATION OF THE
STANDARD DEVIATION FROM
GROUPED DATA
 There are three ways of computing the
standard deviation from grouped data.
 The computation is essentially the same
as that for ungrouped data except that
X is NOT the value of the item, but
rather the class mark of each of the class
intervals.
 Another point of difference is the use of
the frequency “f” as a factor in the formula.
LONG METHOD A
Classes    f     X      X – mean       ( X – mean )²   f( X – mean )²
11 – 22    3     16.5   16.5 – 52.9    1324.96         3974.88
23 – 34    5     28.5   28.5 – 52.9    595.36          2976.8
35 – 46    11    40.5   40.5 – 52.9    153.76          1691.36
47 – 58    19    52.5   52.5 – 52.9    0.16            3.04
59 – 70    14    64.5   64.5 – 52.9    134.56          1883.84
71 – 82    6     76.5   76.5 – 52.9    556.96          3341.76
83 – 94    2     88.5   88.5 – 52.9    1267.36         2534.72
______________
∑f( X – mean )² = 16,406.4

S = √( ∑f( X – mean )² / ( n – 1 ) )
= √( 16,406.4 / ( 60 – 1 ) )
= √278.0745763
= 16.68
LONG METHOD B
Classes    f     X      fx      X²        fx²
11 – 22    3     16.5   49.5    272.25    816.75
23 – 34    5     28.5   142.5   812.25    4061.25
35 – 46    11    40.5   445.5   1640.25   18042.75
47 – 58    19    52.5   997.5   2756.25   52368.75
59 – 70    14    64.5   903     4160.25   58243.5
71 – 82    6     76.5   459     5852.25   35113.5
83 – 94    2     88.5   177     7832.25   15664.5
_________ ______________
∑fx = 3,174    ∑fx² = 184,311

S = √( ∑fx² / ( n – 1 ) – ( ∑fx )² / ( n ( n – 1 ) ) )
= √( 184,311 / ( 60 – 1 ) – ( 3,174 )² / ( 60 ( 60 – 1 ) ) )
= √( 184,311 / 59 – 10,074,276 / 3,540 )
= √( 3123.9152542 – 2845.840678 )
= √278.0745762
= 16.68
CODED DEVIATION METHOD
Classes    f     d     fd     d²    fd²
11 – 22    3     -3    -9     9     27
23 – 34    5     -2    -10    4     20
35 – 46    11    -1    -11    1     11
47 – 58    19    0     0      0     0
59 – 70    14    1     14     1     14
71 – 82    6     2     12     4     24
83 – 94    2     3     6      9     18
_____ _______
∑fd = -30 + 32 = 2    ∑fd² = 114

S = i √( ∑fd² / ( n – 1 ) – ( ∑fd )² / ( n ( n – 1 ) ) )
= 12 √( 114 / ( 60 – 1 ) – ( 2 )² / ( 60 ( 60 – 1 ) ) )
= 12 √( 114 / 59 – 4 / 3,540 )
= 12 √( 1.93220339 – 0.0011299435 )
= 12 √1.93107345
= 12 ( 1.38963069 )
= 16.68
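Since the three grouped-data methods are algebraically equivalent, they should give the same result. The sketch below computes all three for the example table, dividing by n – 1 as in the worked solutions; the variable names are illustrative assumptions.

```python
from math import sqrt

classes = [(11, 22), (23, 34), (35, 46), (47, 58), (59, 70), (71, 82), (83, 94)]
freqs = [3, 5, 11, 19, 14, 6, 2]
marks = [(lo + hi) / 2 for lo, hi in classes]
n = sum(freqs)
width = 12

# Long method A: deviations of the class marks from the grouped mean.
mean = sum(f * x for f, x in zip(freqs, marks)) / n
sd_a = sqrt(sum(f * (x - mean) ** 2 for f, x in zip(freqs, marks)) / (n - 1))

# Long method B: sums of fx and fx^2 only.
sum_fx = sum(f * x for f, x in zip(freqs, marks))
sum_fx2 = sum(f * x * x for f, x in zip(freqs, marks))
sd_b = sqrt(sum_fx2 / (n - 1) - sum_fx ** 2 / (n * (n - 1)))

# Coded deviation method: deviations in units of the class size.
assumed_mean = 52.5
d = [round((x - assumed_mean) / width) for x in marks]
sum_fd = sum(f * di for f, di in zip(freqs, d))
sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))
sd_c = width * sqrt(sum_fd2 / (n - 1) - sum_fd ** 2 / (n * (n - 1)))

print(round(sd_a, 2), round(sd_b, 2), round(sd_c, 2))
```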
Illustration: The following are the
scores obtained by 25
students who took a 40-item
test:
25, 24, 30, 27, 28, 15, 17, 18, 25,
35, 22, 31, 30, 23, 32, 16, 26, 33,
27, 20, 21, 28, 23, 34, 37
Compute for:
1. Mean
2. Proficiency Level
3. Standard Deviation
4. Make an interpretation about
the computed statistical
measures.
Solution
A. Frequency Distribution
Classes    f
15 – 18    4
19 – 22    3
23 – 26    6
27 – 30    6
31 – 34    4
35 – 38    2
n = 25
B. Mean
Classes f x fx
15 – 18 4 16.5 66
19 – 22 3 20.5 61.5
23 – 26 6 24.5 147.0
27 – 30 6 28.5 171.0
31 – 34 4 32.5 130.0
35 – 38 2 36.5 73.0
____ ______
n = 25 ∑fx = 648.5
Mean = ∑ fx/ n
= 648.5/25
= 25.94
C. Proficiency Level
PL = ( Mean / No. of Items ) × 100%
= ( 25.94 / 40 ) × 100%
= 64.85%
D. Standard Deviation
S = i √( ∑fd² / ( n – 1 ) – ( ∑fd )² / ( n ( n – 1 ) ) )

Classes    f     d     fd    d²    fd²
15 – 18    4     -2    -8    4     16
19 – 22    3     -1    -3    1     3
23 – 26    6     0     0     0     0
27 – 30    6     1     6     1     6
31 – 34    4     2     8     4     16
35 – 38    2     3     6     9     18
___ ___ ____
n = 25    ∑fd = 9    ∑fd² = 59

SD = 4 √( 59 / ( 25 – 1 ) – ( 9 )² / ( 25 ( 25 – 1 ) ) )
= 4 √( 59 / 24 – 81 / 600 )
= 4 √( 2.45833333 – 0.135 )
= 4 √2.3233333
= 4 ( 1.52424844 )
= 6.09699376 or 6.10 (heterogeneous)
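The whole illustration can be re-run from the raw scores as a check. The sketch below tallies the scores into the six classes used in the solution, then computes the grouped mean, the proficiency level, and the standard deviation with n – 1; the variable names are illustrative, and the class limits are taken from the solution rather than derived.

```python
from math import sqrt

scores = [25, 24, 30, 27, 28, 15, 17, 18, 25,
          35, 22, 31, 30, 23, 32, 16, 26, 33,
          27, 20, 21, 28, 23, 34, 37]
items = 40  # number of test items

# A. Frequency distribution over the classes used in the solution.
classes = [(15, 18), (19, 22), (23, 26), (27, 30), (31, 34), (35, 38)]
freqs = [sum(lo <= s <= hi for s in scores) for lo, hi in classes]
marks = [(lo + hi) / 2 for lo, hi in classes]
n = sum(freqs)

# B. Grouped mean
mean = sum(f * x for f, x in zip(freqs, marks)) / n

# C. Proficiency level
proficiency = mean / items * 100

# D. Standard deviation (n - 1 in the denominator, as in the solution)
sd = sqrt(sum(f * (x - mean) ** 2 for f, x in zip(freqs, marks)) / (n - 1))

print(freqs, round(mean, 2), round(proficiency, 2), round(sd, 2))
```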