
CHAPTER 9:
VALIDITY AND RELIABILITY
OF CLASSROOM TEST

LEARNING EXERCISES:
1. In both reliability and validity analysis, the correlation coefficient was used.
Do further research on the correlation coefficient and answer the
following:
a. Define correlation coefficient when used as validity coefficient
b. Define correlation coefficient when used as reliability coefficient
c. Enumerate the factors that affect correlation coefficients in both
reliability and validity analysis.

A correlation coefficient expresses the relationship between two variables. When
used as a validity coefficient, it is the correlation between the scores on a test and
the scores on another test or measure. That relationship is analyzed and interpreted
to judge the validity of the test, that is, whether the test measures what it is
supposed to measure. A validity coefficient can be computed between two tests or
two groups.
When used as a reliability coefficient, it describes the consistency of the test
measure. It is the relationship between sets of test results, which is likewise
analyzed and interpreted.
Some factors that can affect the validity and reliability coefficients are the kind of
test takers (slow, average, and fast learners), the difficulty of the test, the test
content, and the test results.

2. Suppose you are already a teacher and a fellow teacher handling the
same subject told you that he has already prepared a set of tests.
Discuss the effect on the validity of the test in your class if you are to
adopt these tests. Identify the particular type of validity which would
be greatly affected and discuss the reason why.
Adopting the tests can affect their validity in the following ways:

Content validity: the tests may contain topics that you have not yet discussed
with your students.
Construct validity: the tests may contain problems that interfere with the
students' understanding.

3. Suppose you have administered a 50-item test in the previous school
year. Now, your students took a 100-item standardized test with similar
competencies being assessed. Based on the available data below,
identify the type of validity that can be established. Compute for the
validity coefficient and provide interpretation.

Scores in the Teacher-Made Test    Scores on the Standardized Test
44                                 87
35                                 85
35                                 84
40                                 85
38                                 83
37                                 84
39                                 88
37                                 87
32                                 80
41                                 89

1st Time the test      2nd Time the test
was administered (X)   was administered (Y)   XY       X^2      Y^2
45                     83                     39984    2025     6889
44                     72                     2736     1444     5184
41                     63                     2142     1156     3969
40                     65                     2600     1600     4225
39                     70                     3080     1936     4900
38                     68                     2040     900      4624
38                     82                     3198     1521     6724
34                     78                     2964     1444     6084
32                     75                     3075     1681     5625
30                     84                     2688     1024     7056
Total: 381             740                    28517    14731    55280
FORMULA:

\[ r = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{[N\sum X^2 - (\sum X)^2][N\sum Y^2 - (\sum Y)^2]}} \]

\[ r = \frac{10(28517) - (381)(740)}{\sqrt{[10(14731) - (381)^2][10(55280) - (740)^2]}} \]
\[ = \frac{285170 - 281940}{\sqrt{[147310 - 145161][552800 - 547600]}} \]
\[ = \frac{3230}{\sqrt{[2149][5200]}} = \frac{3230}{\sqrt{11174800}} = \frac{3230}{3342.873016} \]

r = 0.97. The relationship is positive but less than perfect, because +1.00 is the
perfect positive relationship. It means that the scores on the 50-item teacher-made
test agree closely with the scores on the 100-item standardized test, which supports
the validity of the teacher-made test.
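
The substitution above can be checked with a few lines of Python. This is only a sketch: the function name `pearson_r_from_sums` is an illustrative choice, and the inputs are the column totals taken from the table (N = 10, sum of X = 381, sum of Y = 740, sum of XY = 28517, sum of X^2 = 14731, sum of Y^2 = 55280).

```python
import math


def pearson_r_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Pearson r computed from the raw-score sums, as in the formula above."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Column totals copied from the computation table.
r = pearson_r_from_sums(10, 381, 740, 28517, 14731, 55280)
print(round(r, 2))
```

Working from the sums keeps the code parallel to the hand computation, so each intermediate value (numerator, the two bracketed terms) can be compared line by line.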


4. Compare and contrast the three ways of establishing reliability using a
Venn diagram.

Test-Retest Method
= In this method the same test is administered twice to the same group
of students.

Equivalent-Form Method
= In this method a different but equivalent test is administered to the same
group of students, and the scores on the two tests are correlated to find the
coefficient of relationship between the test scores.

Internal-Consistency Method
= In this method the test is administered only once. The scores on one half of
the test items are checked for consistency with the scores on the other half.

Common to all three methods (overlap)
= These methods are used to determine the consistency of test scores. Each can
be administered to the same group of students.

5. Discuss how composing a long test helps improve the reliability of a
test.

Composing a long test can improve the reliability of the test because more items
cover more of the content, so the scores measure the students' knowledge of all
the topics given more consistently.

6. Using the scores obtained from administering the periodical/quarterly
test you conducted, determine the internal consistency of the test using
the Split-half method. Discuss your findings.
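
As a rough sketch of how the Split-half method can be carried out, the Python below splits each student's item scores into odd-numbered and even-numbered items, correlates the two half-test totals, and then steps the result up to full-test length. The step-up uses the Spearman-Brown formula, which is an assumption here because the exercise does not name it. The `responses` matrix is HYPOTHETICAL data made up for illustration, not actual test results, and all function names are illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


def spearman_brown(r_half):
    """Estimated reliability of the full-length test (assumed step-up formula)."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """Correlate odd-item totals with even-item totals, then step up."""
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(odd_totals, even_totals)
    return r_half, spearman_brown(r_half)


# HYPOTHETICAL item responses: one row per student, 1 = correct, 0 = wrong.
responses = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
]

r_half, r_full = split_half_reliability(responses)
print(round(r_half, 2), round(r_full, 2))
```

Because the stepped-up value is larger than the half-test correlation whenever that correlation is positive, the same sketch also illustrates why a longer test tends to be more reliable than a shorter one.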


REFLECTION

Topic:
Validity and Reliability of Classroom Test

Establishing Validity
Establishing Reliability

This chapter discussed the things we have to remember in making and
administering a test. First is validity, which refers to the test content. Validity
may be affected by certain factors (appropriateness of test items, unclear
directions, level and vocabulary difficulty, etc.) that may also affect test
reliability and confuse the students.
Second is reliability, which is the consistency of the scores obtained by the
students. The test is administered once or twice to know if the students really
learned from the discussion and if the teacher can really rely on the students'
learning. Reliability can be established by various methods such as the
test-retest, equivalent-forms, and internal-consistency methods.

CHAPTER 10:
SUMMARIZING AND
INTERPRETING ASSESSMENT
DATA: ORGANIZATION AND
PRESENTATION


LEARNING EXERCISES:

1. Enumerate and discuss the important characteristics of assessment data
that can be better understood through the use of a histogram.

A histogram is easy to use because the data can be seen clearly and in order.
You can determine the scores and their frequencies (the number of students) at
a glance, without reading through the raw data.
You can also compare scores visually because it uses bars plotted against the
scores of the students.
It can give a quick indication of whether the scores look valid and reliable.
A histogram can be used to summarize the collected scores, to interpret the
difficulty of the test, and to see which scores are the highest and the lowest.

2. Given that both the histogram and frequency polygon are essentially graphic
presentations of the same assessment data found in a tabular form, identify
and discuss the major advantage of graphic presentations over tabular
presentations.

A graphic presentation is a diagram that is easier to understand and analyze
because it shows only the summary of the collected data. It readily shows the
frequencies and the comparison among them.

3. Research on how computer software can be used to aid in the organization
and presentation of assessment data. Make a presentation on the procedures
and steps in using computer software (e.g. MS Excel) to organize and
graphically present assessment data.

MS Excel is one example of computer software that teachers can use for
assessment data. It compiles the data (scores) in a spreadsheet and can
also compute results from them in just one click.

4. Using the scores from administering the periodical/quarterly test you
constructed:
a. Prepare a tally sheet.
b. Construct a frequency distribution table which includes the following:
1) Class interval
2) Class boundaries
3) Class mark
4) Frequency for each class interval
5) Relative frequency and percentage
6) Both > and < cumulative frequency
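
The six columns listed above can be produced with a short Python sketch. The 30 scores are the pilot-test scores listed in the Chapter 11 exercises. The lowest class limit (13) and the class width (4) are assumptions chosen only for illustration, and the function and key names are hypothetical.

```python
scores = [36, 33, 33, 32, 32, 32, 31, 30, 30, 30,
          29, 29, 28, 27, 27, 26, 26, 26, 26, 25,
          25, 24, 23, 22, 22, 21, 19, 18, 17, 14]


def frequency_distribution(scores, lowest_limit, width):
    """Build one row per class interval for integer scores."""
    n = len(scores)
    rows = []
    lower = lowest_limit
    cum_less = 0
    while lower <= max(scores):
        upper = lower + width - 1
        freq = sum(lower <= s <= upper for s in scores)
        cum_less += freq
        rows.append({
            "interval": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),
            "class_mark": (lower + upper) / 2,
            "frequency": freq,
            "relative_pct": round(100 * freq / n, 2),
            "cum_less": cum_less,                # scores below the upper boundary
            "cum_greater": n - cum_less + freq,  # scores above the lower boundary
        })
        lower += width
    return rows


# Assumed class layout: lowest limit 13, class width 4.
rows = frequency_distribution(scores, lowest_limit=13, width=4)
for row in rows:
    print(row)
```

A different class width changes the number of rows but not the method, so the same function can be rerun with whatever interval size the class agrees on.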


REFLECTION

Topic:
SUMMARIZING AND INTERPRETING ASSESSMENT
DATA: ORGANIZATION AND PRESENTATION

Organizing Assessment Data
Graphical Presentation of Data
Tabular Presentation of Assessment Data: Constructing a Frequency
Distribution

In this chapter we discussed how to tabulate data and present them
graphically. With a graphical presentation we can easily analyze and
interpret the data, such as how many students got the lowest and highest
scores in the administered test and which group of students got lower or
higher scores. One example is the histogram, a graphical presentation
using bars that shows the scores and their frequency (the number of
students).

CHAPTER 11:
SUMMARIZING AND
INTERPRETING
ASSESSMENT DATA:
MEASURES OF CENTRAL
TENDENCY AND
VARIABILITY


LEARNING EXERCISES:
1. In what sense are the mean, median, and the mode measures of
center? Explain your answer comprehensively.
The mean, median, and mode are measures of center because each gives
a single value around which the scores tend to cluster. They are the first
values we need to determine, and they are also used throughout the
process of computing the measures of variability. They can likewise be
obtained from a frequency distribution. It means that these three (mean,
median, and mode) describe the center of the assessment data.

2. An Introduction to Computer class consists of 30 students, all of
whom are achievers in the Final Examination. The class also
includes two students whose scores are extremely low. Which
does a better job of describing the scores of a typical student in
the class of 32 students (including the two students): the mean or
the median? Justify your answer.
The better way to describe the scores of the 32 students is the median,
because the median is the middle of all the scores and is not pulled down
by the two extremely low scores the way the mean is. The median also
separates the higher half of the scores from the lower half, which makes it
easy to see how many students passed the test. For example, if the highest
score is 38 out of 40, the lowest score is 25, and the median is 35, then at
least half of the students scored 35 or higher, so many students passed
the examination.

3. Discuss the importance of the measures of variability.

Measuring variability is important because it tells us how dispersed the
data are, which gives a clearer picture of the data (together with
skewness). With the measures of variability we can describe how much
the test scores differ from one another.

4. Choose which would have more variation in a periodical test: the
scores of students in an elective class or the scores of students
in a regular class. Explain your answer comprehensively.

There is more variation in an elective class. Since it is an elective class,
the subjects are limited, so there is more time to discuss each topic, more
learning takes place, and more test items can be given. That is why it has
more variance than the other.
5. Using the scores obtained from pilot testing the
periodical/quarterly test you constructed, determine and
interpret the following measures:
36 33 33 32 32 32 31 30 30 30
29 29 28 27 27 26 26 26 26 25
25 24 23 22 22 21 19 18 17 14
a. Mean
$\bar{X} = \sum X / N$
$\bar{X} = 793/30$
$\bar{X} = 26.43$
b. Median
Position of Mdn $= (N+1)/2 = (30+1)/2 = 15.5$
Mdn $= (27+26)/2 = 26.5$ (the mean of the 15th and 16th scores)
c. Mode
Mo = 26
d. Range
Range = Highest Score - Lowest Score
Range = 36 - 14
Range = 22
e. Mean Deviation
$MD = \sum |X - \bar{X}| / N$
$MD = 125.00/30$
$MD = 4.17$
f. Variance
$S^2 = \sum (X - \bar{X})^2 / N$
$S^2 = 807.22/30$
$S^2 = 26.91$
g. Standard Deviation
$S = \sqrt{\sum (X - \bar{X})^2 / N}$
$S = \sqrt{26.91}$
$S = 5.19$

Score X    Mean      |X - Mean|    (X - Mean)^2
36         26.43     9.57          91.58
33         26.43     6.57          43.16
33         26.43     6.57          43.16
32         26.43     5.57          31.02
32         26.43     5.57          31.02
32         26.43     5.57          31.02
31         26.43     4.57          20.88
30         26.43     3.57          12.74
30         26.43     3.57          12.74
30         26.43     3.57          12.74
29         26.43     2.57          6.60
29         26.43     2.57          6.60
28         26.43     1.57          2.46
27         26.43     0.57          0.32
27         26.43     0.57          0.32
26         26.43     0.43          0.18
26         26.43     0.43          0.18
26         26.43     0.43          0.18
26         26.43     0.43          0.18
25         26.43     1.43          2.04
25         26.43     1.43          2.04
24         26.43     2.43          5.90
23         26.43     3.43          11.76
22         26.43     4.43          19.62
22         26.43     4.43          19.62
21         26.43     5.43          29.48
19         26.43     7.43          55.20
18         26.43     8.43          71.06
17         26.43     9.43          88.92
14         26.43     12.43         154.50
Total                125.00        807.22
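
Recomputing the measures directly from the 30 raw scores is a useful check on a hand-worked table. The sketch below uses Python's standard `statistics` module and assumes, as the formulas in this exercise do, that the variance and standard deviation divide by N (the population forms `pvariance` and `pstdev`). The variable names are illustrative.

```python
import statistics

scores = [36, 33, 33, 32, 32, 32, 31, 30, 30, 30,
          29, 29, 28, 27, 27, 26, 26, 26, 26, 25,
          25, 24, 23, 22, 22, 21, 19, 18, 17, 14]

n = len(scores)
mean = sum(scores) / n
median = statistics.median(scores)       # mean of the two middle scores when N is even
mode = statistics.mode(scores)           # most frequent score
score_range = max(scores) - min(scores)
mean_deviation = sum(abs(x - mean) for x in scores) / n
variance = statistics.pvariance(scores)  # divides by N, not N - 1
std_dev = statistics.pstdev(scores)      # square root of the variance above

print(round(mean, 2), median, mode, score_range)
print(round(mean_deviation, 2), round(variance, 2), round(std_dev, 2))
```

Small differences in the last decimal place between this output and a hand computation usually come from rounding the mean to 26.43 before subtracting.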


REFLECTION

Topic:
SUMMARIZING AND INTERPRETING
ASSESSMENT DATA: MEASURES OF CENTRAL
TENDENCY AND VARIABILITY

Measures of Central Tendency


Obtaining Measures of Central Tendency from a Grouped Frequency
Distributions
In this chapter we discussed central tendency and the measures of
variability. The measures of central tendency are the mode, which is the
score obtained by the most students; the median, which is the middle of the
scores, where we can tell if more students passed or failed the exam; and
lastly the mean, which is the sum of all scores divided by the number of
students. The mean is usually used in getting the measures of variability
such as the mean deviation, variance, standard deviation, and coefficient of
variation, which are needed to complete the assessment data.

CHAPTER 12:
SUMMARIZING AND
INTERPRETING
ASSESSMENT DATA:
MEASURES OF SHAPE AND
LOCATION

LEARNING EXERCISES:


1. Below are graphical presentations of the test scores of five groups
of students:

a. Based on visual inspection/observation, describe each
distribution in terms of skewness and kurtosis.
b. Based on the shape of the distributions, discuss the test
performance of the students in each group.
Group 1- Based on the graph, the Group 1 curve is a normal platykurtic curve.
The skewness shows a symmetrical distribution, which means the test scores are
positive (valid and reliable). Because the distribution is symmetrical, about as
many students got high scores as low scores.
Group 2- Group 2 also has a platykurtic curve. The curve shows a positive
distribution because more students got a passing score. It is skewed to the left,
which means the mode is higher. The curve means many students passed the
exam.
Group 3- Group 3 has a platykurtic curve. The curve shows a negative
distribution of scores. It is skewed to the right, which means fewer students got
a passing score. It means that most students failed the exam.
Group 4- Group 4 also has failing grades. Their data are not enough to make
the test reliable because the mode, median, and mean are the same. It means
that the scores are too low to form a curve.
Group 5- Group 5 has a leptokurtic, highly skewed and symmetrical curve, which
means many students got perfect or almost perfect scores.

c. If the graphical presentations were obtained from five groups
who took the same test, which do you think performed
relatively better? Explain your answer comprehensively.
Based on the graphical presentations above, the groups that performed
better are Groups 1, 2, and 5, because the skewness of those groups shows
that they all got passing scores. The scores are positively distributed in the
data.
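2. Using the scores obtained from administering the
periodical/quarterly test you constructed, determine the following:
a. Skewness
$sk_2 = 3(\text{mean} - \text{median})/s$
$sk_2 = 3(26.43 - 26.5)/5.19$, where $s = \sqrt{807.22/30} = 5.19$
$sk_2 = -0.21/5.19$
$sk_2 = -0.04$
b. Kurtosis
$Kurt_1 = m_4/m_2^2$, where $m_4 = \sum (X - \bar{X})^4 / N$ and $m_2 = \sum (X - \bar{X})^2 / N$
$m_4 = 57723.52/30 = 1924.12$
$m_2 = 807.22/30 = 26.91$
$Kurt_1 = 1924.12/(26.91)^2 = 1924.12/724.15$
$Kurt_1 = 2.66$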
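c. 5th, 15th, 50th, and 88th percentiles
36 33 33 32 32 32 31 30 30 30
29 29 28 27 27 26 26 26 26 25
25 24 23 22 22 21 19 18 17 14
P5
Position = nk/100
Position = 30(5)/100 = 1.5, rounded up to the 2nd score from the lowest
P5 = 17
P15
Position = nk/100
Position = 30(15)/100
Position = 4.5, rounded up to the 5th score from the lowest
P15 = 21
P50
Position = nk/100
Position = 30(50)/100
Position = 15, so take the mean of the 15th and 16th scores
P50 = (26+27)/2
P50 = 26.5
P88
Position = nk/100
Position = 30(88)/100
Position = 26.4, rounded up to the 27th score from the lowest
P88 = 32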
d. Percentile Ranks of each Student's Score
Score    Rank    Percentile Rank
36       30      100
33       29      96.67
33       28      93.33
32       27      90
32       26      86.67
32       25      83.33
31       24      80
30       23      76.67
30       22      73.33
30       21      70
29       20      66.67
29       19      63.33
28       18      60
27       17      56.67
27       16      53.33
26       15      50
26       14      46.67
26       13      43.33
26       12      40
25       11      36.67
25       10      33.33
24       9       30
23       8       26.67
22       7       23.33
22       6       20
21       5       16.67
19       4       13.33
18       3       10
17       2       6.67
14       1       3.33

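e. 1st, 5th, and 8th deciles
D1
Position = nk/10
Position = 30(1)/10
Position = 3, so take the mean of the 3rd and 4th scores
D1 = (18+19)/2
D1 = 18.5
D5
Position = nk/10
Position = 30(5)/10
Position = 15, so take the mean of the 15th and 16th scores
D5 = (26+27)/2
D5 = 26.5
D8
Position = nk/10
Position = 30(8)/10
Position = 24, so take the mean of the 24th and 25th scores
D8 = (31+32)/2
D8 = 31.5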
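f. 1st, 2nd, and 3rd Quartiles
Q1
Position = nk/4
Position = 30(1)/4
Position = 7.5, rounded up to the 8th score from the lowest
Q1 = 23
Q2
Position = nk/4
Position = 30(2)/4
Position = 15, so take the mean of the 15th and 16th scores
Q2 = (26+27)/2
Q2 = 26.5
Q3
Position = nk/4
Position = 30(3)/4
Position = 22.5, rounded up to the 23rd score from the lowest
Q3 = 30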
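
The measures of shape and location in this exercise can be recomputed from the raw scores with the Python sketch below. It assumes the positional rule that the worked answers appear to follow: the position is nk/parts counted from the lowest score, a fractional position is rounded up to the next score, and a whole-number position takes the mean of that score and the next one. Other textbooks and software interpolate differently, so results may not match other tools. The names `location` and `central_moment` are illustrative.

```python
import math
import statistics

scores = [36, 33, 33, 32, 32, 32, 31, 30, 30, 30,
          29, 29, 28, 27, 27, 26, 26, 26, 26, 25,
          25, 24, 23, 22, 22, 21, 19, 18, 17, 14]

ordered = sorted(scores)  # lowest to highest
n = len(ordered)
mean = sum(ordered) / n


def central_moment(order):
    """Average of the deviations from the mean raised to `order`."""
    return sum((x - mean) ** order for x in ordered) / n


# Pearson's second coefficient of skewness and the moment-based kurtosis.
skewness = 3 * (mean - statistics.median(ordered)) / statistics.pstdev(ordered)
kurtosis = central_moment(4) / central_moment(2) ** 2


def location(k, parts):
    """k-th percentile (parts=100), decile (parts=10) or quartile (parts=4).

    Assumes 0 < k <= parts and the rounding rule described above.
    """
    position = n * k / parts
    if position.is_integer():
        i = int(position)
        if i >= n:  # the top division is simply the highest score
            return ordered[-1]
        return (ordered[i - 1] + ordered[i]) / 2
    return ordered[math.ceil(position) - 1]


print(round(skewness, 2), round(kurtosis, 2))
print([location(k, 100) for k in (5, 15, 50, 88)])
print([location(k, 10) for k in (1, 5, 8)])
print([location(k, 4) for k in (1, 2, 3)])
```

Using one helper for percentiles, deciles, and quartiles makes it clear that the three measures differ only in how many equal parts the distribution is divided into.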


REFLECTION
Topic:
Summarizing and Interpreting Assessment Data:
Measures of Shape and Location

Measures of Shape
Measures of Location

In this chapter we discussed how scores are distributed. It shows the different
kinds of curves and their skewness based on the central tendency of the test
scores, and how we can show them in a graphical presentation. In this chapter
we also discussed how to measure the location of test scores in percentiles,
deciles, and quartiles. By finding these locations we can determine the ranking
of students in a class or in a group, and they also measure the performance of
a student.

CHAPTER 13:
SUMMARIZING AND
INTERPRETING
ASSESSMENT DATA: THE
STANDARD NORMAL
DISTRIBUTION AND
STANDARD SCORES

LEARNING EXERCISES:

1. A set of scores (n=100) was obtained from a cohort of students who
took a particular test. If the distribution is said to be bell-shaped and
was already transformed into a standard normal distribution, on the
average, how many scores fall between +1 and -1 z score?

In the Standard Normal Distribution, 34.13% of the scores fall between the mean
and 1 z score on each side. Multiplying 100 by 34.13% gives 34.13, and
multiplying that by two gives 68.26. This means that about 68 of the 100 scores
will fall between +1 and -1 z scores.

2. A score from a group of scores obtained through
administering a large-scale test of ICT competencies is found to be
equal to a z score of -2. Is the value above the mean or below the
mean? How many standard deviations away from the mean is the
value?
The value is below the mean because the z score is negative. It is two
standard deviations away from the mean, since a z score of -2 means the
value lies 2 standard deviations below the mean.

3. Which is relatively better: a score of 85 on a
programming test or a score of 45 in a mathematics test? Scores
on the programming test have a mean of 90 and a standard
deviation of 10. Scores on the mathematics test have a mean of 55
and a standard deviation of 50.

The score in the mathematics test is relatively better. The programming
score has z = (85 - 90)/10 = -0.5, while the mathematics score has
z = (45 - 55)/50 = -0.2. Both scores are below their means, but the
mathematics score is closer to its mean.

4. Three students take equivalent achievement tests.
Which is the highest relative score? Explain your answer.
a. A score of 144 on a test with a mean of 128 and a standard
deviation of 34.
b. A score of 90 on a test with a mean of 86 and a standard
deviation of 18.
c. A score of 18 on a test with a mean of 15 and a standard
deviation of 5.
The highest relative score is C. Student A has z = (144 - 128)/34 = 0.47,
student B has z = (90 - 86)/18 = 0.22, and student C has
z = (18 - 15)/5 = 0.60. Although A has the highest raw score, C is the
farthest above the mean in standard deviation units.
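
Comparing scores from different tests comes down to converting each one to a z score with z = (score - mean) / standard deviation. The Python sketch below does this for the two comparison exercises above. The helper name `z_score` and the dictionary labels are illustrative choices.

```python
def z_score(score, mean, sd):
    """Number of standard deviations a score lies above (+) or below (-) the mean."""
    return (score - mean) / sd


# Programming test versus mathematics test.
programming = z_score(85, 90, 10)
mathematics = z_score(45, 55, 50)
better = "mathematics" if mathematics > programming else "programming"

# The three equivalent achievement tests.
relative_scores = {
    "a": z_score(144, 128, 34),
    "b": z_score(90, 86, 18),
    "c": z_score(18, 15, 5),
}
highest = max(relative_scores, key=relative_scores.get)

print(programming, mathematics, better)
print({k: round(v, 2) for k, v in relative_scores.items()}, highest)
```

The raw scores are never compared directly; only the z scores are, which is what makes scores from tests with different means and spreads comparable.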

5. For a standard normal distribution, find the percentage of scores that
are:
a. Within 1 standard deviation of the mean
68.26% (34.13% on each side of the mean)
b. Within 1.96 standard deviations of the mean
95.00% (47.50% on each side of the mean)
c. Between the mean and -3 standard deviations
49.87%
d. Between the mean and +3 standard deviations
49.87%
e. Between 1 standard deviation below the mean and 2 standard
deviations above the mean
81.85%
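
The percentages above come from a table of areas under the normal curve, and they can be approximated without a table. The Python sketch below uses the standard library function `math.erf` to compute the cumulative area; the names `phi` and `area_between` are illustrative. Values may differ from table lookups in the last decimal place because tables round each half separately.

```python
import math


def phi(z):
    """Cumulative area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def area_between(z_low, z_high):
    """Proportion of scores between two z scores."""
    return phi(z_high) - phi(z_low)


within_1 = area_between(-1, 1)
within_196 = area_between(-1.96, 1.96)
mean_to_minus_3 = area_between(-3, 0)
mean_to_plus_3 = area_between(0, 3)
minus_1_to_plus_2 = area_between(-1, 2)

for label, area in [
    ("within 1 SD", within_1),
    ("within 1.96 SD", within_196),
    ("mean to -3 SD", mean_to_minus_3),
    ("mean to +3 SD", mean_to_plus_3),
    ("-1 SD to +2 SD", minus_1_to_plus_2),
]:
    print(label, round(100 * area, 2))

# Expected number of scores between -1 and +1 z for a cohort of 100.
print(round(100 * within_1))
```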

6. Using the z scores obtained from administering the periodical/
quarterly test you constructed, determine the following:
a. Equivalent Z-score: $z = (X - \bar{X})/S$
b. Equivalent T-score: $T = 10z + 50$

c. Scores that fall under each stanine

Stanine    Range of Percentile Ranks    Percentage of Cases
1          Below 4                      3%
2          4-10                         6%
3          11-22                        11%
4          23-39                        16%
5          40-59                        20%
6          60-76                        16%
7          77-88                        11%
8          89-95                        6%
9          96 or higher                 3%
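
A minimal Python sketch of the three conversions follows. It assumes the 30 pilot-test scores from Chapter 11, the usual definitions z = (X - mean)/S and T = 10z + 50 with S computed by dividing by N, and the percentile-rank ranges from the stanine table above. The function names and the `STANINE_LOWER_BOUNDS` list are illustrative.

```python
import bisect
import statistics

scores = [36, 33, 33, 32, 32, 32, 31, 30, 30, 30,
          29, 29, 28, 27, 27, 26, 26, 26, 26, 25,
          25, 24, 23, 22, 22, 21, 19, 18, 17, 14]

mean = sum(scores) / len(scores)
sd = statistics.pstdev(scores)  # standard deviation dividing by N


def z_score(x):
    """Distance of a score from the mean in standard deviation units."""
    return (x - mean) / sd


def t_score(x):
    """T-score: a z score rescaled to mean 50 and standard deviation 10."""
    return 10 * z_score(x) + 50


# Lowest percentile rank of stanines 2 to 9, read from the table above.
STANINE_LOWER_BOUNDS = [4, 11, 23, 40, 60, 77, 89, 96]


def stanine(percentile_rank):
    """Stanine (1 to 9) for a given percentile rank."""
    return bisect.bisect_right(STANINE_LOWER_BOUNDS, percentile_rank) + 1


for x in sorted(set(scores), reverse=True):
    print(x, round(z_score(x), 2), round(t_score(x), 2))
```

A score's stanine can then be read off by passing its percentile rank to `stanine`, for example `stanine(50)` for a score at the 50th percentile rank.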


REFLECTION
Topic:
Summarizing and Interpreting Assessment Data: The Standard
Normal Distribution and Standard Scores

The Standard Normal Distribution


Standard Scores
Standard Scores vis-à-vis Test Performance Interpretation

This chapter discussed the Standard Normal Distribution, meaning the teacher
transforms the scores by setting the mean to 0 and the standard deviation to 1. In this
chapter we also learned how to get the z-score, the T-score, and the stanine, which
give the equivalent of a score relative to the average performance of the students.
In this chapter we also discussed that test performance can be assessed in two ways,
norm-referenced and criterion-referenced, which were also discussed in a previous topic.

CHAPTER 14:
GRADING, MARKING,
REPORTING SYSTEMS


LEARNING EXERCISES:

1. Four principles of grading were discussed in this chapter. Discuss the
implication of not adhering to these principles.
The principles of grading serve as guidelines for teachers and for those who aspire to
have their own school. That is why not adhering to these principles may cause conflict,
not just with students but also with teachers, parents, and other school officials, when it
comes to the average performance of the student. This is especially true of principle 3: if
the grading system of a teacher is personal and (s)he changes a grade because (s)he
does not like the student, that is a ground for bullying. So, as future teachers, we always
have to remember these principles.

2. The assumption underlying the criterion-referenced grading system is that
there is an absolute quantity of whatever is being measured and the grade
reflects how much of that quantity each student has. Do you agree with
this view? Why or why not? Explain your answer.

I agree, because a student's grade or average should be based on what (s)he has
learned. To be fair to others, it should be based on the assessments, performance
tasks, and behavior.

3. If a school is claiming to implement a criterion-referenced grading system
but standards are not clearly defined, what possible effects would it have
on grading practices as well as the grades themselves? Explain your answer.

The teachers, instructors, or mentors may be confused about how to give a grade. It
may lead to different grading standards: because no definite standard is implemented,
each teacher has to make their own standard in grading, which is unfair.

4. Compare and contrast the present (DepEd Order No. 8, s. 2015) and the
previous grading system (DepEd Order No. 73, s. 2012) implemented for the
K to 12 Basic Education Program.
In the 2012 grading system, there were only 9 subjects and it focused more on
cognitive targets. In 2015, with the K to 12 curriculum already in place, the grading
system became higher in standard because our education today is more
internationally based.

5. Describe the Grading, Marking and Reporting System in the K to 12
curriculum. Focus on the following guide questions.
a. Does it utilize criterion-referenced or norm-referenced grading?
Explain.
b. Does it utilize averaging or cumulative grading?
c. What kind of marking system does it use?
d. How is performance reported to the parents, students and other
stakeholders?

In K to 12, the grading system became higher in standard because the curriculum
is internationally based. The Grading, Marking and Reporting system is based on both
norm and criterion, because students nowadays, especially those in the K to 12
program, should be exposed to all kinds of learning. The grading system is also based
on performance, or outcome-based performance. The K to 12 curriculum uses a
letter-grade system. The performance of the student is reported through the report
card, where the student, parent, and teacher can give feedback to each other.

REFLECTION
Topic:
Grading, Marking and Reporting Systems

Grading Systems
Marking Systems
Method of Reporting
Overview of Classroom Assessment and Grading System for K to 12 Basic
Education Program: Grades 1 to 12

This chapter discussed the grading system in the K to 12 curriculum. We discussed
the principles of grading that serve as guidelines for teachers, since there should be
proper criteria. Grading is either norm-referenced or criterion-referenced. There are also
different grading systems, like the pass-fail system, the numerical system, and lastly the
system we usually use in the K to 12 curriculum.