
The reliability of a measure indicates the extent to which it is free of bias (error free) and hence ensures consistent measurement across time and across the various items in the instrument. In other words, the reliability of a measure is an indication of the stability and consistency with which the instrument measures the concept, and it helps to assess the goodness of a measure. Reliability is examined in two main ways: the stability of measures and the internal consistency of measures. Two tests of stability are test-retest reliability and parallel-form reliability.

The reliability coefficient obtained by repeating the same measure on a second occasion is called test-retest reliability. We can estimate it by administering the same test to the same sample on two different occasions. This approach assumes that there is no substantial change in the construct being measured between the two occasions. The amount of time allowed between measures is critical: if we measure the same thing twice, the correlation between the two observations will depend in part on how much time elapses between the two measurement occasions. The shorter the time gap, the higher the correlation; the longer the time gap, the lower the correlation. Reliability can also vary with the many factors that affect how a person responds to the test, including mood, interruptions, time of day, and so on. A good test largely copes with such factors and shows relatively little variation, whereas an unreliable test is highly sensitive to them and gives widely varying results even if the person retakes the same test half an hour later. A further problem with test-retest is that people may have learned from the first administration, so the second test is likely to give different results. This method is particularly used in experiments with a no-treatment control group that is measured at pre-test and post-test.

In parallel-forms reliability you first have to create two parallel forms. One way to accomplish this is to create a large set of questions that address the same construct and then randomly divide the questions into two sets. You administer both instruments to the same sample of people, and the correlation between the two parallel forms is the estimate of reliability. One major problem with this approach is that you have to be able to generate many items that reflect the same construct, which is often no easy feat. Furthermore, the approach assumes that the randomly divided halves are parallel, or equivalent, and even by chance this will sometimes not be the case.
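As a minimal sketch of how these stability estimates are typically computed (not taken from the text; the score arrays and item pool below are hypothetical, and NumPy is assumed), the test-retest coefficient is just the correlation between two administrations, and a parallel-forms estimate correlates the totals of two randomly split item sets given to the same respondents:

```python
import numpy as np

# Test-retest: hypothetical total scores for the same 8 respondents
# on two occasions; the reliability estimate is their correlation.
time1 = np.array([12, 15, 11, 18, 14, 16, 13, 17], dtype=float)
time2 = np.array([13, 14, 12, 19, 15, 15, 13, 18], dtype=float)
test_retest_r = np.corrcoef(time1, time2)[0, 1]
print(f"Test-retest reliability: {test_retest_r:.2f}")

# Parallel forms: simulate 10 items driven by one construct, randomly
# split them into two 5-item forms, and correlate the form totals.
rng = np.random.default_rng(0)
true_score = rng.normal(0, 1, size=8)                           # latent construct
items = true_score[:, None] + rng.normal(0, 0.5, size=(8, 10))  # 10 noisy items
order = rng.permutation(items.shape[1])                         # random split
form_a = items[:, order[:5]].sum(axis=1)
form_b = items[:, order[5:]].sum(axis=1)
parallel_forms_r = np.corrcoef(form_a, form_b)[0, 1]
print(f"Parallel-forms reliability: {parallel_forms_r:.2f}")
```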

When multiple people give assessments of some kind, or are the subjects of some test, similar people should receive similar scores. This idea can be used to calibrate people, for example those acting as observers in an experiment. Inter-rater reliability thus evaluates reliability across different people. Two major ways in which inter-rater reliability is used are (a) testing how similarly people categorize items and (b) testing how similarly people score items. It is the best way of assessing reliability when you are using observation, as observer bias very easily creeps in; it does, however, assume you have multiple observers, which is not always the case. Inter-rater reliability is also known as inter-observer reliability or inter-coder reliability.

In internal consistency measurement, a single measurement instrument is administered to a group of people on one occasion to estimate reliability. In effect, we judge the reliability of the instrument by estimating how well the items that reflect the same construct yield similar results; that is, we look at how consistent the results are across the different items measuring the same construct within the measure. Inter-item consistency reliability can be assessed in two ways: the average inter-item correlation, which compares the correlations between all pairs of questions that test the same construct by calculating the mean of all paired correlations, and the average item-total correlation, in which a total score is computed across the items, each item is correlated with that total, and the resulting correlations are averaged. The most popular test of inter-item consistency reliability is Cronbach's coefficient alpha, which is used for multipoint-scaled items.

Split-half reliability reflects the correlation between two halves of an instrument. The estimate will vary depending on how the items in the measure are split into two halves. Split-half reliabilities may be higher than Cronbach's alpha only when the measure taps more than one underlying response dimension and certain other conditions are met as well.
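These internal consistency statistics can be computed directly from an item-response matrix. The sketch below assumes a hypothetical matrix of 8 respondents by 6 multipoint-scaled items; the split-half estimate also applies the Spearman-Brown step-up correction, a common adjustment that the text does not discuss:

```python
import numpy as np

# Hypothetical responses: 8 respondents x 6 items on a 5-point scale.
responses = np.array([
    [4, 5, 4, 5, 4, 5],
    [3, 3, 4, 3, 3, 4],
    [5, 5, 5, 4, 5, 5],
    [2, 3, 2, 2, 3, 2],
    [4, 4, 3, 4, 4, 4],
    [3, 2, 3, 3, 2, 3],
    [5, 4, 5, 5, 4, 4],
    [2, 2, 2, 3, 2, 2],
], dtype=float)

def average_interitem_correlation(x):
    """Mean of the off-diagonal entries of the item correlation matrix."""
    corr = np.corrcoef(x, rowvar=False)
    k = corr.shape[0]
    return (corr.sum() - k) / (k * (k - 1))

def cronbach_alpha(x):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = x.shape[1]
    item_vars = x.var(axis=0, ddof=1)
    total_var = x.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def split_half(x):
    """Correlate odd- and even-item totals, then apply Spearman-Brown."""
    odd = x[:, 0::2].sum(axis=1)
    even = x[:, 1::2].sum(axis=1)
    r = np.corrcoef(odd, even)[0, 1]
    return 2 * r / (1 + r)

print(f"Average inter-item correlation: {average_interitem_correlation(responses):.2f}")
print(f"Cronbach's alpha: {cronbach_alpha(responses):.2f}")
print(f"Split-half (Spearman-Brown): {split_half(responses):.2f}")
```

Note that an odd/even split is only one of many possible splits, which is why, as stated above, split-half estimates vary with how the items are divided.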
