Criterion validity
Criterion validity evidence involves the correlation between the test and a criterion
variable (or variables) taken as representative of the construct. In other words, it
compares the test with other measures or outcomes (the criteria) already held to be valid.
For example, employee selection tests are often validated against measures of job
performance (the criterion), and IQ tests are often validated against measures of academic
performance (the criterion).
If the test data and criterion data are collected at the same time, this is referred to as
concurrent validity evidence. If the test data is collected first in order to predict criterion
data collected at a later point in time, then this is referred to as predictive validity
evidence.
Predictive validity
From Wikipedia, the free encyclopedia
In psychometrics, predictive validity is the extent to which a score on a scale or test
predicts scores on some criterion measure.[1]
For example, the validity of a cognitive test for job performance is the correlation
between test scores and, for example, supervisor performance ratings. Such a cognitive
test would have predictive validity if the observed correlation were statistically
significant.
Predictive validity shares similarities with concurrent validity in that both are generally
measured as correlations between a test and some criterion measure. In a study of
concurrent validity the test is administered at the same time as the criterion is collected.
This is a common method of developing validity evidence for employment tests: A test is
administered to incumbent employees, then a rating of those employees' job performance
is obtained (often, as noted above, in the form of a supervisor rating). Note the possibility
for restriction of range both in test scores and performance scores: The incumbent
employees are likely to be a more homogeneous and higher performing group than the
applicant pool at large.
In a study of predictive validity, the test scores are collected first; then at some later time
the criterion measure is collected. Here the example is slightly different: Tests are
administered, perhaps to job applicants, and then after those individuals work in the job
for a year, their test scores are correlated with their first year job performance scores.
Another relevant example is SAT scores: these are validated by collecting the scores
during the examinee's senior year of high school and then waiting a year (or more) to
correlate the scores with their first-year college grade point average. Thus predictive
validity provides somewhat more useful data about test validity because it has greater
fidelity to the real situation in which the test will be used. After all, most tests are
administered to find out something about future behavior.
As with many aspects of social science, the magnitude of the correlations obtained from
predictive validity studies is usually not high. A typical predictive validity study for an
employment test might obtain a correlation in the neighborhood of r = .35. Higher values
are occasionally seen, and lower values are very common. Nonetheless, the utility (that is,
the benefit obtained by making decisions using the test) provided by a test with a
correlation of .35 can be quite substantial.
Content validity
A test has content validity built into it by careful selection of which items to include
(Anastasi & Urbina, 1997). Items are chosen so that they comply with the test
specification which is drawn up through a thorough examination of the subject domain.
Foxcroft et al. (2004, p. 49) note that the content validity of a test can be improved by
using a panel of experts to review the test specification and the selection of items.
The experts will be able to review the items and comment on whether the items cover a
representative sample of the behaviour domain.
Representation validity, also known as translation validity, is about the extent to which an
abstract theoretical construct can be turned into a specific practical test.
Face validity is very closely related to content validity. While content validity depends on
a theoretical basis for judging whether a test assesses all domains of a certain criterion
(e.g., does assessing addition skills yield a good measure of mathematical skill? To answer
this, you have to know what different kinds of arithmetic skills mathematical skills
include), face validity relates only to whether a test appears to be a good measure.
This judgment is made on the "face" of the test; thus it can also be made by a layperson.
Face validity is a starting point, but it should never be taken as proof that a test is valid
for any given purpose, because experts have been wrong before: the Malleus Maleficarum
(Hammer of Witches) had no support for its conclusions other than the self-imagined
competence of two "experts" in "witchcraft detection," yet it was used as a "test" to
condemn and burn at the stake perhaps 100,000 women as "witches."
http://davidmlane.com/hyperstat/A34739.html
Pearson's Correlation (1 of 3)
The correlation between two variables reflects the degree to which the variables are
related. The most common measure of correlation is the Pearson Product Moment
Correlation (called Pearson's correlation for short). When measured in a population the
Pearson Product Moment correlation is designated by the Greek letter rho (ρ). When
computed in a sample, it is designated by the letter "r" and is sometimes called "Pearson's
r." Pearson's correlation reflects the degree of linear relationship between two variables.
It ranges from +1 to -1. A correlation of +1 means that there is a perfect positive linear
relationship between variables. The scatterplot shown on this page depicts such a
relationship. It is a positive relationship because high scores on the X-axis are associated
with high scores on the Y-axis.
Pearson's Correlation (2 of 3)
A correlation of -1 means that there is a perfect negative linear relationship between
variables. The scatterplot shown below depicts a negative relationship. It is a negative
relationship because high scores on the X-axis are associated with low scores on the Y-
axis.
A correlation of 0 means there is no linear relationship between the two variables. The
second graph shows a Pearson correlation of 0.
Correlations in real data are rarely if ever exactly 0, 1, or -1. Some real data showing a
moderately high correlation are shown in the next section.
Pearson's Correlation (3 of 3)
The scatterplot below shows arm strength as a function of grip strength for 147 people
working in physically demanding jobs. The plot reveals a strong positive relationship:
the value of Pearson's correlation is 0.63.