
Chapter 10

Measurement of Variables
Bibliography
Research Methods for Business (Uma Sekaran)
VU Book of BRM
Internet

Resource Person: Furqan-ul-haq Siddiqui


In everyday usage, measurement occurs when an established
yardstick verifies the height, weight, or another feature of a
physical object.
In literal meaning, measurement is to discover the extent,
dimensions, quantity, or capacity of something, especially by
comparison with a standard.
Measurement
Is the process of assigning numbers or labels to
objects, persons, states of nature, or events.
It is done according to a set of rules that reflect the qualities
or quantities of what is being measured.

Measurement in Research
Researchers often attempt to measure the
extent or quantity of variables.
They use an existing yardstick or standard, or
develop their own.
How to Measure Variables?
Objective data
E.g. weight, absenteeism, temperature
E.g., you are studying people who attend an auto show where the
year's new models are on display, and you are interested in
learning the male-to-female ratio among attendees.
Use appropriate measuring instruments
Subjective data
E.g. feelings, attitudes, perceptions, motivation,
ability to withstand stress, problem-solving ability,
and persuasiveness.
Conceptualize & operationalize the
concept
a. Conceptualization
The process of identifying and clarifying concepts,
through which we specify what we mean by certain
terms. It is the process of taking a construct and
refining it by giving it a conceptual or theoretical
definition.
We want to speak of abstract things: intelligence, ability to
cope with stress, life satisfaction, happiness.
We cannot research these things until we know exactly what they
are.
Everyday language often carries vague and unspecified
meanings. Conceptualization specifies exactly what
we mean, and don't mean, by the terms we use in our
research.
Steps in Reaching a Measurement of a Variable
We have conceptions of what we understand by e.g.
compassion, prejudice, poverty, etc. Often we differ
somewhat in what we understand by these terms.
Begin by asking people to describe what they mean when
they use certain terms, e.g. intelligence, for a LAY
understanding.
Consult the EXPERTS, and this is why the literature in a
field is so important. But even the experts do not agree on a
single definition.
Coming to an agreement on what we understand is called
conceptualization.
The result of this process is a construct, e.g. prejudice.
(Sometimes people use the term concept in place of
construct.)
Teacher Morale!
What is morale?
Is it a variable?
Develop a conceptual definition.
Look at the everyday understanding of morale: how do
people feel about things?
Look in the dictionary: confidence, spirit, zeal,
mental condition toward something.
Look into the review of literature.
Morale involves a feeling toward something else; a
person has morale with regard to something.
Some things toward which teachers have feelings could be:
students, parents, pay, the school administration,
other teachers, the profession of teaching.
Dimensions of construct
Are there several kinds of teacher morale, or are all
these "some things" different aspects of one construct
(morale)?
We have to decide whether morale means a single,
general feeling with different parts or dimensions,
or several distinct feelings.
What unit of analysis does our construct apply to: a
group or an individual? Is morale a characteristic of
an individual, of a group, or of both?
Also, who is a teacher?
b. Operationalization
Specifying exactly what we are going to observe,
and how we will do it; turning your variable into a
directly measurable thing.
Linking conceptual definition to a specific set of
measurement procedures.
Specifies what the researcher must do to measure
the concept under investigation
What specific activities to be undertaken for
measuring the concept?
Look at the behavioral dimensions, translate them into
observable elements, ask questions, and develop an
index of measurement.
Operational Definition:
Dimensions and Elements
Let us operationalize Job Satisfaction
First define it conceptually. Like:
Employees' feelings toward their job.
Degree of satisfaction that individuals obtain from
various roles they play in an organization.
A pleasurable or positive emotional feeling resulting
from the appraisal of one's job or job experience.
Employees' perception of how well the job provides
those things (some things) that are important. These
things are the dimensions of job satisfaction.
Dimensions of job satisfaction
Workers look for many things in a job; each such
thing may be taken as a dimension.
Things that are important for employees: (Give
rationale for each)
The work itself.
Pay/fringe benefits.
Promotion opportunities.
Supervision.
Coworkers.
Working conditions.
Elements of dimensions
Breaking each dimension further into actual patterns
of behavior that would be exhibited
Work itself: opportunities for advancement, sense of
accomplishment, challenging vs. routine work.
Pay/fringe benefits: pay according to qualifications,
comparison with other organizations, increments,
availability of bonuses, old-age benefits, insurance
benefits, other allowances.
Elements (cont.)
Promotion opportunities: mobility policy, equitability,
dead-end job.
Supervision: employee-centered supervision, employee
participation in decision making.
Coworkers: primary group relations, supportive
attitude, cohesiveness.
Working conditions: lighting, temperature, cleanliness,
building security, hygienic conditions, utilities.
From elements to
questions/statements
On each element, ask question(s) or make statements.
The responses can be put on a scale ranging from high
satisfaction to low satisfaction.
Example:
STATEMENTS
Each statement is rated on a five-point scale: Strongly Agree, Agree, Undecided, Disagree, Strongly Disagree.
1. I have a good opportunity for advancement in my job.
2. I feel very comfortable with my co-workers.
3. My pay is adequate to meet my necessary expenses.
4. My work gives me a sense of accomplishment.
5. My boss is impolite and cold.
6. My job is a dead-end job.
7. The company of my co-workers is boring.
8. Pay at my level is less as compared to other organizations.
9. Most of the time I am frustrated with my work.
10. My boss praises good work and is supportive.
11. There is a chance of frequent promotions in my job.
12. My co-workers are a source of inspiration for me.
13. I receive reasonable annual increments.
14. My work is very challenging to me.
15. My boss is adept in his work.
16. We have an unfair promotion policy in our organization.
17. The working style of my co-workers is different from mine.
18. The old-age benefits are quite adequate.
19. Most of the time I do routine work.
20. My boss does not delegate powers.
21. Opportunity for promotion is somewhat limited here.
22. My co-workers try to take credit for my work.
23. My pay is commensurate with my qualification.
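The responses to such statements can be combined into a single job-satisfaction index. Below is a minimal Python sketch of one common approach, assuming the five response categories are coded 5 (Strongly Agree) down to 1 (Strongly Disagree) and that negatively worded statements (e.g., "My boss is impolite and cold", "My job is a dead-end job") are reverse-scored so that a high score always means high satisfaction; the data and item indices are purely illustrative.

```python
# Minimal sketch: scoring 5-point Likert responses into a satisfaction index.
import numpy as np

def score_responses(responses, reverse_items):
    """responses: 2-D array (respondents x items) with codes 1..5.
    reverse_items: 0-based indices of negatively worded items."""
    scored = np.asarray(responses, dtype=float).copy()
    scored[:, reverse_items] = 6 - scored[:, reverse_items]  # maps 5<->1, 4<->2
    return scored.mean(axis=1)  # one index per respondent, from 1 (low) to 5 (high)

# Three hypothetical respondents answering four items; item index 2 is negative.
data = [[5, 4, 2, 5],
        [3, 3, 3, 3],
        [1, 2, 5, 1]]
print(score_responses(data, reverse_items=[2]))  # -> [4.5, 3.0, 1.25]
```

Averaging (rather than summing) keeps the index on the same 1-5 metric as the individual items, which makes it easier to interpret.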
Home Work: (Uma Sekaran)
Operationalizing the Concept of LEARNING

Measurement Scales
Measurement scales are used to measure different
variables.
These scales determine which statistical techniques
are appropriate for analyzing your data, and knowing
the level of measurement helps you decide how to
interpret the data from that variable.
Four types/levels of scales are used in research,
each with specific applications and properties:
nominal, ordinal, interval, and ratio.
Nominal Scale (also denoted as categorical)
Nominal scales are used to classify objects, individuals,
groups, or even phenomena.
A nominal scale is always used for obtaining categorical data,
such as gender or the department in which one works, where
grouping of individuals or objects is useful. E.g.:
1. Your gender: ___ Male ___ Female
2. Your department: ___ Production ___ Sales ___ Accounting ___ Finance ___ Personnel ___ R & D ___ Other (specify)
Nominal scale categories are mutually exclusive, meaning
that each item being classified fits into one and only one
category.
These scales are also collectively exhaustive,
meaning that every element being classified can fit
into the scale.

Permitted statistics: frequencies (percentages and counts),
modes, and chi-square tests; neither the mean nor the median
can be defined.
Nominal scales are the least powerful of the four scales.
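To make the "permitted statistics" concrete, here is a short sketch, with made-up counts, of a chi-square test of independence between two nominal variables (gender and department), using scipy's chi2_contingency:

```python
# Sketch: chi-square test on nominal data (hypothetical counts).
import numpy as np
from scipy.stats import chi2_contingency

# Rows: male, female. Columns: production, sales, accounting.
counts = np.array([[30, 10, 20],
                   [15, 25, 20]])
chi2, p, dof, expected = chi2_contingency(counts)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")  # tests whether gender and department are independent
```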
Ordinal Scale
Ordinal scales include the characteristics of the nominal
scale plus an indicator of order (ranking).
This type of scale can provide information about some item
having more or less of an attribute than others, but no
information on how much more or less.
The use of an ordinal scale implies a statement of "greater
than" or "less than" without stating how much greater or
less. Other descriptors can be "superior to", "happier than",
"poorer than", or "above".
An ordinal scale measures only order; it does not indicate
the objective distance between any two scale positions, only
their relative standing.
Permitted statistics: frequencies, median, mode,
rank-order correlation, non-parametric analysis of
variance; the mean cannot be defined.
Modelling techniques can also be used with ordinal
data.
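For instance, a rank-order (Spearman) correlation on ordinal data could look like the sketch below; the ranks are hypothetical.

```python
# Sketch: Spearman rank-order correlation on ordinal data.
from scipy.stats import spearmanr

judge_a = [1, 2, 3, 4, 5, 6]  # ranks given to six candidates by judge A
judge_b = [2, 1, 3, 5, 4, 6]  # ranks given by judge B
rho, p = spearmanr(judge_a, judge_b)
print(f"rho = {rho:.2f}")  # measures agreement in ordering, not in distances
```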

Interval Scale
Interval scales have the power of nominal and
ordinal scales plus one additional strength:
magnitude of ranking. Interval scales have
equal distances between the points of a scale.
However, an interval scale does not have to
have a true zero point, even if one of the scaled
values happens to carry the name "zero".
Good examples of interval scales are the
Fahrenheit and Celsius temperature scales: a
temperature of "zero" does not mean that there
is no temperature; it is just an arbitrary zero
point.
An interval scale not only groups individuals into certain
categories and taps the order of these groups, but also captures
the magnitude of the differences among individuals. Further
examples are calendar years: the intervals are equal, but the zero
point is arbitrary. (Weight, length, and counts such as the number
of children in a family have true zeros, where zero really means
none, and therefore belong to the ratio scale discussed below.)
Permitted statistics: mean, median, mode, standard deviation,
correlation (r), regression, analysis of variance, factor analysis,
plus a whole range of advanced multivariate and modelling techniques.
We cannot calculate ratios from interval scales.
For example, the elapsed time between 3 and 6 A.M.
equals the time between 4 and 7 A.M. One cannot say,
however, that 6 A.M. is twice as late as 3 A.M.,
because zero time is an arbitrary origin.
In the consumer price index, if the base year is
1983, the price level during 1983 will be set
arbitrarily as 100. Although this is an equal
interval measurement scale, the zero point is
arbitrary.
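A quick numeric check makes the point: the ratio between two temperatures changes when the same temperatures are expressed on a different interval scale, so the ratio itself carries no meaning.

```python
# Sketch: ratios are not preserved on an interval scale.
def c_to_f(c):
    return c * 9 / 5 + 32  # Celsius to Fahrenheit

print(20 / 10)                  # 2.0  -> "twice as hot" in Celsius?
print(c_to_f(20) / c_to_f(10))  # 68 / 50 = 1.36 -> not twice in Fahrenheit
```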

Ratio Scale
The most comprehensive scale having all of the
characteristics of the other three with the additional benefit
of an absolute, meaningful zero point, e.g. weight, sales
volume, income, area, etc.
Ratio scales are usually used in organization research when
exact numbers on objective (as opposed to subjective)
factors are called for, as in the following question:
How many other organizations did you work for before joining this
system?
Please indicate the number of children you have in each of the
following categories:
---- below 3 years
---- between 3 and 6
---- over 6 years but under 12
---- 12 years and over
How many retail outlets do you operate?
Ratio scales are similar to interval scales in that they allow
you to compare differences between numbers. For example, if
you measured the time it takes 3 people to run a race, their times
may be 10 seconds (Racer A), 15 seconds (Racer B), and 20
seconds (Racer C). You can say with accuracy that it took Racer
C twice as long as Racer A. Unlike the interval scale, the ratio
scale has a true zero value.
All statistics permitted for interval scales apply, plus the
following: geometric mean, harmonic mean, coefficient of
variation, logarithms.
The best way to contrast interval and ratio scales
is to look at temperature. The Centigrade scale
has a zero point, but it is an arbitrary one; the
Fahrenheit scale has its equivalent point at 32°.
(Physicists would argue that absolute zero is the
zero point for temperature, but for everyday scales
this is a theoretical concept.) So, even though
temperature looks as if it would be a ratio scale,
it is an interval scale: we cannot talk about "no
temperature", and this would be needed if it were
a ratio scale.
Criteria of Good Measurement
After Conceptualization & Operationalization, it is
important to make sure that the developed
instrument to measure a particular concept is indeed
accurately measuring the variable.
A good measurement ensures that no dimension,
element, or question is missing, and that nothing
irrelevant is included.
Characteristics of a good measurement: Validity,
Reliability, and Sensitivity.
Reliability
Reliability is the consistency of your measurement,
or the degree to which an instrument measures the
same way each time it is used under the same
condition with the same subjects. (error free)
A reliable car is one that starts every time we need it.
If you weigh five pounds of potatoes in the morning, and
the scale is reliable, the same scale should register five
pounds for the potatoes an hour later.
Forms of Reliability
Test-retest Reliability: Test-retest method of
determining reliability involves administering the
same test to the same respondents at two separate
times. If the results are the same across the two
administrations, the instrument is said to have
test-retest reliability.
E.g., use an instrument for measuring job satisfaction at
time T1: 64% satisfied. Repeat after 4 weeks: same result.
Hence stability.
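Numerically, test-retest reliability is usually summarized as the correlation between the two administrations; a minimal sketch with hypothetical scores:

```python
# Sketch: test-retest reliability as a correlation across two administrations.
from scipy.stats import pearsonr

time1 = [64, 58, 72, 45, 80, 66]  # satisfaction scores at T1
time2 = [62, 60, 70, 48, 78, 65]  # same respondents, four weeks later
r, p = pearsonr(time1, time2)
print(f"test-retest r = {r:.2f}")  # a high r suggests a stable measure
```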
Two problems with test-retest
It is a longitudinal approach, so:
i. The first administration may sensitize the respondents.
ii. Time may change the attitude; the subjects may also
mature.
Hence the results may not show a high correlation, due to
the time factor rather than a lack of reliability.

Equivalent/Parallel Form Reliability

This approach attempts to overcome some of the
problems associated with the test-retest
measurement of reliability.
Two questionnaires, designed to measure the
same thing, are administered to the same group.
Both questionnaires have similar items and the same
response format, the only changes being the
wording and the order or sequence of the
questions.
Split-Half Reliability
In split-half reliability, we randomly divide all items that
purport to measure the same construct into two sets; the
correlation between the scores on the two halves indicates
reliability (see the sketch below).
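A minimal sketch of the procedure on simulated data, including the standard Spearman-Brown correction that projects the half-test correlation up to the full test length (the data-generating assumptions are purely illustrative):

```python
# Sketch: split-half reliability with the Spearman-Brown correction.
import numpy as np

rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 1))                      # true trait level, 50 respondents
items = latent + rng.normal(scale=1.0, size=(50, 10))  # 10 noisy items driven by the trait

order = rng.permutation(items.shape[1])                # random split into two halves
half_a = items[:, order[:5]].sum(axis=1)
half_b = items[:, order[5:]].sum(axis=1)

r_half = np.corrcoef(half_a, half_b)[0, 1]
r_full = 2 * r_half / (1 + r_half)                     # Spearman-Brown step-up formula
print(f"half-test r = {r_half:.2f}, full-test estimate = {r_full:.2f}")
```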

Inter-Rater or Inter-Observer
Reliability
Used to assess the degree to which different
raters/observers give consistent estimates of
the same phenomenon.
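One common statistic for this is Cohen's kappa, which corrects the raw percentage of agreement for the agreement expected by chance. A from-scratch sketch with hypothetical ratings:

```python
# Sketch: inter-rater reliability via Cohen's kappa (two raters, categorical ratings).
import numpy as np

def cohen_kappa(r1, r2):
    r1, r2 = np.asarray(r1), np.asarray(r2)
    categories = np.union1d(r1, r2)
    p_observed = np.mean(r1 == r2)                     # raw agreement
    p_chance = sum(np.mean(r1 == c) * np.mean(r2 == c) for c in categories)
    return (p_observed - p_chance) / (1 - p_chance)

rater1 = [1, 1, 2, 2, 3, 3, 1, 2]  # hypothetical category assignments
rater2 = [1, 1, 2, 3, 3, 3, 1, 2]
print(f"kappa = {cohen_kappa(rater1, rater2):.2f}")  # 1 = perfect, 0 = chance-level
```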

Internal Consistency Reliability
This form of reliability is used to judge the consistency
of results across items on the same test.
When asking questions in research, the purpose is to
assess the response against a given construct or idea.
Different questions that test the same construct should
give consistent results. When you see a question that
seems very similar to another test question, it may
indicate that the two questions are being used to gauge
reliability. Because the two questions are similar and
designed to measure the same thing, the test taker should
answer both questions the same way, which would indicate
that the test has internal consistency.
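Internal consistency is most often quantified with Cronbach's alpha. The sketch below computes it directly from its definition, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score), on simulated data; the 0.7 threshold mentioned in the comment is a common rule of thumb, not a law.

```python
# Sketch: Cronbach's alpha computed from scratch (respondents x items).
import numpy as np

def cronbach_alpha(items):
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of the item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of the total score
    return k / (k - 1) * (1 - item_vars / total_var)

rng = np.random.default_rng(1)
latent = rng.normal(size=(100, 1))                     # one underlying construct
items = latent + rng.normal(scale=0.8, size=(100, 5))  # 5 items tapping it
print(f"alpha = {cronbach_alpha(items):.2f}")  # values >= 0.7 are often deemed acceptable
```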
Warning!
Reliability is a necessary but not sufficient condition
for testing the goodness of a measure.
A measure could be highly stable and consistent,
but may not be valid.
Validity ensures the ability of the instrument to
measure the intended concept.
A reliable but invalid instrument will yield
consistently inaccurate results.

Validity
The ability of a scale to measure what was intended
to be measured. Addresses the issue of whether what
we tried to measure was actually measured.
Validity refers to the degree to which a study
accurately reflects or assesses the specific concept
that the researcher is attempting to measure. While
reliability is concerned with the consistency of the
measuring instrument or procedure, validity is
concerned with the study's success at measuring
what the researchers set out to measure.

http://www.socialresearchmethods.net/kb/introval.php
Reliability and Validity on Target

[Figure: three marksmen firing at targets. Target A (old rifle): low reliability. Target B (new rifle): high reliability. Target C (new rifle, sun glare): reliable but not valid.]
Which one is reliable and valid?
Four situations are possible. In the first one, you are hitting the
target consistently, but you are missing the center of the target. That is, you are
consistently and systematically measuring the wrong value for all respondents. This
measure is reliable, but not valid (that is, it's consistent but wrong). The second
shows hits that are randomly spread across the target. You seldom hit the center of
the target but, on average, you are getting the right answer for the group (but not
very well for individuals). In this case, you get a valid group estimate, but you are
inconsistent. Here, you can clearly see that reliability is directly related to the
variability of your measure. The third scenario shows a case where your hits are
spread across the target and you are consistently missing the center. Your measure in
this case is neither reliable nor valid. Finally, we see the "Robin Hood" scenario --
you consistently hit the center of the target. Your measure is both reliable and valid
(I bet you never thought of Robin Hood in those terms before).
http://www.socialresearchmethods.net/kb/relandval.php
Forms of Validity
Content validity refers to the extent to which the content of a
measurement instrument represents the entire body of content
to be measured.
E.g. : Do the questions on an exam accurately reflect what you have
learned in the course, or were the exam questions sampled from only a
sub-section of the material? A test to measure your knowledge of
mathematics should not be limited to addition problems, nor should it
include questions about French literature.
Face validity is considered as a basic and very minimum index of
content validity. It is the validity of a test at face value. A test can be
said to have face validity if it "looks like" it is going to measure what
it is supposed to measure. For instance, if you prepare a test to
measure whether students can perform multiplication, and the people
you show it to all agree that it looks like a good test of multiplication
ability, you have shown the face validity of your test.
Criterion-related validity
Also referred to as instrumental validity, this is used to demonstrate
the accuracy of a measure or procedure by comparing it with
another measure or procedure that has already been demonstrated to
be valid. There are two subtypes of this kind of validity.
Concurrent validity: To have concurrent validity, an indicator
must be associated with a preexisting indicator that is judged to
be valid. For example, suppose we create a new test to measure
intelligence. For it to be concurrently valid, it should be highly
associated with existing IQ tests (assuming the same definition
of intelligence is used).
Predictive validity: criterion validity whereby an indicator
predicts future events that are logically related to the
construct. Examples of tests with predictive validity are
career or aptitude tests, which are helpful in determining
who is likely to succeed or fail in certain subjects or
occupations.
Construct Validity
Construct validity seeks agreement between a theoretical
concept and a specific measuring device or procedure. For
example, a researcher inventing a new IQ test might spend a
great deal of time attempting to "define" intelligence in
order to reach an acceptable level of construct validity.
Construct validity can be broken down into two sub-
categories: convergent validity and discriminant validity.
Convergent validity is the actual general agreement among
ratings, gathered independently of one another, where
measures should be theoretically related. Discriminant
validity is the lack of a relationship among measures which
theoretically should not be related.
Sensitivity
The sensitivity of a scale is an important measurement
concept. Sensitivity refers to an instrument's ability to
accurately measure variability in stimuli or responses.
E.g., a dichotomous response category, such as "agree" or "disagree",
does not allow the recording of subtle attitude changes. A more
sensitive measure, with numerous items on the scale, may be needed.
For example, adding "strongly agree", "mildly agree", "neither agree
nor disagree", "mildly disagree", and "strongly disagree" as categories
increases a scale's sensitivity.
The sensitivity of a scale based on a single question or single item can
also be increased by adding additional questions or items.
Practicality: The scientific requirements of a project call for the
measurement process to be reliable and valid, while the
operational requirements call for it to be practical. Practicality
has been defined as economy, convenience, and interpretability.

Demonstrating Convergent & Discriminant Validity
Here, we set up a 2x2 table. The columns of the table
indicate whether you are trying to measure the same or
different concepts. The rows show whether you are using
the same or different methods of measurement. Imagine that
we have two concepts we would like to measure, student
verbal and math ability. Furthermore, imagine that we can
measure each of these in two ways. First, we can use a
written, paper-and-pencil exam (very much like the SAT or
GRE exams). Second, we can ask the student's classroom
teacher to give us a rating of the student's ability based on
their own classroom observation.

The first cell on the upper left shows the comparison of the verbal written test score with the
verbal written test score. But how can we compare the same measure with itself? We could do
this by estimating the reliability of the written test through a test-retest correlation, parallel
forms, or an internal consistency measure (See Types of Reliability). What we are estimating
in this cell is the reliability of the measure.
The cell on the lower left shows a comparison of the verbal written measure with the verbal
teacher observation rating. Because we are trying to measure the same concept, we are
looking at convergent validity (See Measurement Validity Types).
The cell on the upper right shows the comparison of the verbal written exam with the math
written exam. Here, we are comparing two different concepts (verbal versus math) and so we
would expect the relationship to be lower than a comparison of the same concept with itself
(e.g., verbal versus verbal or math versus math). Thus, we are trying to discriminate between
two concepts and we would consider this discriminant validity.
Finally, we have the cell on the lower right. Here, we are comparing the verbal written exam
with the math teacher observation rating. Like the cell on the upper right, we are also trying to
compare two different concepts (verbal versus math) and so this is a discriminant validity
estimate. But here, we are also trying to compare two different methods of measurement
(written exam versus teacher observation rating). So, we'll call this very discriminant to
indicate that we would expect the relationship in this cell to be even lower than in the one
above it.
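The logic of this multitrait-multimethod comparison can be sketched numerically: simulate two traits (verbal and math) each measured by two methods (written test and teacher rating) and inspect the correlation matrix. Everything below, including the data-generating assumptions, is illustrative only.

```python
# Sketch: multitrait-multimethod logic with simulated scores.
import numpy as np

rng = np.random.default_rng(2)
n = 200
verbal = rng.normal(size=n)
math = 0.3 * verbal + rng.normal(size=n)       # the two traits are mildly related

verbal_test   = verbal + rng.normal(scale=0.5, size=n)  # written exam (less noisy)
verbal_rating = verbal + rng.normal(scale=0.8, size=n)  # teacher rating (noisier)
math_test     = math + rng.normal(scale=0.5, size=n)
math_rating   = math + rng.normal(scale=0.8, size=n)

corr = np.corrcoef([verbal_test, verbal_rating, math_test, math_rating])
print(np.round(corr, 2))
# Expect: same-trait/different-method entries (convergent) to be the largest
# off-diagonal values, and different-trait entries (discriminant) to be smaller.
```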

Ethical Issues in Research

Ethics are norms or standards of behavior that
guide moral choices about our behavior and
our relationships with others. The goal of
ethics in research is to ensure that no one is
harmed or suffers adverse consequences from
research activities.

Major sources of ethical dilemmas in research arise
from interactions among:

The researcher and the research organization
The client/sponsor: decision makers, sponsoring clients, management teams
The respondents/subjects: the objects of investigation
Unethical activities
Violating nondisclosure agreements.
Breaking respondent confidentiality.
Misrepresenting results.
Deceiving people.
Invoicing irregularities.
Avoiding legal liability.
Espionage or spying.
Deception: Deception occurs when the
respondents are told only part of the truth or when
the truth is fully compromised.
Voluntary participation
Informed consent: prospective research participants
must be fully informed about the procedures and risks
involved in the research and must give their consent
to participate.
Confidentiality: participants are assured that
identifying information will not be made available to
anyone who is not directly involved in the study.
Anonymity: the participant remains anonymous
throughout the study.
Safety: It is the researcher's responsibility to
design a project so that the safety of all
interviewers, surveyors, experimenters, or
observers is protected. Several factors may be
important to consider in ensuring a
researcher's right to safety.

Should students change their choices while attempting
objective-type questions?
The theory that a student should trust their first instinct and stay
with their initial answer on a multiple-choice test is a myth.
Researchers have found that although people often believe that
changing answers is bad, it generally results in a higher test
score. The data across twenty separate studies indicate that
the percentage of "right to wrong" changes is 20.2%, whereas
the percentage of "wrong to right" changes is 57.8%, nearly
triple. Changing from "right to wrong" may be more painful and
memorable (the Von Restorff effect), but it is probably a good idea
to change an answer after additional reflection indicates that a
better choice could be made.
Results obtained from computerized vocabulary tests were
compared under conditions in which item review was permitted
and not permitted. Comparisons between the review and no-
review conditions yielded no statistically significant differences in
number correct scores, ability estimates, or measurement error,
but examinees in both conditions strongly desired review
opportunities. Comparisons of answers before and after review
within the review condition showed that only a small percentage
of answers was changed (3.63%), that more answers were
changed from wrong to right than from right to wrong (by a ratio
of 2.25 to 1), that a large proportion of examinees (45%) changed
answers to at least some questions, and that most examinees
who changed answers improved their performance by doing so
(by a ratio of 2.44 to 1). The results revealed that performance
gains after review were greater for examinees of higher ability
and that review was desired more by examinees with higher test
anxiety. The major drawback to allowing review was a 35%
increase in testing time.
In a 1977 review of the literature on test answer changing,
Mueller and Wasser (EJ 163 236) cited 17 studies and concluded that
students changing answers on objective tests gain more points than
they lose by so doing. Higher-scoring students tend to gain more than
do the lower scoring students. Six additional studies not reported in
the Mueller and Wasser review have supported this conclusion; this
paper reviews the findings from these additional studies. Research
synthesized in both reviews surveyed these topics: (1) student opinion
about the value of changing answers; (2) methods of measuring
answer changes; (3) percent of students changing answers and percent
of answers changed; (4) ratio of gains to losses; (5) relation of
changes to achievement and aptitude; (6) number of changed answers;
(7) sex differences; and (8) item characteristics. Although cognitive
style was investigated and non-college subjects were used, studies in
the present update continue to replicate previous research. Further
research is recommended.
