
This article was downloaded by: [185.5.152.206]
On: 11 November 2014, At: 14:36
Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered
office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

The Journal of Economic Education


Publication details, including instructions for authors and
subscription information:
http://www.tandfonline.com/loi/vece20

The Effects of a Translation Bias on the Scores for the Basic Economics Test

Jinsoo Hahn (a) & Kyungho Jang (b)

(a) Gyeongin National University of Education, Korea
(b) Inha University, Korea


Published online: 11 Apr 2012.

To cite this article: Jinsoo Hahn & Kyungho Jang (2012) The Effects of a Translation Bias on the
Scores for the Basic Economics Test, The Journal of Economic Education, 43:2, 133-148, DOI:
10.1080/00220485.2012.659641
To link to this article: http://dx.doi.org/10.1080/00220485.2012.659641




THE JOURNAL OF ECONOMIC EDUCATION, 43(2), 133-148, 2012
Copyright © Taylor & Francis Group, LLC
ISSN: 0022-0485 print/2152-4068 online
DOI: 10.1080/00220485.2012.659641

RESEARCH IN ECONOMIC EDUCATION


The Effects of a Translation Bias on the Scores for the Basic Economics Test
Jinsoo Hahn and Kyungho Jang

International comparisons of economic understanding generally require a translation of a standardized
test written in English into another language. Test results can differ based on how researchers
translate the English written exam into one in their own language. To confirm this hypothesis,
two differently translated versions of the Basic Economics Test (BET) (Walstad, Rebeck, and Butters
2010a) were given to elementary school students in Korea. We found the possibility of overestimating
or underestimating the levels of economic understanding by various sources of translation bias.
Therefore, it is important to carefully interpret the assessment results of international comparisons.
Our study is applicable not only to international comparisons of economic literacy but also to any
comparison of knowledge across cultures or languages within a country.
Keywords BET, economic understanding, translation bias
JEL codes A21, A22, I21

Jinsoo Hahn is a professor of social studies education at Gyeongin National University of Education, Korea (e-mail:
jshahn@gin.ac.kr). Kyungho Jang is an associate professor of social studies education at Inha University, Korea, and the
corresponding author (e-mail: kjang@inha.ac.kr).
The authors thank Kristin Chase and seminar participants at the 2011 National Conference on Teaching Economics at
Palo Alto, California, for their helpful comments. This article has benefitted from valuable comments by two anonymous
referees. We are responsible for any remaining errors and deficiencies in this article.
This article is based on a paper that was presented at the National Conference on Teaching Economics held at
Stanford University on June 13, 2011.

Studies comparing the levels of economic understanding of students in different countries have
grown over the last several decades (see Asano 2002; Whitehead and Halil 1991; Yamaoka et al.
2010, among others). Hahn (2002, 2006), K. Kim (1993, 1994), and S. Kim (2008), for example,
compared the levels of economic understanding between Korean students and the students of a
different country. These studies tend to utilize a standardized test developed in the United States.
For instance, the Test of Understanding in College Economics (TUCE) (Walstad, Watts, and
Rebeck 2007) is used to measure the level of economic understanding for college students. The
Test of Economic Literacy (TEL) (Walstad and Rebeck 2001) is used for high school students.
The Test of Economic Knowledge (TEK) (Walstad, Rebeck, and Butters 2010b) is used for middle
school students. The Basic Economics Test (BET) (Walstad, Rebeck, and Butters 2010a) is used
for elementary school students.
However, a problem may arise in translating an assessment tool into another language. Would
the translated version of a test be equivalent to the original version in terms of measuring the
level of economic understanding? Despite the effort to preserve the original purpose and form of
measurement as much as possible, there is always a possibility of unintentional differences due
to the translations.
The purpose of this study was to identify whether assessment results are actually altered based
on two different translations of the original test, to confirm how much the results are altered, and to
verify what sources provoke such problems. For this purpose, we used an American standardized
test, the BET (Walstad, Rebeck, and Butters 2010a), which measures the economic understanding
of elementary school students.
The equivalence of translations can be threatened by many sources. Each language has its own
linguistic structure, and cultural environments greatly differ from country to country and even
within countries. Since the ways in which researchers attempt to translate standardized tests may
differ, there is always a possibility of the translated versions becoming somewhat different from
the original test. Translators agonize over the choice of an equivalent term for a specific word
such as "trade," which has many meanings including transaction, commerce, exchange, bartering,
international transaction, and business.
The issue regarding translation is not limited to the interpretation of words. For example, a
researcher majoring in economics could translate a phrase very awkwardly. Let us take English
and Korean as an example. The syntax or structure of the two languages differs greatly. Korean
tends to use sentences in active voice compared to English, in which passive voice is common.
Korean does not usually select objects as the subject of a sentence while English generally does.
However, if a researcher literally translates the passively written English tests word-for-word or
uses objects as subjects in an effort to preserve the originality of the test, students who take the
test would be faced with the difficulty of comprehending the test items. As a result, there is always
a possibility that the end results of the tests are biased.
Problems occur due to cultural differences as well. In the second edition of the BET (Walstad
and Robson 1990), the word "hotdog" appears in a question. This question measures students'
understanding of hotdogs and hamburgers being substitutes, hotdogs and hotdog buns being
complements, and the students' analysis of supply and demand. However, the image of a hotdog
to Korean students is somewhat different from that of American students. The "hotdog" that
Korean students think of is known as a "corn dog" in the United States. As a result, the meaning
of "hotdog" in the question is unfamiliar to most Korean elementary school students, and a hotdog
bun is obviously even more unfamiliar to them. If researchers disregard the cultural differences
and decide to merely translate "hotdog" as "hotdog" in order to preserve the originality of
the assessment exams, the percentage of correct answers would likely be lower among Korean
students than among American students. It is possible to replace the words "hotdog" and
"hamburger" with foods popular among Koreans, such as "dried seaweed rolls" and "dumplings,"
in order to alleviate the distortion of scores due to an unfamiliar culture, as in the study by Hahn (2002).
However, it is hard to show that "dried seaweed rolls" and "dumplings" are adequate translations
for "hotdog" and "hamburger."


As seen in the examples above, a possibility always exists for the homogeneity of the original
standardized test to be distorted due to various factors. In this study, we call this a translation bias.
This study shows that there are many types of bias in test translation and that the differing levels
of economic understanding between countries can be caused by a translation bias, resulting in
overestimation or underestimation of the difference. Translation bias is expected to occur more
commonly among younger students who are in a low developmental stage or who have incomplete
linguistic acquisition.
There is literature on the issue of translation bias regarding international assessments such as the
Program for International Student Assessment (PISA)1 and Trends in International Mathematics
and Science Study (TIMSS).2 Olsen, Turmo, and Lie (2001) showed that even small changes
in the item wording might have a substantial influence on the response pattern. Wuttke (2008)
argued that a cultural bias clearly favored English-speaking students. Our study contributes to the
existing literature by providing a useful taxonomy of possible sources of translation bias on the
response pattern as well as exam scores.
It would be desirable to perform pilot tests with different translations to see if there is a
translation bias and rewrite questions as necessary. The process is, however, complicated in that
a given translation yielding higher scores does not necessarily mean a better translation because
it might reveal too much of the answer in the question, relative to the English version.
Even though we focus on the translation bias for international comparisons of economic
understanding, the results of this study are applicable not only to international comparisons of
economic literacy, but also to any comparison of knowledge across cultures or languages.3 For
instance, we need to pay attention when comparing economic knowledge among students within
a single country if the students consist of native and nonnative speakers who have different levels
of vocabulary and different cultural backgrounds.

METHOD
Translation of the BET
The first edition of the BET (Chizmar, Halinski, and McCarney 1980, 1981) was developed by
the Council for Economic Education (CEE) (then the Joint Council on Economic Education) in
1980 as an assessment that measures the economic understanding of elementary school students.
Afterwards, it was modified into the second edition by Walstad and Robson (1990). Two decades
later, Walstad, Rebeck, and Butters (2010a) published the third edition of the BET. The third
edition of the BET is composed of form A and form B, each having 30 questions.
In this study, we selected 37 questions that researchers might translate into two different interpretations or with which the researchers might experience general difficulty in selecting the
appropriate translation.4 Then, we translated the questions into two different versions in Korean.5
The first version follows the actual English version of the BET and preserves a literal translation
of the original text.6 Meanwhile, the second version is translated somewhat differently from version 1. However, we tried not to translate the original text arbitrarily, because an
excessively free translation might differ too much from version 1 and put the validity of
the study's results into question. Instead, we tried to translate the questions
into the forms that we expect researchers might generally prefer through referring to the existing
translated standardized tests in Korean. For instance, if there was an awkward expression in version 1, we paraphrased and expressed it in a more Korean way. Furthermore, for those questions
that held different cultural values, we replaced the terms with examples that have more familiarity
for Koreans.7

TABLE 1
Categories of Questions

Categories      Questions                                     No. of questions
Common          8, 15, 16                                     3
Abbreviation    3, 7, 14, 21, 27, 28, 29, 31, 32, 34, 35      11
Syntax          2, 6, 9, 10, 12, 20, 23, 30, 33, 36           10
Culture         13, 26                                        2
Vocabulary      1, 5, 18, 22, 24, 25                          6
Examples        4, 11, 17                                     3
Numbers         19, 37                                        2

Table 1 presents the categories of the 37 questions for the test in this study. First, we included
3 questions in the Common category. These questions are the same for versions 1 and 2.
We included these questions to test the homogeneity of two test groups of Korean students.
Second, we chose 11 questions in the Abbreviation category. In most cases, version 2 is more
abbreviated than version 1. Third, we considered 10 questions in the Syntax category. These
questions represent grammatical differences between English and Korean. Fourth, we included 2
questions in the Culture category. These questions reflect the cultural differences between the
United States and Korea. Fifth, we selected 6 questions in the Vocabulary category. Versions
1 and 2 are slightly different regarding the vocabulary used for the test. Sixth, we included 3
questions in the Examples category. We tried to use examples that would be more familiar to
Korean elementary school students in version 2. Finally, we chose 2 questions in the Numbers
category. Versions 1 and 2 are slightly different regarding the scale of the numbers used for the
test.
It was expected that the scores on version 2 would be generally higher than those on version
1, but it may not always be the case. For instance, abbreviation might work in both directions:
It helps students find key concepts from a concise word or phrase, while it hinders students
from understanding the economic meaning of a question given the low level of recognition of
elementary school students. Therefore, we investigated the effects of translation bias on the score
of each question as well as the overall scores.
The Sample
Korean versions of the BET were given to elementary school students in grades 4 through 6, even
though the BET was developed and aimed at measuring the economic understanding of students
in grades 5 and 6. This is because economics content is included in the curriculum for students
in grade 4 in Korea and because students are expected to have learned the economic concepts
by the time they take the translated BET in December. Including 4th-graders does not affect our
main results because we focus on the translation bias comparing the test results within a country.
We composed the sample so as to have 5th-graders as the majority taking the test. Two classes
from each of the nine elementary schools, which accounted for 18 classes in all, participated in
this study.8 Version 1 was randomly distributed to one of the two classes of a school, as version
2 was distributed to the other of the two. Our sample consists of 235 students who took version
1 and 251 students who took version 2.9 Table 2 shows the composition of the students by grade
and gender.

TABLE 2
Distributions of the Sample by Grade and Gender

                        4th grade      5th grade      6th grade      Total
Version     Gender      No.    %       No.    %       No.    %       No.    %
Version 1   Male        27     22.9    54     45.8    37     31.4    118    100.0
            Female      30     25.6    46     39.3    41     35.0    117    100.0
            Subtotal    57     24.3    100    42.6    78     33.2    235    100.0
Version 2   Male        32     24.2    61     46.2    39     29.5    132    100.0
            Female      30     25.2    48     40.3    41     34.5    119    100.0
            Subtotal    62     24.7    109    43.4    80     31.9    251    100.0

Because students are randomly assigned to their classes in Korean elementary schools, students
in the two groups are likely to be homogeneous. In order to statistically confirm the homogeneity
between the two groups of students that took version 1 or 2, we included three questions of
the same translation in both versions 1 and 2. If there is no significant difference in terms of the
distribution of the selected answers as well as the percentage of the correct answers for these three
questions, the two groups of students are considered to be statistically homogeneous. Table 3
shows that the mean scores in the Common category are not significantly different.
The categories of Abbreviation, Syntax, Examples, and Numbers did not yield significantly different mean scores, but the categories of Culture and Vocabulary did with a
5-percent significance level. Taken as a whole, the two versions of the test did not yield significantly
different scores. These results might imply that overestimates and underestimates of
students' economic understanding are mixed within each category.
TABLE 3
Overall Scores of Each Category

                Version 1         Version 2
Categories      Mean    SD        Mean    SD        t-test (p-value)
Common          59.4    33.0      61.2    30.4      0.621 (0.535)
Abbreviation    38.1    16.2      35.8    15.1      1.607 (0.109)
Syntax          60.4    17.5      60.2    20.8      0.128 (0.898)
Culture         40.4    34.3      47.2    35.4      2.143 (0.033)**
Vocabulary      54.5    21.7      60.6    22.3      3.081 (0.002)**
Examples        64.0    23.2      66.7    28.6      1.136 (0.256)
Numbers         84.9    27.6      82.9    28.4      0.796 (0.426)
Overall         53.2    13.1      54.1    14.2      0.708 (0.479)

Note: *p < .1; **p < .05.
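
The comparison of category means reported in Table 3 can be roughly re-derived from the summary statistics alone. The following is a minimal sketch, not the authors' procedure: it assumes an unequal-variance (Welch) t-statistic, takes the group sizes of 235 and 251 from Table 2, and approximates the p-value with a standard normal tail. The function name `welch_t_from_summary` is hypothetical. Because the inputs are rounded table values, the result only approximates the reported statistic.

```python
from math import sqrt
from statistics import NormalDist


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, approximate two-sided p) from group means, SDs, and sizes."""
    # Standard error of the difference between the two group means.
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    t = (mean2 - mean1) / se
    # Assumption: with several hundred students per group, the t distribution
    # is close to the standard normal, so a normal tail approximates the p-value.
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p


# Vocabulary row of Table 3 (version 1 vs. version 2); group sizes from Table 2.
t_vocab, p_vocab = welch_t_from_summary(54.5, 21.7, 235, 60.6, 22.3, 251)
print(f"Vocabulary: t = {t_vocab:.3f}, approximate p = {p_vocab:.4f}")
```

Working from summary statistics keeps the check independent of the student-level data, which are not available here.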


TABLE 4
Percentage Responses to Alternatives of the Questions in the Common Category

        Version 1                            Version 2                            WMW Test (p-value)
Item    a      b      c      d      Blank    a      b      c      d      Blank    Each      Joint
8       15.3   7.2    50.2   26.4   0.9      15.5   7.6    48.6   25.9   2.4      0.648     0.185
15      3.0    64.7   9.4    19.1   3.8      4.0    66.5   14.7   9.6    5.2      0.089*
16      10.6   11.9   63.4   12.3   1.7      11.2   10.0   68.5   10.0   0.4      0.892

Notes: The shaded cells denote the correct responses. Boldface denotes the highest proportion of response.
*p < .1; **p < .05.

We expect that the translation bias will affect the distribution of students' responses as well
as the scores on the examination because the existence of an attractive wrong answer plays an
important role in students' responses. In this regard, we carried out the Wilcoxon-Mann-Whitney
test (WMW), which is a nonparametric statistical hypothesis test for assessing whether two
independent samples have equally large values.
Table 4 shows the percentage responses of alternatives of three questions in the Common
category. There are three questions in the category; question 8 is relevant to human capital,
question 15 measures whether students understand the concept of the law of supply, and question
16 measures whether they know the meaning of a monopoly. The joint WMW test, which consists
of three questions, cannot reject the null hypothesis of homogeneity with a 5-percent significance
level. This result is consistent with a t-test in table 3. The WMW test for each question cannot
reject the null hypothesis of homogeneity with a 5-percent significance level. These results
indicate that two independent groups are statistically homogeneous.
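
The mechanics of the WMW test on a single question can be sketched as follows. This is an illustration under stated assumptions rather than a reproduction of the reported analysis: the response counts for question 8 are reconstructed by rounding the Table 4 percentages against group sizes of 235 and 251, the numeric coding of the alternatives (a=1, b=2, c=3, d=4, blank=5) is hypothetical because the source does not state how responses were coded, and the p-value uses a tie-corrected normal approximation without continuity correction. The resulting p-value therefore need not match the value in Table 4.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_u(x, y):
    """Two-sided WMW test: returns (U for x, z, p) via a tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    pooled = sorted(x + y)
    rank_of = {}      # value -> mid-rank shared by all observations with that value
    tie_term = 0.0    # sum of (t^3 - t) over groups of tied observations
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        t = j - i
        rank_of[pooled[i]] = (i + 1 + j) / 2  # average of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(rank_of[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_x - mean_u) / sqrt(var_u)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_x, z, p


def expand(counts):
    """Turn {code: number of students} into a flat list of coded responses."""
    return [code for code, k in counts.items() for _ in range(k)]


# Question 8, counts reconstructed from Table 4 (assumed coding: a=1 ... d=4, blank=5).
v1_q8 = expand({1: 36, 2: 17, 3: 118, 4: 62, 5: 2})
v2_q8 = expand({1: 39, 2: 19, 3: 122, 4: 65, 5: 6})
u_q8, z_q8, p_q8 = mann_whitney_u(v1_q8, v2_q8)
print(f"U = {u_q8:.1f}, z = {z_q8:.3f}, p = {p_q8:.3f}")
```

Because multiple-choice responses produce many ties, the tie correction in the variance matters more here than it would for continuous scores.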
FACTORS OF TRANSLATION BIAS
In this section, we analyze the questions that showed a significant difference between the two
groups as a result of applying the two differently translated versions. We discuss potential factors
that might have led to the translation bias, using six categories: abbreviation, syntax, culture,
vocabulary, examples, and numbers.
Abbreviation
The percentage responses of alternatives for the questions in the Abbreviation category are
presented in table 5. First, the joint WMW test for 11 questions rejects the null hypothesis of
homogeneity with a 5-percent significance level. Second, the WMW test for each question rejects
the null hypothesis of homogeneity with a 5-percent significance level for questions 14 and
34. These results indicate that a translation bias created by the factor of abbreviation exists,
especially for questions 14 and 34.
TABLE 5
Percentage Responses to Alternatives of the Questions in the Abbreviation Category

        Version 1                            Version 2                            WMW Test (p-value)
Item    a      b      c      d      Blank    a      b      c      d      Blank    Each      Joint
3       14.5   32.8   14.9   36.2   1.7      19.1   31.9   17.9   30.3   0.8      0.234     0.004**
7       19.6   35.7   25.1   19.1   0.4      14.7   47.0   15.9   21.9   0.4      0.997
14      49.4   25.5   11.5   13.6   0.0      27.5   22.3   21.9   27.5   0.8      0.000**
21      27.7   36.6   3.0    31.9   0.9      30.3   28.7   7.6    33.5   0.0      0.677
27      15.3   18.7   21.3   43.4   1.3      8.4    29.5   14.3   47.8   0.0      0.325
28      14.9   66.6   10.6   8.5    0.0      18.3   65.7   6.4    8.8    0.8      0.137
29      20.9   34.9   17.4   24.7   2.1      25.5   29.5   15.9   27.1   2.0      0.844
31      19.1   23.4   28.9   27.7   0.9      11.6   24.7   40.2   23.5   0.0      0.307
32      28.5   30.2   22.1   14.9   4.3      34.3   28.7   21.5   15.1   0.4      0.992
34      25.5   19.6   48.9   5.1    0.9      20.7   18.7   48.6   12.0   0.0      0.025**
35      64.3   23.4   6.4    4.7    1.3      63.3   19.9   5.6    9.6    1.6      0.645

Notes: The shaded cells denote the correct responses. Boldface denotes the highest proportion of response.
*p < .1; **p < .05.

Question 14 asks about the function of price in the market. In version 1, the question is literally
translated, while we made some abbreviations in version 2. First, we thought that some words
such as "competitive" in the question do not contribute to solving the questions. As a result,
the text was abbreviated as a whole in the process of translating. To be more specific, version 2
became more concise than version 1 because the expressions in version 2 were written in a form
more familiar to Koreans, while version 1 had many awkward phrases due to the word-for-word
translation. In addition, "the amount sellers produce" and "the amount buyers want to buy" in
version 1 were expressed concisely as "the amount to sell" and "the amount to purchase" in version
2. The multiple choices were the same for versions 1 and 2.10
The students showed a significant difference in the distributions of their answers, as shown in
table 5. The WMW test shows a significant difference between the two groups with a 5-percent
significance level. It is noteworthy that the percentage of correct answers for version 1 (49.4
percent) was close to twice the percentage of version 2 (27.5 percent). In addition, the given
answers for version 2 were more evenly distributed among the four multiple choices compared
to version 1. Although it is difficult to clearly identify the exact cause for such a difference, our
guess is that the percentage of correct answers was lower in version 2 because the elementary
school students had difficulty understanding the word "purchase," which originally comes from
Chinese characters.
Question 34 asks about the relationships among income, expenditure, and savings. In version
1, the question is literally translated word-for-word. In version 2, we replaced "not spent on goods
or services or paid to the government" with "left after expenditure." The multiple choices were
the same for versions 1 and 2.
The students showed a significant difference in the distributions of their answers with a 5-percent
significance level. The percentage of correct answers for version 2 (20.7 percent) was
slightly lower than for version 1 (25.5 percent) because the term "expenditure" in version 2
was unfamiliar to Korean elementary school students. We believe that the percentage of correct
answers would be higher if "consumption" was used instead of "expenditure." For this particular
question, the percentage of students who selected "c" (profit) as their answer was over twice
the percentage of the students who selected the correct answer "a" (saving), implying that many
students confuse household income with a firm's revenue.
Meanwhile, some questions did not show a significant difference in the distribution of the
selected answers, but did show a significant difference in the percentage of correct answers.
Among others, question 7 asks about the definition of economic incentives. Since the literal
translation led to awkward sentences, we phrased the sentence into a more familiar form in
version 2 while translating the multiple choices identically. As a result, the percentage of correct
answers for version 2 (47.0 percent) was much higher than that for version 1 (35.7 percent).11
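
A difference in the percentage of correct answers, such as the one reported for question 7, can be checked with a pooled two-proportion z-test. This is only a sketch under assumptions: the text here does not say which test was used for this comparison, the counts of correct answers are reconstructed by rounding the reported percentages against group sizes of 235 and 251, and the function name `two_proportion_z` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(correct1, n1, correct2, n2):
    """Pooled two-proportion z-test: returns (z, two-sided p)."""
    p1, p2 = correct1 / n1, correct2 / n2
    pooled = (correct1 + correct2) / (n1 + n2)
    # Standard error under the null hypothesis of equal proportions.
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Question 7: 35.7 percent correct in version 1, 47.0 percent in version 2.
correct_v1 = round(0.357 * 235)  # reconstructed count, an assumption
correct_v2 = round(0.470 * 251)  # reconstructed count, an assumption
z_q7, p_q7 = two_proportion_z(correct_v1, 235, correct_v2, 251)
print(f"z = {z_q7:.3f}, p = {p_q7:.4f}")
```

This test looks only at correct versus incorrect, which is why it can detect a difference even when the WMW test on the full response distribution does not.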
Differences in Syntax
The grammatical structure of sentences and phrases is different between English and Korean. For
example, the order of the words is "subject, verb, object" (SVO) in English while it is "subject,
object, verb" (SOV) in Korean. For instance, "I love you" in English is expressed as "I you love"
in Korean. In addition, in English, adverbial phrases and clauses are often used in the front of the
sentence for the purpose of emphasis, while, in Korean, an adverbial phrase is placed where it
qualifies the verb. Furthermore, Korean prefers active sentences to passive sentences compared
to English. With such an issue in mind, we decided to find out whether the grammatical structure
of a sentence affects the outcome of the students' selection of answers.
The percentage responses of alternatives for the questions in the Syntax category are presented in table 6. The joint WMW test for 10 questions rejects the null hypothesis of homogeneity
with a 5-percent significance level. The WMW test for each question rejects the null hypothesis
of homogeneity with a 5-percent significance level for questions 23, 30, and 36, while it rejects
the null hypothesis with a 10-percent significance level for questions 9 and 33. These results
indicate that a translation bias created by the factor of differences in syntax exists.

TABLE 6
Percentage Responses to Alternatives of the Questions in the Syntax Category

        Version 1                            Version 2                            WMW Test (p-value)
Item    a      b      c      d      Blank    a      b      c      d      Blank    Each      Joint
2       83.0   12.3   3.4    0.9    0.4      82.1   12.4   4.8    0.4    0.4      0.764     0.003**
6       18.7   8.5    56.6   15.3   0.9      15.1   10.4   59.0   14.7   0.8      0.709
9       11.1   63.4   7.2    17.9   0.4      6.8    62.5   9.2    21.5   0.0      0.063*
10      7.2    21.3   68.5   3.0    0.0      17.5   14.3   64.9   3.2    0.0      0.185
12      7.7    18.7   13.6   58.3   1.7      10.0   20.7   8.4    58.2   2.8      0.545
20      3.0    22.1   5.1    69.8   0.0      4.4    21.9   4.0    69.7   0.0      0.864
23      34.0   38.3   21.7   5.5    0.4      2.8    47.8   26.3   22.3   0.8      0.000**
30      12.3   61.7   17.9   6.0    2.1      20.3   35.5   31.1   13.1   0.0      0.004**
33      40.0   22.6   15.7   19.1   2.6      50.6   20.3   13.5   15.1   0.4      0.100*
36      16.2   10.2   6.8    64.3   2.6      6.4    13.1   7.6    71.3   1.6      0.031**

Notes: The shaded cells denote the correct responses. Boldface denotes the highest proportion of response.
*p < .1; **p < .05.

Question 23 asks about the relationship between jobs and income. In version 1, the multiple
choices were literally translated. In version 2, they were translated into a version similar to the
Korean type. First, we thought that the sentence "They have a job title" would be misleading in
Korean if it was literally translated word-for-word and decided to translate it into "Their job title
is impressive" in version 2. Second, passive sentences were translated into active sentences for
other choices in version 2.
The students showed a significant difference in the distributions of their answers. The WMW
test shows a significant difference between the two groups with a 5-percent significance level. In
version 2, we used many idiomatic expressions familiar to Korean students so that the percentage
of correct answers for version 2 (47.8 percent) was significantly higher than the percentage for
version 1 (38.3 percent). It would have been easier for students to find the correct answer in
version 2 since other choices are expressed in more familiar active sentences. It is noteworthy that
the percentage of the incorrect answer "a" for version 1 (34.0 percent) was much higher than the
percentage for version 2 (2.8 percent). It seems to us that the word-for-word translation of "They
have a job title" misled students to consider those jobs as the so-called famous jobs in Korea,
such as doctors, lawyers, and government officers. On the other hand, in version 2, students were
easily able to judge that choice "a" was incorrect because of clarified meanings provided.
Question 30 asks about the definition of an unemployed person. The expression "To be counted
as unemployed" is placed in the front of the sentence in the original English text of the assessment
exam. In version 1, it was also put in the front along with the other phrases translated word-for-word.
In version 2, the subject was placed in the front while the particular expression was placed
in the middle. In addition, the sentence was translated into a form for multiple-choice questions
widely used in Korea.
The students showed a significant difference in the distributions of their answers. The WMW
test shows a significant difference between the two groups with a 5-percent significance level.
It is noteworthy that the percentage of correct answers for version 2 (35.5 percent) was much
lower than the percentage for version 1 (61.7 percent), which contrasts with our expectation. We
found after consulting several elementary school teachers that the translation in version 2 misled
students to consider question 30 as a question for finding the cause of unemployment rather than
the condition for being classified as unemployed, which caused many students to choose the
wrong answer "c" (has few job skills) in version 2.
Question 33 is about the meaning and the effects of competition. This time, we translated the
question and the multiple choices differently in versions 1 and 2. Since there is only a subject in
version 1, the student has to check all of the given choices in order to specifically understand what
the problem is demanding. Although such a form can be commonly found in many American
multiple-choice questions, Korean students are not familiar with the form. Thus, we translated
the question into a more Korean-friendly version by adding some complementary parts at the
end of the question in version 2. In addition, we dropped the term "help" in version 2 because
causative verbs do not exist in Korean. As a result, the multiple choices for version 2 became
much simpler and clearer than those for version 1.
The students showed a significant difference in the distributions of their answers, as shown in
table 6. The WMW test shows a significant difference between the two groups with a 10-percent
significance level. The percentage of correct answers for version 2 (50.6 percent) was higher than
the percentage for version 1 (40.0 percent). We see this as a consequence of the students being
able to easily solve the question in version 2, as the multiple choices were translated into a form
that was more appropriate and familiar to Korean students.
Question 36 inquires about the relationship between jobs and income. As in the case of question
23, the questions were translated identically, while the given multiple answers were translated
differently. The multiple choices in version 1 were translated literally, while the choices in version
2 were translated into a more familiar version for Korean test takers. For instance, we dropped
the term "supply" for choice "a," replaced "have spent fewer years in school" with "have limited
education" for choice "c," and replaced "produce more than unskilled workers" with "have higher
productivity" for choice "d."
Table 6 shows a significant difference in the distributions of their answers with a 5-percent
significance level. The percentage of correct answers for version 2 (71.3 percent) was higher than
the percentage for version 1 (64.3 percent). It seems that dropping "supply" helped students who
are unfamiliar with the term "supply" not to choose "a." In fact, the percentage of the wrong answer
"a" in version 2 was much lower than that in version 1. We also believe that Korean idiomatic
expressions in version 2 helped students find the correct answer.
Cultural Differences
When researchers translate a standardized test, they must decide whether to use the examples or
cases as given in the test. Some examples or cases in the questions may be familiar to American
students, but unfamiliar to Korean students due to cultural differences. If these examples or cases
are used as they are, some Korean students may have difficulty in finding correct answers due to
the unfamiliar cultural environment rather than the lack of economic understanding.
Table 7 presents the percentage responses to the alternatives for the questions in the Culture
category. The joint WMW test for the two questions cannot reject the null hypothesis of homogeneity
at the 5-percent significance level, but the test for each question rejects the null hypothesis of
homogeneity at the 5-percent significance level for question 26. These results suggest that
cultural differences can create a translation bias.
Question 26 asks about the investment in human capital using the example of babysitting.
Babysitters are common in the United States but not in Korea. Thus, in contrast to American
students, Korean students are likely to have difficulty solving this question since
they are not familiar with the term. In version 2, we replaced "babysitter" with "baker," a job
familiar to Korean students, while preserving the economic concepts in the multiple choices.

TABLE 7
Percentage Responses to Alternatives of the Questions in the Culture Category

        Version 1                            Version 2                            WMW Test (p-value)
Item    a      b      c      d      Blank    a      b      c      d      Blank    Each       Joint
13      37.0   16.6   25.5   18.7   2.1      35.9   16.3   26.7   19.5   1.6      0.648      0.132
26      9.4    37.0   43.8   8.5    1.3      6.8    29.1   58.6   5.6    0.0      0.037**

Notes: The shaded cells denote the correct responses. Boldface denotes the highest proportion of response.
*p < .1; **p < .05.

The WMW test shows a significant difference between the two groups at the 5-percent
significance level. The percentage of correct answers for version 2 (58.6 percent) was much higher
than the percentage for version 1 (43.8 percent). This result implies that cultural differences can
affect the score of an examination and yield biased estimates of students' economic understanding
if researchers adhere too closely to the original text and ignore cultural differences.
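The comparison for question 26 can be sketched in code. This is a minimal illustration, not the authors' procedure. It assumes that choices a to d are coded 1 to 4 and blanks 0, that the rounded percentages in table 7 can be turned back into counts using the sample sizes implied by note 9 (235 and 251 students after exclusions), and that a normal approximation with a tie correction and no continuity correction is adequate. The names `wmw_test` and `expand` are hypothetical, and under these assumptions the p-value need not match the published one.

```python
import math


def wmw_test(x, y):
    """Two-sided Wilcoxon-Mann-Whitney test (normal approximation, tie-corrected).

    Returns the U statistic of sample x and the two-sided p-value.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted(list(x) + list(y))

    # Give every distinct value the average rank of its block of ties.
    midrank = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                                # size of this tie block
        midrank[pooled[i]] = (i + 1 + j) / 2.0   # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(midrank[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:                               # all observations identical
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    return u_x, math.erfc(abs(z) / math.sqrt(2.0))


def expand(percents, n, codes):
    """Turn rounded percentages back into a list of coded responses."""
    counts = [round(p * n / 100.0) for p in percents]
    return [code for code, c in zip(codes, counts) for _ in range(c)]


CODES = [1, 2, 3, 4, 0]   # assumed coding: a, b, c, d, blank
N1, N2 = 235, 251         # assumed sample sizes after exclusions (note 9)

# Question 26 percentages from table 7, in the order a, b, c, d, blank.
version1 = expand([9.4, 37.0, 43.8, 8.5, 1.3], N1, CODES)
version2 = expand([6.8, 29.1, 58.6, 5.6, 0.0], N2, CODES)

u, p = wmw_test(version1, version2)
print(f"U = {u:.1f}, two-sided p = {p:.3f}")
```

A different coding of the answer choices or a different treatment of blanks would change the ranks, so the sketch is best read as a way to see how the rank-based comparison of two answer distributions works.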

Differences in Vocabulary
Some students might experience difficulty in solving a question when they are unable to grasp
the meaning of the vocabulary in the test even though they understand the economic concept that
the test intends to assess. One source of the differences in vocabulary is the difference in
educational curricula between the two countries. Some vocabulary is included in school textbooks in
the United States, but not in Korea. However, we should not underestimate the economic
understanding of students in one country just because they are less familiar with a term than
students in another country.
The percentage responses of alternatives for the questions in the Vocabulary category are
presented in table 8. First, the joint WMW test for six questions rejects the null hypothesis of
homogeneity with a 5-percent significance level. Second, the WMW test for each question rejects
the null hypothesis of homogeneity with a 5-percent significance level for question 5.
Before considering question 5, let us take a look at question 1, in which the percentage of
correct answers is surprisingly lower than in the United States. Question 1 is about goods and
services. The correct answer for this question is that the act of getting a haircut at a salon is a
service, while shampoo is a good. In order to check for potential translation bias from the differences
in vocabulary, we replaced "haircut" and "shampoo" with "medical treatment" and "medicine,"
respectively, in version 2.
As seen in table 8, the WMW test does not show a significant difference between the two
groups at the 5-percent significance level. The percentage of correct answers for version 2 (29.1
percent) was slightly higher than the percentage for version 1 (27.7 percent), but the percentage
of correct answers is still lower than that of U.S. students (69 percent). Korean students have
difficulty in distinguishing services from goods because the distinction is not included in the
curriculum for elementary school in Korea. Furthermore, the term "good" is unfamiliar to Korean
students because it is only used by economists. Even adults seldom use the term in their ordinary
daily lives. As a result, the largest share of students considered both haircut/treatment and
shampoo/medicine as goods (46.0 percent in version 1 and 53.8 percent in version 2).

TABLE 8
Percentage Responses to Alternatives of the Questions in the Vocabulary Category

        Version 1                            Version 2                            WMW Test (p-value)
Item    a      b      c      d      Blank    a      b      c      d      Blank    Each       Joint
1       46.0   2.6    23.0   27.7   0.9      53.8   6.4    8.8    29.1   2.0      0.124      0.048**
5       7.2    3.0    52.3   36.6   0.9      2.0    2.8    91.6   3.6    0.0      0.000**
18      9.4    13.2   8.5    67.7   1.3      9.6    13.9   6.4    69.3   0.8      0.767
22      2.6    7.2    12.8   76.2   1.3      8.0    9.2    9.6    73.3   0.0      0.303
24      17.0   49.4   20.0   13.6   0.0      15.9   39.8   30.3   13.1   0.8      0.175
25      3.0    17.4   53.6   25.5   0.4      3.2    11.6   60.6   24.7   0.0      0.423

Notes: The shaded cells denote the correct responses. Boldface denotes the highest proportion of response.
*p < .1; **p < .05.

The same issue applies to the term "tradeoff" in question 5. This item investigates rational
decision making in a situation where a person wishes to buy five pairs of jeans but has to give
up some of them because he/she wants to buy a computer too. The question and the other multiple
choices were translated identically, while only the correct answer was translated in two different
ways. The term "tradeoff" was translated literally in version 1, whereas the meaning of the term
was provided through description in version 2.
Our reasoning was as follows. U.S. students might have a comparative
advantage in solving the problem because they already know the dictionary definition
of the term "tradeoff" even if they do not understand its economic meaning. On the other hand,
the Korean term for "tradeoff" is not commonly used in daily life. It is especially difficult
for elementary school students to understand the meaning of the term because the academic
term is first introduced in high school textbooks in Korea. Thus, we described the meaning of
"tradeoff" in version 2 in order to help students understand its definition.
As a result, the students showed a significant difference in the distributions of their answers,
as shown in table 8. The WMW test shows a significant difference between the two groups at
the 5-percent significance level. The percentage of correct answers for version 2 (91.6 percent)
was close to twice the percentage for version 1 (52.3 percent). It is noteworthy that researchers
might fail to preserve the level of difficulty if, in translating the correct answer, they provide
too much helpful information, especially in a question whose other choices are not attractive as
incorrect answers. The result also implies that the students' selection of choices differs
significantly depending on how terms are described.
Meanwhile, questions 24 and 25 did not show a significant difference in the distribution of
the selected answers, but did show a significant difference in the percentage of correct answers
between versions 1 and 2. Question 24 is about the meaning of "entrepreneur." The English
term "entrepreneur" is often translated as "businessman" in Korean. If the term "entrepreneur" is
translated as "businessman" word-for-word, it will provide an unintentional hint as to the correct
answer, which contains the term "business." In order to maintain the consistency of making
version 1 a word-for-word translation, we translated "entrepreneur" as "businessman" in version
1 and "enterpriser" in version 2, while leaving the multiple choices as they are. As a result, the
percentage of correct answers for version 1 (49.4 percent) was much higher than the percentage
for version 2 (39.8 percent). In fact, the percentage of the correct answer for version 1 is close to
that in the United States (52 percent). This implies that researchers might overestimate Korean
students' understanding of an entrepreneur if they translate the term word-for-word.
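One way to examine a difference in the percentage of correct answers, as opposed to the whole answer distribution, is a pooled two-proportion z-test. The text does not state which test was used for this comparison, so the following is only a sketch under stated assumptions: the function name `two_proportion_z` is hypothetical, and the sample sizes of 235 and 251 are taken from the exclusions described in note 9.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-sided z-test for the difference between two proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


N1, N2 = 235, 251   # assumed students retained for versions 1 and 2 (note 9)

# Question 24: share of correct answers in version 1 and version 2 (table 8).
z24, p24 = two_proportion_z(0.494, N1, 0.398, N2)
print(f"question 24: z = {z24:.2f}, two-sided p = {p24:.3f}")
```

Because this test looks only at the correct choice, it can flag a difference even when a rank-based test on the full distribution of answers does not.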
Question 25 asks the meaning of "profit." While the English term "profit" means "gain," "benefit,"
or "business profit" in Korean, an individual's "profit" is often translated as his/her "gain" in
Korean. Likewise, the English term "revenue" means "revenue" or "sales" in Korean. In order to
maintain the consistency of making version 1 a word-for-word translation, we translated "profit"
and "revenue" as "gain" and "revenue" in version 1, while replacing these terms with "business
profit" and "sales," respectively, in version 2. As a result, the percentage of correct answers
for version 2 (60.6 percent) was much higher than the percentage for version 1 (53.6 percent).
However, the percentage of correct answers is much lower than that in the United States (76
percent) regardless of the type of translation. The measured difference in economic understanding
between the two countries thus depends on the method of translation (see note 12).

Familiarity of Examples
Some questions in the BET use specific examples to measure students' economic understanding.
In order to identify the potential translation bias from the familiarity of examples, we prepared
two differently translated versions of the test. Table 9 shows the percentage responses to the
alternatives for the questions in the Examples category. First, the joint WMW test for three
questions rejects the null hypothesis of homogeneity at the 5-percent significance level. Second,
the WMW test for each question rejects the null hypothesis of homogeneity at the 5-percent
significance level for question 4 and at the 10-percent level for question 11.
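The reading of the per-question p-values can be made explicit with a small sketch. The p-values are those reported in table 9; the function name `significance_label` and the label strings are illustrative choices, not part of the study.

```python
# Per-question WMW p-values as reported in table 9 (Examples category).
p_values = {4: 0.000, 11: 0.099, 17: 0.299}


def significance_label(p):
    """Map a p-value to the strictest level at which homogeneity is rejected."""
    if p < 0.05:
        return "5 percent"
    if p < 0.10:
        return "10 percent"
    return "not significant"


labels = {item: significance_label(p) for item, p in p_values.items()}
for item, label in labels.items():
    print(f"question {item}: {label}")
# prints:
# question 4: 5 percent
# question 11: 10 percent
# question 17: not significant
```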
Question 4 is about an example of a capital good. In version 2, we replaced "truck driver" and
"cement truck" with "worker" and "factory machine," respectively, while leaving the question
and other choices as they are. In other words, we replaced the specific examples in version 1 with
ones that would be more familiar to Korean students in version 2.
The students showed a significant difference in the distributions of their answers at the
5-percent significance level. The percentage of correct answers for version 2 (41.0 percent) was
close to twice the percentage for version 1 (24.7 percent). This result implies that the familiarity
of the examples can affect the measured level of the students' economic understanding.

TABLE 9
Percentage Responses to Alternatives of the Questions in the Examples Category

        Version 1                            Version 2                            WMW Test (p-value)
Item    a      b      c      d      Blank    a      b      c      d      Blank    Each       Joint
4       22.1   35.7   14.9   24.7   2.6      6.4    19.5   32.3   41.0   0.8      0.000**    0.000**
11      86.4   8.1    0.9    4.7    0.0      80.9   9.6    4.4    5.2    0.0      0.099*
17      80.9   8.9    8.1    1.7    0.4      78.1   6.0    10.4   5.2    0.4      0.299

Notes: The shaded cells denote the correct responses. Boldface denotes the highest proportion of response.
*p < .1; **p < .05.

Scale of Numbers
Currency differs across countries, so either U.S. dollars or a nation's own currency can be used
for the translated tests, which means that the scale of numbers may differ depending on how
researchers treat their currency. One U.S. dollar is usually converted to 1,000 Korean won in
order to avoid complicated calculations (the average exchange rate was 1,150 Korean won in
2010). As the numbers gain more digits in the Korean version, they may
hinder the students from focusing on the economic concepts in the test. In other words, U.S.
students only have to subtract 10 from 20, while Korean students have to subtract 10,000 from
20,000.

TABLE 10
Percentage Responses to Alternatives of the Questions in the Numbers Category

        Version 1                            Version 2                            WMW Test (p-value)
Item    a      b      c      d      Blank    a      b      c      d      Blank    Each       Joint
19      92.3   1.7    2.1    3.8    0.0      91.2   1.2    4.8    2.8    0.0      0.678      0.783
37      6.4    8.9    6.4    77.4   0.9      8.8    11.6   5.2    74.5   0.0      0.404

Notes: The shaded cells denote the correct responses. Boldface denotes the highest proportion of response.
*p < .1; **p < .05.

In order to check the effects of the scale of numbers, we translated $10 as 10,000 Korean
won or 10K Korean won. However, contrary to our expectations, we could not find a significant
difference in the distribution of the answers or in the percentage of correct answers for questions
19 and 37, as shown in table 10. It seems that the difference between a double-digit number and
a five-digit number does not have a particular effect on students in grades 4 to 6 when the
subtraction is quite simple. However, we expect that the answers selected by the
students might vary greatly if the question demanded more complicated calculations.
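The change of scale described above is simple arithmetic, sketched below. The rate of 1,000 won per dollar and the 2010 average of 1,150 won come from the text; the helper names `to_won` and `digits` are hypothetical.

```python
SIMPLIFIED_RATE = 1_000     # won per dollar used in the translated test
AVERAGE_RATE_2010 = 1_150   # average exchange rate in 2010, as cited in the text


def to_won(dollars, rate=SIMPLIFIED_RATE):
    """Convert a dollar amount to Korean won at the given rate."""
    return dollars * rate


def digits(n):
    """Number of digits in a whole number."""
    return len(str(abs(int(n))))


us_answer = 20 - 10                      # the subtraction U.S. students face
kr_answer = to_won(20) - to_won(10)      # the same subtraction in won

print(us_answer, digits(20))             # 10 2
print(f"{kr_answer:,}", digits(to_won(20)))  # 10,000 5
```

Using the rounded rate keeps the won amounts as round numbers, whereas the 2010 average rate would turn $10 into 11,500 won and make the subtraction less clean.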
CONCLUSION
In order to conduct a study comparing students' economic understanding across
countries, researchers need to translate the original text into their own languages if English is not
their primary language. For instance, many studies administer a U.S.-standardized test to Korean
students and interpret the differences in mean scores as different levels of economic understanding,
partly due to different economics curricula or content in textbooks.
This study, however, casts doubt on these conclusions. The assessment results could differ
based on how the researcher translates the English-written exam. In order to identify the effects
of a translation bias on the assessment results, two differently translated versions of the BET were
given to two homogeneous groups of elementary school students in Korea.
First, we found some questions that yielded significantly different distributions of the chosen
answers depending upon two different translations: The first was a literal translation from English
to Korean; the second was a translation that reflected differences in syntax, vocabulary, and the
like, between the two languages. We identified some sources from the test results that possibly
brought about such a translation bias. The reasons why the two translated versions
yielded significant differences in the distributions of the chosen answers can be categorized as
follows: In the second version, the text was translated into idiomatic Korean expressions rather
than word-for-word, examples and cases familiar to Korean students were presented instead of
those in the original BET, the construction of sentences and phrases was altered into a form that
was appropriate to Korean, and the meanings of economic terms that are not included in the
Korean academic curriculum were described instead of being literally translated.

Second, we also found some questions that yielded great differences in the percentages of
correct answers without showing significant differences in the distributions of chosen answers.
These differences tend to distort the results of international comparison since the comparison
of the students' economic understanding is based on the percentage of correct answers and the
score.
Therefore, it is important to carefully interpret the assessment results of international comparisons. Low scores do not necessarily imply that the level of economic understanding of Korean
students is lower than that of U.S. students, and vice versa. The possibility of overestimation
or underestimation indicates that researchers need to carefully translate a standardized test for
a cross-country study. It would be desirable to perform pilot tests with different translations to
see if there is a translation bias and to rewrite questions as necessary. Our study is applicable not
only to international comparisons of economic literacy, but also to any comparison of knowledge
across cultures or languages among students within a single country.
Many challenges remain for future research. We believe that there are more sources that
might bring about translation bias in addition to those addressed in this study. It is worthwhile
to identify those sources and to measure their effects on test results. We also need to expand
the pool of subjects from elementary school students to middle or high school students. Because
the texts used in the BET are simpler and shorter than those in the TEK, TEL, and TUCE, the
influence of a translation bias found from elementary school students might differ for middle or
high school students. Finally, it would be interesting to identify whether a student's gender, grade
point average, or language ability affects translation bias.

NOTES
1. The PISA is an international assessment of scholastic performance of 15-year-old school pupils in
OECD member countries. It was first administered in 2000 and is repeated every three years.
2. The TIMSS is an international assessment of the mathematics and science knowledge of fourth- and
eighth-grade students around the world. It was first performed in 1995 and is repeated every five years.
3. We thank Kristin Chase for pointing out a broad range of studies to which our results are applicable.
4. We considered a total of 52 questions because there are 8 common questions in forms A and B. However,
we decided not to include all these questions for our study because of time constraints for elementary
school students. We selected 34 questions that might generate translation bias and chose 3 common
questions to test the homogeneity of the two test groups.
5. We first translated the test by ourselves and sent the benchmark versions of the test to professional
linguists for review. After reflecting on several comments, we finalized the translated versions of the
test.
6. Even though it is a word-for-word translation, we modified the grammatical structure of the English
version according to Korean grammar.
7. A translated copy of the BET is available from the authors upon request.
8. We selected schools by considering both administrative area and region in which they are located (city
versus rural). We tried to select them randomly, although this is not purely random in a statistical sense.
There might be a potential self-selection bias if higher-income schools are more likely to administer a
test.
9. For this study, 243 students and 252 students were asked to take version 1 and version 2, respectively.
However, 8 students in version 1 and 1 student in version 2 scored less than 20 out of 100. We felt
that these students did not show adequate effort on the test and dropped them from our sample in order
to preserve the accuracy of our measurements. However, this change does not qualitatively affect our
main results.
10. It seems that the question in version 1 is crystal clear in English, as an anonymous referee pointed
out. "The amount sellers produce" is often translated into one word meaning "the amount to sell" in
Korean. Likewise, "the amount buyers want to buy" is translated into one word meaning "the amount
to purchase" in Korean.
11. The percentage of correct answers for the BET was 36 percent in the United States.
12. We translated "profit" as "gain" in version 1 and "business profit" in version 2 for question 18, in which
"profit" is not the correct answer. As a result, neither the distributions of the answers nor the percentages
of the correct answer showed a significant difference.

REFERENCES
Asano, T. 2002. Economic literacy of Japanese high school and university students based on TEL3. Korean Journal of
Economic Education 9:193–213.
Chizmar, J. F., R. S. Halinski, and B. J. McCarney. 1980. Basic economics test: Forms A and B. New York: Council on
Economic Education.
Chizmar, J. F., R. S. Halinski, and B. J. McCarney. 1981. Basic economics test: Examiner's manual. New York: Council
on Economic Education.
Hahn, J. 2002. Economic knowledge of Korean elementary school students: Relative weakness and contributing personal
factors. Journal of Curriculum and Evaluation 5(1):163–75.
Hahn, J. 2006. Measuring the economic knowledge of Korean elementary students. Korean Journal of Economic Education
13(2):89–113. (in Korean)
Kim, K. 1993. Economic understanding of Korean high school students. Seoul, South Korea: Korea Development Institute
(KDI). (in Korean)
Kim, K. 1994. The level of economic knowledge of Korean middle school students. Seoul, South Korea: Korea
Development Institute (KDI). (in Korean)
Kim, S. 2008. International comparison of university students' knowledge of economics: Korea, the United States, and
Japan. Korean Journal of Economic Education 15(2):65–88. (in Korean)
Olsen, R. V., A. Turmo, and S. Lie. 2001. Learning about students' knowledge and thinking in science through large-scale
quantitative studies. European Journal of Psychology of Education 16(3):403–20.
Walstad, W. B., and K. Rebeck. 2001. Test of economic literacy: Examiner's manual. 3rd ed. New York: Council for
Economic Education.
Walstad, W. B., K. Rebeck, and R. B. Butters. 2010a. Basic economics test: Examiner's manual. 3rd ed. New York:
Council for Economic Education.
Walstad, W. B., K. Rebeck, and R. B. Butters. 2010b. Test of economic knowledge: Examiner's manual. 2nd ed. New York:
Council for Economic Education.
Walstad, W. B., and D. Robson. 1990. Basic economics test: Examiner's manual. 2nd ed. New York: Council for Economic
Education.
Walstad, W. B., M. Watts, and K. Rebeck. 2007. Test of understanding in college economics: Examiner's manual. 4th ed.
New York: Council for Economic Education.
Whitehead, D. J., and T. Halil. 1991. Economic literacy in the United Kingdom and the United States: A comparative
study. Journal of Economic Education 22(2):101–10.
Wuttke, J. 2008. Uncertainties and bias in PISA. http://www.messen-und-deuten.de/pisa/Wuttke2007a.pdf (accessed
August 6, 2011).
Yamaoka, M., W. B. Walstad, M. W. Watts, T. Asano, and S. Abe. 2010. Comparative studies on economic education in
Asia-Pacific region. Tokyo, Japan: Shumpusha Publishing.
