Translation

Journal of Evaluation in Clinical Practice ISSN 1365-2753
Translation, adaptation and validation of instruments or

scales for use in cross-cultural health care research:
a clear and user-friendly guideline jep_1434 268..274
Valmi D. Sousa PhD RN and Wilaiporn Rojjanasrirat PhD RNC IBCLC2

1
1
Associate Professor, The University of Kansas School of Nursing, Kansas City, Kansas, USA
2
Associate Professor, Graceland University School of Nursing, Independence, Missouri, USA
Keywords Abstract
back-translation, cross-cultural validation,
health care research, translation Rationale, aims and objectives The diversity of the population worldwide suggests a
great need for cross-culturally validated research instruments or scales. Researchers and
Correspondence clinicians must have access to reliable and valid measures of concepts of interest in their
Valmi D. Sousa own cultures and languages to conduct cross-cultural research and/or provide quality
The University of Kansas School of Nursing patient care. Although there are well-established methodological approaches for translat-
3901Rainbow Boulevard ing, adapting and validating instruments or scales for use in cross-cultural health care
Kansas City, KS 66160 research, a great variation in the use of these approaches continues to prevail in the health
USA care literature. Therefore, the objectives of this scholarly paper were to review published
E-mail: vsousa@kumc.edu recommendations of cross-cultural validation of instruments and scales, and to propose and
present a clear and user-friendly guideline for the translation, adaptation and validation of
Accepted for publication: 3 February 2010 instruments or scales for cross-cultural health care research.
Methods A review of highly recommended methodological approaches to translation,
doi:10.1111/j.1365-2753.2010.01434.x adaptation and cross-cultural validation of research instruments or scales was performed.
Recommendations were summarized and incorporated into a seven-step guideline. Each
one of the steps was described and key points were highlighted. Example of a project using
the proposed steps of the guideline was fully described.
Conclusions Translation, adaptation and validation of instruments or scales for cross-
cultural research is very time-consuming and requires careful planning and the adoption of
rigorous methodological approaches to derive a reliable and valid measure of the concept
of interest in the target population.
Introduction Background and significance

Globalization and migration have contributed to an increasing The increase in diverse populations worldwide and the need for
diversity of the population in many countries, particularly in the cross-cultural and multinational research indicate a great need for
USA, where the background of the population is extremely clinicians and researchers to have access to reliable and valid
diverse regarding culture, language and ethnicity [1,2]. Therefore, instruments or measures cross-validated among diverse cultural
there is a need for relevant cross-cultural research to addresses a segments of the population and/or in other languages [3,4,6]. This
number of problems among these multinational and multicultural would enhance the validity, the generalization and the translation
populations. However, health care researchers conducting cross- of cross-cultural health care research. Although there are well-
cultural studies must have access to reliable and cross-validated established methodological approaches for translating, adapting
instruments in other cultures and/or in other languages [3–6]. and validating instruments for use in cross-cultural health care
Findings from cross-cultural research may have great clinical research [2–5,7–15], a great variation in the use of these
implications for physicians, nurses and other health care profes- approaches continues to prevail in the health care literature.
sionals who provide care for diverse populations because the A recent review of 47 methodological studies focusing on the
delivery of quality care depends on the accurate assessment and translation and validation of instruments for cross-cultural
deeper understanding of an individual’s cultural, linguistic and research reported that the quality and methodological approaches
ethnic background. of the reviewed studies varied greatly [16]. There was no clear
268 © 2010 Blackwell Publishing Ltd, Journal of Evaluation in Clinical Practice 17 (2011) 268–274
V.D. Sousa and W. Rojjanasrirat Validation of instruments or scales
consensus among researchers on how the approaches should be approaches, we have incorporated the most recommended
used or combined, a great variation on the qualifications of trans- ones in a user-friendly guideline to facilitate adoption, consis-
lators, and a lack of detailed information about the translation, tency and use.
back-translation, validation, testing, and revision and refinement
of the instruments.
Corroborating these findings, another researcher [2] reported Step 1: translation of the original instrument
that, unfortunately, translating, adapting and cross-culturally into the target language (forward translation
validating research an instrument is treated as an unimportant or one-way translation)
step of study protocols. Furthermore, the most commonly used The instrument in the source (original) language is forward
and reported methodological approach was the forward trans- translated to the TL (target language) by at least two independent
lation only, often using an unqualified translator. In addition, translators, preferably certified, whose mother language is the
the same author [2] reported that only a few researchers have desired TL of the instrument. The translators must be bilingual
described the use of strategies and steps or processes for the (i.e. fluent in the source and desired TL of the instrument) and
adaptation and/or validation of the instruments, and empha- preferably bicultural (i.e. having in-depth experience in the culture
sized that it is not sufficient to forward translate an instrument of the source and desired TL of the instrument). In addition, the
without carefully evaluating its adaptation and cross-cultural two translators must have distinct backgrounds. The first translator
validation. The procedure should consist of a comprehensive must be knowledgeable about health care terminology and the
process that involves not only translation of an instrument, content area of the construct of the instrument in the desired TL.
but also thorough evaluation of its adaptation and cross-cultural The second translator must be familiar with colloquial phrases,
validation. health care slang and jargon, idiomatic expressions, and emotional
These data suggest that despite the existing recommendations terms in common use in the desired TL. The second translator
and guidelines to use a comprehensive multistep process for should not be knowledgeable about medical terminology and/or
translating, adapting and cross-validating instruments, research- the construct of the instrument. This approach will generate two
ers have not been doing this. It may be that the methodological translated versions that contain words and sentences that cover
approaches are not clearly presented in a user-friendly format, both the medical and the usual spoken language with its cultural
which makes it difficult for researchers to adopt and follow nuances. Therefore, choosing well-qualified translators is the key
the recommendations. Therefore, the objectives of this scholarly to high-quality translations. If resources are available, translations
paper were to review published recommendations of cross- can also be done by two teams of independent translators (each
cultural validation of instruments and scales, and to propose and team of translators must have the same characteristics as the two
present the highly recommended methodological approaches individual independent translators described above), which may
for translating, adapting and validating instruments for cross- result in higher-quality translations by minimizing the introduc-
cultural health care research in a clear and a user-friendly guide- tion of personal idiosyncrasies when using only two independent
line. This would lead to better understanding and use of translators.
these approaches by health care researchers, particularly nurse
researchers worldwide.
Key points
1 Instrument in the SL → translated to TL (TL1 and TL2) to
produce two forward-translated versions of the instrument.
Methodological approaches 2 Use two bilingual and bicultural translators whose mother
Of the two main categories of translation (symmetrical and asym- language is the desired TL, but who have distinct backgrounds:
metrical), the symmetrical category is the most recommended • One translator must be knowledgeable about health termi-
approach because it refers to faithfulness of meaning and collo- nology and the content area of the construct of the instrument
quialness in both the source language (SL; original language of in the TL.
the instrument) and the target language (TL; desired language) • The other translator must be knowledgeable about the cultural
and not to a literal translation [12]. The purpose of translation is and linguistic nuances of the TL.
to achieve equivalence between the instrument in the SL and the 3 Two independent teams of translators can also be used (each
instrument in the TL [2]. The symmetrical translation is the only team of translators must have the same characteristics of the two
category that facilitates the comparison of responses from indi- individual independent translators).
viduals of one culture to those of another [6,12,13] and the deter-
mination of the most relevant types of cross-cultural equivalence:
Step 2: comparison of the two translated
the semantic, conceptual, content, technical and criterion [1,17].
versions of the instrument (TL1 and
In addition, the process known as centring [7], in which both
TL2): synthesis I
the SL and the TL of an instrument are equally important, should
be used. The instructions, the items and the response format of the two
The process of translation, adaptation and cross-cultural forward-translated versions of the instrument (TL1 and TL2) and
validation of an instrument for use in other cultures, languages both the TL1 and the TL2 with the original version of the instru-
and countries requires careful planning and adoption of com- ment in the SL are initially compared by a third bilingual and
prehensive, rigorous and most established methodological preferably bicultural independent translator regarding ambigui-
approaches [2–5,7–15]. Because there are variations among these ties and discrepancies of words, sentences and meanings. Any
© 2010 Blackwell Publishing Ltd 269

Validation of instruments or scales V.D. Sousa and W. Rojjanasrirat
ambiguities and discrepancies must be discussed and resolved 3 Two independent teams of translators can also be used (each
using a committee approach. Consensus should be achieved with team of translators must have the same characteristics of the two
the participation of the third translator, the two translators from individual independent translators).
Step 1, and the investigator and/or other members of the research
team. This process will generate the preliminary initial translated
Step 4: comparison of the two back-translated
version of the instrument in the TL (PI-TL).
versions of the instrument (B-TL1 and B-TL2):
synthesis II
Key points
Initially, the instructions, items and response format of the two
1 Use a third independent translator to compare the TL1 and TL2,
back-translations (B-TL1 and BTL2) are compared by a multi-
and to compare both the TL1 and TL2 with the SL version of the
disciplinary committee with the instructions, items and response
instrument.
format of the original instrument in the SL regarding format,
2 Use a committee approach (third independent individual or
wording, and grammatical structure of the sentences, similarity
translator, translators who participated in Step 1, and investigator
in meaning, and relevance. It is highly recommended that the
and/or other members of research team) to resolve ambiguities and
committee should include at least one methodologist (who can be
discrepancies and derive the PI-TL)
the investigator and/or a member of the research team), one health
care professional who is familiar with the content areas of the
Step 3: blind back-translation (blind backward construct of the instrument, and all four bilingual and bicultural
translation or blind double translation) of translators involved in Step 1 (forward translation of the instru-
the preliminary initial translated version ment into the TL) and Step 3 (back-translation of the instrument
of the instrument from the TL into the SL). It is also recommended that the devel-
oper of the original instrument in the SL participated and provide
The PI-TL is translated back into the SL by two other independent
insights on the construct of the instrument and clarify any ques-
translators with the same qualifications and characteristics
tions that might arise. Having at least one monolingual committee
described above in Step 1. For this step, the translators mother
member whose mother language is the TL of the instrument would
language should be the SL of the original instrument, and they
enhance the quality of the pre-final version of the translated in-
should be completely blind to the original version of the instru-
strument. Any ambiguities and discrepancies regarding cultural
ment (they had never seen the original version of the instrument).
meaning and colloquialisms or idioms in words and sentences of
They will produce two back-translated versions of the instrument.
the instructions, the items, and the response format between the
Again, the first translator must be knowledgeable about health care
two back-translations (B-TL1 and B-TL2) and between each one
terminology and the content area of the construct of the instrument
of the two back-translations (B-TL1 and B-TL2) and the original
in the SL, but no prior knowledge of the instrument being back-
instrument in the SL are discussed and resolved through consensus
translated. The second translator must be familiar with colloquial
among the committee members to derive a pre-final version of
phrases, health care slang and jargon, idiomatic expressions,
the instrument in the TL (P-FTL).
and emotional terms in common in the SL. The second translator
If discrepancies cannot be resolved, it may be necessary to
should not be knowledgeable about medical terminology and/or
repeat Steps 1 though 4: two other independent bilingual and
construct of the instrument and should have no prior knowledge of
bicultural translators must be used to translate the original instru-
the instrument being back-translated as well. If resources are avail-
ment (SL) again to generate two translations, and two other inde-
able, back-translation can also be done by two teams of translators,
pendent bilingual and bicultural translators must be used to back-
which may result in higher-quality back-translations by minimiz-
translate the translated versions of the instrument (TL) following
ing the introduction of personal idiosyncrasies when using only
the same procedure described above (known as the repetition
one independent back-translator to generate each initial back-
approach). Alternatively, only items that do not retain their original
translated version of the instrument. This process will result in
meaning are re-translated and back-translated. The evaluation of
two back-translated versions of the instrument in its original
the translated and back-translated versions follows the same vali-
SL (B-TL1 and B-TL2). This step allows for clarification of
dation process described above. This process is repeated until no
words and sentences used in the translations. As noted in Step 1,
ambiguities or discrepancies are found.
choosing well-qualified translators is the key to high-quality
These methodological approaches of Step 4 will establish the
back-translations.
initial conceptual, semantic and content equivalence of the P-FTL.
Conceptual equivalence refers to the degree to which a concept of
Key points the items of the instrument exists in both the source and target
1 The PI-TL → Back-translated to SL (B-TL1 and B-TL2) to cultures. Semantic equivalence refers to sentence structure, collo-
produce two back-translated versions. quialisms or idioms that ensure that the meaning of the text or idea
2 Use two bilingual and bicultural translators whose mother of the items of the instrument in the SL is present in the TL.
language is the SL, but who have distinct backgrounds: Finally, content equivalence refers to the relevance and pertinence
• One translator must be knowledgeable about health termi- of the text or idea of the items of the instrument in each culture.
nology and the content area of the construct of the instrument The committee’s role is to evaluate, revise and consolidate the
in the SL. instructions, items and response format of the back-translated
• The other translator must be knowledgeable about the cultural instruments that have conceptual, semantic and content equiva-
and linguistic nuances of the SL. lency and to develop the P-FTL for pilot and psychometric testing.
270 © 2010 Blackwell Publishing Ltd

language clearer. Instructions, response format and items of

Key points
the instrument that are found to be unclear by at least 20% of the
1 Comparison between the two back-translations (B-TL1 and
committee members must be revised and re-evaluated [19]. The
B-TL2) of the instrument, and between both BTL1 and B-TL2 and
minimum inter-rater agreement among the experts panel is 80%).
the original SL instrument:
This process will further determine the conceptual equivalence
• Evaluate similarity of the instructions, items and response of the translated instrument.
format regarding wording, sentence structure, meaning and
The expert panel is then asked to evaluate each item of the
relevance.
instrument for content equivalence (content-related validity [rel-
2 Use a multidisciplinary committee:
evance]) using the following scale: 1 = not relevant; 2 = unable
• One methodologist (researcher or a member of the research to assess relevance; 3 = relevant but needs minor alteration;
team).
4 = very relevant and succinct. Items classified as 1 (not relevant)
• One health care professional. or 2 (unable to assess relevance) should be revised [20,21].
• All four bilingual and bicultural translators used in Step 1 and Content validity index at the item level (I-CVI) and at the scale
Step 3: two translators whose mother language is the desired TL
level (S-CVI) should be calculated. There are three methods to
of the instrument and two translators whose mother language
calculate S-CVI [20–22], but the averaging calculation (S-CVA/
is the SL of the original instrument.
Ave) method is preferred [22]. Using 10 experts, the I-CVI of
3 If possible, developer of the original instrument should parti-
0.78 or above [20] and S-CVA/Ave of 0.90 or above [21] are
cipate in the discussions.
the minimum acceptable indices. Items that do not achieve the
4 If ambiguities and discrepancies cannot be resolved, Steps 1
minimum acceptable indices are revised and re-evaluated. New
through 4 may be repeated as many times as necessary. Alter-
content validity indices are calculated. The process continues until
natively, only items that do not retain their original meaning are
acceptable indices of content-related validity or content equiva-
re-translated and back-translated.
lence are achieved. It is also recommended that the kappa coeffi-
cient of agreement be determined to increase confidence in the
content validity of the instrument [23]. A kappa of 0.60 is generally
Step 5: pilot testing of the pre-final version of
the minimum acceptable coefficient to determine good agreement
the instrument in the target language with a
[24]. The purpose of Step 5 is to continue developing the P-FTL
monolingual sample: cognitive debriefing
for pre-field test for preliminary and/or full psychometric testing.
The P-FTL is pilot tested among participants whose language is
the TL of the instrument to evaluate the instructions, response Key points
format and the items of the instrument for clarity. Participants 1 Pilot test of the P-FTL among individuals whose language is the
should be recruited from the target population in which the instru- TL of the instrument:
ment will be used (e.g. if the instrument measures self-care among • Evaluate the instructions, items and response format clarity.
individuals with type 2 diabetes, then the sample must consist of • Use a sample size of 10–40 participants.
individuals with type 2 diabetes). A sample size of 10–40 individu- 2 It is highly recommended to use an expert panel to further
als is recommended [3,18]. Each participant is asked to rate the examine the instrument for:
instructions and items of the scale using a dichotomous scale • Clarity of the instructions, items and response format.
(clear or unclear). Participants who rate the instructions, response • Content equivalence (content-related validity) using I-CVI,
format or any item of the instrument as unclear are asked to provide S-CVI/Ave and Kappa coefficient of agreement.
suggestions as to how to rewrite the statements to make the lan- • Use a sample of 6–10 experts (10 experts are preferred).
guage clearer. Instructions, response format and items of the instru-
ment that are found to be unclear by at least 20% of the sample must
Step 6: preliminary psychometric testing of the
be re-evaluated [19]. Therefore, the minimum inter-rater agreement
pre-final version of the translated instrument
among the sample is 80%. This step is used to further support the
with a bilingual sample
conceptual, semantic and content equivalency of the translated
instrument and further improve the structure of sentences used This step is rarely used; however, when a bilingual population is
in the instructions and items of the P-FTL to be easily understood accessible, it is recommended that the instrument be pre-field
by the target population prior to psychometric testing. tested among bilingual individuals (fluent in the SL of the original
To further determine the conceptual and content equivalence of instrument and the TL of the translated instrument). If this is not
the items of the P-FTL, use of an expert panel, is highly recom- possible, skip this step and move to Step 7. In general, the recom-
mended. The instructions, response format and the items of the mendation is to use at least five subjects per item of the instrument
instrument are evaluated for conceptual equivalence (clarity) by to conduct the preliminary psychometric testing of a new instru-
six to ten members of an expert panel [20,21] who are knowledge- ment [25]. Ideally, the bilingual sample should be from the target
able about the content areas of the construct of the instrument population in which the instrument will be used (e.g. adult indi-
and the target population in which the instrument will be used viduals with type 2 diabetes, African–American women with heart
and whose mother language is the TL of the instrument. When failure). However, in many instances, this may be difficult and
possible, a committee of 10 members is preferable [20,21]. Each unrealistic; thus other alternatives can be used such as sampling
member of the committee who rates the instructions, response bilingual college students and faculty or workers in travel agen-
format or any item of the instrument as unclear is asked to provide cies, currency exchange agencies, international trade companies,
suggestions as to how to rewrite the statements and make the embassies and consulates, and language schools.

Initially, participants are given the P-FTL and are asked to reliability (or sensitivity and specificity); (2) stability reliability
answer the items. The participants respond to the items of the (test–retest reliability); (3) homogeneity; (4) construct-related
P-FTL without seeing the original instrument in the SL. After validity such as convergent and/or divergent (discriminant) valid-
completion of the P-FTL, participants are given the original in- ity; (5) criterion-related validity such as concurrent and/or
strument in the SL and are asked to answer the items. They may predictive validity; (6) factor structure of the instrument (dimen-
complete a demographic questionnaire and/or other instruments of sionality); and (7) model fit. Although, it is not the purpose of this
interest. The order of the items of the original instrument must be user-friendly guideline to describe the many statistical approaches
mixed to be in a different order from that of the items of the P-FTL. that can be used in Step 7, the most common statistical approaches
Responses on both versions of the instrument are then compared are scale and item analysis, Pearson’s correlation analysis, explor-
(i.e. interpretation of scores is the same in both cultures) to estab- atory factor analysis and confirmatory factor analysis. The purpose
lish criterion equivalency (a type of construct validity). Statistical of the Step 7 is to revise and refine the items of the P-FTL as
analyses used for comparison purposes may consist of descriptive needed to derive the final psychometrically sound FTL consisting
statistics, correlation coefficients, and paired t-test or one-way of adequate estimates of reliability, homogeneity, and validity and
anova. Scale and item analysis is also used to establish the initial with a stable factor structure and/or model fit.
preliminary psychometric properties of the instrument (internal
consistency reliability) and to compare these properties of the Key points
P-FTL with the SL of the original instrument. When the instrument 1 Full psychometric testing of the P-FTL among individuals from
purpose is to serve as a diagnostic or screening testing, preliminary the target population to:
calculation of sensitivity and specificity is recommended. This • Revise and refine the items of the final version of the instru-
Step 6 also determines initial technical equivalency (the method ment in the TL.
of assessment) and is useful to support the conceptual, semantic, • Establish internal consistency reliability (or sensitivity and
content and construct validity of the P-FTL prior to conducting full specificity), stability reliability, homogeneity, construct-related
psychometric field testing. validity, criterion-related validity, factor structure and model fit
of the instrument.
Key points 2 Use at least 10 subjects per item of the instrument for general
1 When possible, pilot test the P-FTL among bilingual individuals psychometric approaches (scale and item analysis, Pearson’s cor-
to: relations and exploratory factor analysis).
• Compare the P-FTL and the SL instrument in the SL. 3 Use 300–500 subjects for confirmatory factor analysis or
• Establish criterion equivalency and further support the conduct a power analysis.
conceptual, semantic, content and construct equivalency of the
P-FTL. Example of a project to translate, adapt
2 Use at least five subjects per item of the instrument. and validate a research instrument
3 Subjects complete the P-FTL first without seeing the original A project to translate, adapt and cross-validate an instrument for
instrument in the SL. cross-cultural research may take several years; and it is normally
4 Subjects complete the original instrument in the SL in which conducted using more than one study to adhere to the recom-
items have been mixed in different order from the P-FTL. mended methodological approaches described above. One study
might set as its initial goal to translate, adapt and cross-validate a
Step 7: full psychometric testing of the research instrument using Steps 1, 2, 3, 4 and 5 only. In a second
pre-final version of the translated instrument study, the researchers might set a single goal to establish the
in a sample of the target population preliminary psychometrics of the translated instrument with
bilingual participants using Step 6. Then, in a third study, the
This last step is used to establish the initial full psychometric
researchers’ goal might be to establish the initial full psychometric
properties of the newly translated, adapted and cross-validated
properties of a translated instrument in a sample of the target
instrument with a sample of the target population of interest. The
population of interest using the approaches described in Step 7.
sample size for this step depends on the types of psychometric
Depending on the psychometric approaches used in this third
approaches that will be used. The more complete the psychometric
study, other studies might be necessary to continue the develop-
approaches for evaluation of the translated instrument the more
ment and psychometric evaluation of the translated instrument. To
confidence will be generated in its reliability and validity proper-
illustrate the use of the methodological steps to translate, adapt and
ties. In general, per rule of thumb, it is highly recommended to use
cross-validate an instrument and to evaluate the preliminary and
at least 10 subjects per item of the instrument scale and item
initial full psychometric properties of an instrument, we present
analysis and exploratory factor analysis [25–28]. If there is a plan
an example of a project that used two studies to adhere to the
to use confirmatory factor analysis to test the factor structure of the
recommended guideline.
instrument, the recommendation per rule of thumb is approxi-
mately 300–500 subjects per item of the instrument [28,29]. Power
Study one: cross-cultural equivalence and
analysis based on the number of degrees of freedom, an alpha level
psychometric properties of the Portuguese
(0.05 or 0.01), and a desired power (80% or above) can also be
version of the depressive cognition scale
calculated [30,31].
The most recommended and commonly used psychometric In this study [6], the depressive cognition scale (DCS) was trans-
approaches in this step are estimation of: (1) internal consistency lated from English into Portuguese using Steps 1 and 2, and

back-translated from Portuguese into English and cross-culturally 0.32 to 0.75 on the ECD and from 0.29 to 0.77 on the DCS). There
validated using Steps 3 and 4. Preliminary psychometric evalua- was no statistically significant difference between the overall mean
tion of the scale was conducted with a bilingual sample using Step score of the ECD (M = 5.05, SD = 3.49) and the overall mean
6. Note that Step 5, important to determine clarity of the instruc- score of the DCS (M = 5.20, SD = 3.56); t(1,39) = 0.76, P = 0.45.
tions, response format and sentence structure of the items, was At the item level, a statistically significant difference was found
not done because the committee was comprised of three members between the mean score of item 2 on the ECD (M = 0.80,
whose mother language was Portuguese who were convinced that SD = 0.76) and the mean score of item 2 on the DCS (M = 0.90,
all those aspects of the scale were completely clear. The DCS was SD = 0.74); t(1,39) = 2.08, P = 0.04. The findings of Step 6 sug-
originally developed in English to measure depressive cognitions gested adequate criterion equivalence and supported the concep-
[32]. The theoretical basis for the development of the scale was tual, semantic and content validity of the ECD. The ECD had an
Beck’s theory of depression and Erickson’s theory of psychosocial adequate internal consistency reliability, homogeneity and initial
development. The DCS consists of eight items on a 6-point Likert- support for construct validity. The full psychometric properties of
type scale ranging from 0 (strongly disagree) to 5 (strongly agree). the ECD could be further tested among individuals from the target
Each item of the scale measures a specific cognition: hopeless- population of interest.
ness, helplessness, purposelessness, powerlessness, worthlessness,
loneliness, emptiness and meaninglessness.
Study two: psychometric properties of the
Using Steps 1 through 4, the DCS English version was trans-
Portuguese version of the depressive cognition
lated into Portuguese by two bilingual translators who were fluent
scale in Brazilian adults with diabetes mellitus
in both English and Portuguese languages to generate two versions
of the translated scale. The Portuguese versions of the scale were The second study [33], was conducted to continue the psycho-
blindly back-translated into English by two different translators metric evaluation of the ECD using Step 7. The purpose of the
who never saw the original version of the DCS in English. The two study was to evaluate the full psychometric properties of the ECD
versions of the translated scale were compared with each other, among adults with diabetes mellitus (i.e. target population of inter-
and each one of these versions was compared with the original est). We used the recommended sample size of at least 10 subjects
English version of the scale to determine its conceptual, semantic per item of the scale. The sample consisted of 82 Brazilian adults
and content equivalence using a panel of three Brazilian bilinguals with diabetes mellitus. The estimate of internal consistency reli-
and an American monolingual expert in the content area with ability was a Cronbach’s alpha of 0.88. Scale and item analysis
consultation with the translators who participated in the translation indicated that the inter-item correlation coefficients ranged from
and back-translation of the versions of the scale. Ambiguities and 0.37 to 0.69 and the item-to-correlation coefficients ranged from
discrepancies regarding conceptual and semantic equivalence on 0.51 to 0.71. Exploratory factor analysis indicated that the ECD
two items that measured emptiness and purposelessness were dis- was unidimensional, explaining 56.73% of its item variance, and
cussed and resolved by the committee members. A final version of had factor loadings ranging from 0.60 to 0.80 and communality
the translated scale in Portuguese was derived and named ‘Escala values ranging from 0.36 to 0.67. In addition, the ECD total score
Cognitiva de Depressão (ECD).’ was statistically significantly correlated with the Portuguese
As we stated previously, we skipped Step 5 and proceeded to version of the Beck Depression Inventory (r = 0.24, P < 0.05).
Step 6 to evaluate the preliminary psychometrics of the Portuguese The study findings further supported the reliability, homogeneity
version of the scale (ECD), with a bilingual sample of individuals and construct-related validity of the ECD among Brazilians with
from the general population, mostly students and faculty members diabetes mellitus.
of a major Brazilian university. This sample was chosen because
of the anticipated difficulty to approach and recruit individuals
from the target population of interest (i.e. individuals with diabetes
Conclusions
mellitus). We used the recommended sample size of five subjects The diversity of the population worldwide shows a great need for
per item of the scale; therefore, 40 bilingual adults participated in cross-culturally validated instruments or scales. Translation, adap-
the study. Using a two-step approach for data collection, partici- tation and validation of an instrument or scale for cross-cultural
pants received and responded to the Portuguese version of the research is time-consuming and requires careful planning and
scale (ECD) only. Then participants answered a demographic adoption of rigorous methodological approaches to derive a reli-
questionnaire and completed the original English version of the able and valid measure of the concept of interest in the target
scale (DCS). We used descriptive statistics, scale and item analy- population. This scholarly paper has reviewed and incorporated
sis, and paired t-tests to compare the responses of participants in the highly recommended methodological approaches in a clear and
both the Portuguese (ECD) and the English (DCS) items of the user-friendly guideline. The adoption of symmetrical translation
scales. Descriptive statistics showed strong similarities in partici- using centring process will lead to more accurate adaptation and
pants’ responses on both the Portuguese and English versions of cross-cultural validation of a translated instrument. The steps of
the scale (identical responses ranged from 78% to 98%). Pearson’s the proposed guideline provide clear directions to cross-culturally
correlation coefficients between each item of the ECD and each validate research instruments or scales. Choosing the right trans-
item of the DCS ranged from 0.64 to 0.96 and were statistically lators and committee members is key aspect that must be consid-
significant (P < 0.001). Estimates of internal consistency reli- ered to enhance the quality of the translation, back-translation and
ability and homogeneity were similar in both the ECD and DCS cross-validation of an instrument or scale. Pilot testing of a trans-
versions of the scale (Cronbach’s alphas were 0.79 and 0.78, lated instrument or scale among participants whose language is the
respectively; item-to-total correlation coefficients ranged from TL of the instrument to evaluate the instructions, the items and the

response format of the instrument for clarity also enhances the Verjee-Lorenz, A. & Erikson, P. (2005) Principles of good practice for
quality of the final version of the translated instrument. Finally, the translation and cultural adaptation processs for patient-reported
using well-established approaches to test the preliminary psy- outcomes (PRO) measures: report of the ISPOR task force for trans-
chometric properties of the translated instrument or scale among lation and cultural adaptation. Value in Health, 8 (2), 94–104.
16. Maneesriwongul, W. & Dixon, J. K. (2004) Instrument translation
bilingual individuals and full psychometrics of the translated
process: a method review. Journal of Advanced Nursing, 48 (2), 175–
instrument or scale among individuals from the target population 186.
of interest will derive a reliable and valid scale in the TL of 17. Beck, C. T., Bernal, H. & Froman, R. D. (2003) Methods to document
interest. semantic equivalence of a translated scale. Research in Nursing and
Health, 26 (1), 64–73.
18. Sousa, V. D., Hartman, S. W., Miller, E. H. & Carroll, M. A. (2009)
References
New measures of diabetes self-care agency, diabetes self-efficacy, and
1. Hilton, A. & Skrutkowski, M. (2002) Translating instruments into diabetes self-management for insulin-treated individuals with type 2
other languages: development and testing processes. Cancer Nursing, diabetes. Journal of Clinical Nursing, 18 (9), 1305–1312.
25 (1), 1–7. 19. Topf, M. (1986) Three estimates of interrater reliability for nominal
2. Sperber, A. D. (2004) Translation and validation of study instru- data. Nursing Research, 35 (4), 253–255.
ments for cross-cultural research. Gastroenterology, 126 (1), S124– 20. Lynn, M. R. (1986) Determination and quantification of content valid-
S128. ity. Nursing Research, 35 (6), 382–385.
3. Beaton, D. E., Bombardier, C., Guillemin, F. & Ferraz, M. B. (2000) 21. Waltz, C. F., Strickland, O. L. & Lenz, E. R. (2005) Measurement in
Guidelines for the process of cross-cultural adaptation of self-report Nursing and Health Research, 3rd edn. New York: Springer Publishing
measures. Spine, 25 (24), 3186–3191. Company.
4. Beaton, D. E., Bombardier, C., Guillemin, F. & Ferraz, M. B. (2002) 22. Polit, D. F. & Beck, C. T. (2006) The content validity index: are you
Recommendations for the cross-cultural adaptation of health status sure you know what’s being reported? Critique and recommendations.
measures. Available at: http://www.dash.iwh.on.ca/assets/images/ Research in Nursing and Health, 29 (5), 489–497.
pdfs/xculture2002.pdf (last accessed 1 October 2009). 23. Wynd, C. A., Schmidt, B. & Schaefer, M. A. (2003) Two quantitative
5. Bracken, B. A. & Barona, A. (1991) State of the art procedures for approaches for estimating content validity. Western Journal of
translating, validating and using psychoeducational tests in cross- Nursing Research, 25 (5), 508–518.
cultural assessment. School Psychology International, 12 (1–2), 119– 24. Streiner, D. L. & Norman, G. R. (2008) Health Measurement Scales:
132. A Practical Guide for Their Development and Use, 4th edn. New York:
6. Sousa, V. D., Zauszniewski, J. A., Mendes, I. A. C. & Zanetti, M. L. Oxford University Press.
(2005) Cross-cultural equivalence and psychometric properties of 25. Nunnally, J. C. & Bernstein, I. H. (1994) Psychometric Theory, 3rd
the Portuguese version of the depressive cognition scale. Journal of edn. New York: McGraw-Hill.
Nursing Measurement, 13 (2), 87–99. 26. Hair, J. F. J., Anderson, R. E., Tatham, R. L. & Black, W. C. (1998)
7. Brislin, R. W. (1970) Back-translation for cross-cultural research. Multivariate Data Analysis, 5th edn. Englewood Cliffs, NJ: Prentice-
Journal of Cross-Cultural Psychology, 1 (3), 185–216. Hall, Inc.
8. Brislin, R. W. (1986) The wording and translation of research instru- 27. Stevens, J. (2002) Applied Multivariate Statistics for the Social
ments. In Field Methods in Cross-Cultural Research (eds W. J. Lonner Sciences, 4th edn. Mahwah, NJ: Lawrence Erlbaum Associates.
& J. W. Berry), pp. 137–164. Beverly Hills, CA: Sage Publications. 28. Tabachnick, B. G. & Fidell, L. S. (2001) Using Multivariate Statistics,
9. Chapman, D. W. & Carter, J. F. (1979) Translation procedures for the 4th edn. Needham Heights, MA: Allyn & Bacon.
cross cultural use of measurement instruments. Educational Evalua- 29. Gonzalez, R. & Griffen, D. (2001) Testing parameters in structure
tion and Policy Analysis, 1 (3), 71–76. equation modeling: every ‘one’ matters. Psychological Methods, 6 (3),
10. Guillemin, F., Bombardier, C. & Beaton, D. (1993) Cross-cultural 258–269.
adaptation of health-related quality of life measures: literature review 30. MacCallum, R. C., Browne, M. W. & Sugawara, H. M. (1996) Power
and proposed guidelines. Journal of Clinical Epidemiology, 46 (12), analysis and determination of sample size for covariance structure
1417–1432. modeling. Psychological Methods, 1 (2), 130–149.
11. Jones, E. (1987) Translation of quantitative measures for use in cross- 31. MacCallum, R. C., Browne, M. W. & Cai, L. (2006) Testing differ-
cultural research. Nursing Research, 36 (5), 324–327. ences between nested covariance structure models: power analysis and
12. Jones, E. G. & Kay, M. (1992) Instrumentation in cross-cultural null hypotheses. Psychological Methods, 11 (1), 19–35.
research. Nursing Research, 41 (3), 186–188. 32. Zauszniewski, J. A. (1995) Development and testing of a measure of
13. Jones, P. S., Lee, J. W., Phillips, L. R., Zhang, X. E. & Jaceldo, K. B. depressive cognitions in older adults. Journal of Nursing Measure-
(2001) An adaptation of Brislin’s translation model for cross-cultural ment, 3 (1), 31–41.
research. Nursing Research, 50 (5), 300–304. 33. Sousa, V. D., Zanetti, M. L., Zauszniewski, J. A., Mendes, I. A. C. &
14. McDermott, M. A. N. & Palchanes, K. (1994) A literature review of Daguano, M. O. (2008) Psychometric properties of the Portuguese
the critical elements in translation theory. Image: Journal of Nursing version of the depressive cognition scale in Brazilian adults
Scholarship, 26 (2), 113–117. with diabetes mellitus. Journal of Nursing Measurement, 16 (2), 125–
15. Wild, D., Grove, A., Martin, M., Eremenco, S., McElroy, S., 135.

Translation

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Translation

Uploaded by

Copyright:

Available Formats

Journal of Evaluation in Clinical Practice ISSN 1365-2753

Translation, adaptation and validation of instruments or

Valmi D. Sousa PhD RN and Wilaiporn Rojjanasrirat PhD RNC IBCLC2

Introduction Background and significance

© 2010 Blackwell Publishing Ltd 269

270 © 2010 Blackwell Publishing Ltd

language clearer. Instructions, response format and items of

© 2010 Blackwell Publishing Ltd 271

272 © 2010 Blackwell Publishing Ltd

© 2010 Blackwell Publishing Ltd 273

274 © 2010 Blackwell Publishing Ltd

You might also like