Session Information
09 SES 03 A, Comparing Large-Scale Assessments across Countries and Domains: Issues in Interpretation and Adaptation Procedures
Paper Session
Contribution
International achievement tests, such as PISA, are conducted to compare students' academic performance across countries. However, for the results of these tests to be valid, all the different-language versions of the tests must be comparable, or equivalent, to each other; that is, they must be equally easy or difficult to respond to. This has to be the case not only between countries but also within them. For example, in Finland, where there are two official languages, Finnish (the majority language) and Swedish (a minority language), the versions in these two languages need to be equivalent to each other. If they are not, valid comparisons cannot be made between the two language groups.
Rigorous quality-monitoring practices have been developed to ensure that the translations used in international assessments (e.g., Sweden's Swedish and Finland's Finnish translations) are equivalent to each other (e.g., International Test Commission, 2010). At the same time, no standardized procedure has existed for adapting tests for linguistic minorities (e.g., the Swedish-speaking population in Finland). Instead, minority-language versions have mainly been produced by borrowing a translation from another country (e.g., Sweden's Swedish translation) and making only a few cultural adaptations to it (Ercikan, Simon & Oliveri, 2013, p. 116; OECD, 2012).
Very little research has been conducted on these adapted versions and their quality compared with that of translated majority-language versions. The research that does exist, however, suggests that adapted versions (e.g., Finland's Swedish version) have often not been fully comparable with translated majority-language versions (e.g., Finland's Finnish translation; OECD, 2009, pp. 96-103; see also Blum, Goldstein & Guérin-Pace, 2001). Rather, they have more closely resembled the versions from which they were adapted (e.g., Sweden's translation). Finland's Swedish PISA tests have also been found to be of better quality than the corresponding tests used in Sweden, which have often contained, for example, more errors and clumsier language (Arffman, 2012, p. 62; cf. Solano-Flores, 2006, pp. 2366-2367; Solano-Flores, 2012).
The purpose of the study was to compare the quality of an adapted minority-language version of the PISA 2012 problem-solving test (Finland's Swedish version) with a translated majority-language version of the same test (Finland's Finnish translation) and to examine to what extent the two were comparable. Ultimately, the study aimed to improve adaptation procedures in international achievement tests and thereby add to the validity of the results of these tests.
Method
Expected Outcomes
References
Alderson, C. (2000). Assessing reading. Cambridge: Cambridge University Press.
Allalouf, A. (2003). Revising translated differential item functioning items as a tool for improving cross-lingual assessment. Applied Measurement in Education, 16(1), 55-73.
Arffman, I. (2012). Translating international achievement tests: Translators' view (Finnish Institute for Educational Research, Reports 44). Jyväskylä: Finnish Institute for Educational Research. Retrieved March 21, 2013, from http://ktl.jyu.fi/img/portal/22708/g044.pdf
Blum, A., Goldstein, H., & Guérin-Pace, F. (2001). International Adult Literacy Survey (IALS): An analysis of international comparisons of adult literacy. Assessment in Education, 8(2), 225-246.
Elosua, P., & López-Jaúregui, A. (2007). Potential sources of differential item functioning in the adaptation of tests. International Journal of Testing, 7(1), 39-52.
Ercikan, K., Simon, M., & Oliveri, M. (2013). Score comparability of multiple language versions of assessments within jurisdictions. In M. Simon, K. Ercikan & M. Rousseau (Eds.), Improving large-scale assessment in education: Theory, issues, and practice (pp. 110-124). New York: Routledge.
Gierl, M. J., & Khaliq, S. (2001). Identifying sources of differential item and bundle functioning on translated achievement tests: A confirmatory analysis. Journal of Educational Measurement, 38(2), 164-187.
International Test Commission. (2010). International Test Commission guidelines for translating and adapting tests. Retrieved March 21, 2013, from http://www.intestcom.org/upload/sitefiles/40.pdf
Lenzner, T., Kaczmirek, L., & Lenzner, A. (2010). Cognitive burden of survey questions and response times: A psycholinguistic experiment. Applied Cognitive Psychology, 24, 1003-1020.
OECD. (2009). PISA 2006 technical report. Paris: Author. Retrieved March 21, 2013, from http://www.oecd.org/dataoecd/0/47/42025182.pdf
OECD. (2012). PISA 2009 technical report. Paris: OECD Publishing. Retrieved March 21, 2013, from http://dx.doi.org/10.1787/9789264167872-en
Solano-Flores, G., Backhoff, E., & Contreras-Niño, L. (2009). Theory of test translation error. International Journal of Testing, 9, 78-91.
Solano-Flores, G., & Gustafson, M. (2013). Academic assessment of English language learners: A critical, probabilistic, systemic view. In M. Simon, K. Ercikan & M. Rousseau (Eds.), Improving large-scale assessment in education: Theory, issues, and practice (pp. 87-109). New York: Routledge.