12 SES 13 B JS, Translation and Cross-cultural Comparability in Large Scale Assessments
Joint Paper Session NW 09 and NW 12
After many years of research, accumulated evidence shows that in international assessments such as PISA or TIMSS, factors that are not intended to be measured can have a crucial effect on test results (Yildirim & Berberoğlu, 2009). Cultural and linguistic differences are among these factors. In line with the findings of our previous research, we obtained a surprising result which suggested that these effects might be larger than one would expect, especially in some countries such as Turkey (Yildirim & Yildirim, 2017).
In that previous research, we investigated in what ways socioeconomic differences among Turkish students affected their PISA 2015 science literacy performance. We conducted separate IRT calibrations for a number of countries and estimated item parameters for each country. To see how similar the item parameter estimates were across countries, we conducted a multidimensional scaling analysis, specifying each item parameter estimate as an attribute of a country and treating the estimates as coordinates of points in a high-dimensional space (Heady & Lucas, 1997). From these coordinates we computed the Euclidean distances between all pairs of countries and submitted the distance matrix to a multidimensional scaling program that produces a good approximation of the distances in a low-dimensional space. When we mapped the countries with respect to the similarity of their item parameter estimates and requested a two-dimensional solution, we saw that the item parameter estimates were not equivalent across countries.
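The distance-and-scaling step described above can be sketched in a few lines. This is a minimal illustration using classical (Torgerson) MDS in place of a dedicated program such as Permap, and the country-by-parameter matrix is invented purely for illustration, not taken from the study:

```python
import numpy as np

def classical_mds(dist, k=2):
    """Classical (Torgerson) MDS: embed an n x n distance matrix in k dimensions."""
    n = dist.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    b = -0.5 * j @ (dist ** 2) @ j             # double-centered squared distances
    w, v = np.linalg.eigh(b)                   # eigenvalues in ascending order
    top = np.argsort(w)[::-1][:k]              # keep the k largest eigenvalues
    return v[:, top] * np.sqrt(np.maximum(w[top], 0.0))

# Invented toy data: rows = countries, columns = item parameter estimates.
params = np.array([
    [0.2, -1.1, 0.8, 0.0],
    [0.3, -1.0, 0.9, 0.1],
    [1.5,  0.4, -0.7, 1.2],
])
# Euclidean distances between all pairs of countries ...
dist = np.linalg.norm(params[:, None, :] - params[None, :, :], axis=-1)
# ... mapped into a two-dimensional solution.
coords = classical_mds(dist, k=2)
```

Countries whose item parameter estimates are similar (the first two rows) land close together on the resulting map, while dissimilar countries land far apart.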
In the current research we aim to identify variables that appear to be associated with this nonequivalence of item parameters across the participating countries. To this end, we will conduct a profile analysis, an innovative technique developed by Verhelst (2012). For ease of interpretation, before the profile analyses the 72 participating countries will be clustered into fewer groups with respect to the similarities of their estimated item parameter values. In this context we expect to answer the following questions.
How many country clusters can be defined based on similarities among countries' item parameter estimates, and which countries constitute each cluster?
What are the strengths and the weaknesses of countries with respect to the dimensions as defined in the PISA 2015 science assessment framework?
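A minimal sketch of the planned clustering step, assuming hierarchical (average-linkage) clustering on Euclidean distances between countries' item parameter vectors; both the data and the choice of linkage are illustrative assumptions, not the study's actual procedure:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Invented toy data: rows = countries, columns = item parameter estimates.
params = np.array([
    [0.2, -1.1,  0.8],
    [0.3, -1.0,  0.9],
    [1.5,  0.4, -0.7],
    [1.4,  0.5, -0.6],
])
dist = pdist(params)                       # condensed Euclidean distance vector
tree = linkage(dist, method="average")     # agglomerative (average-linkage) clustering
labels = fcluster(tree, t=2, criterion="maxclust")  # cut the tree into 2 clusters
```

Inspecting the dendrogram (or a criterion such as the silhouette score) would guide how many clusters to retain for the 72 countries.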
Profile analysis is the technique we will use to answer the second question. A brief description of this technique follows.
Profile analysis is a technique based on comparing the expected and the observed performance of individuals on specific subsets of test items (Verhelst, 2012). The expected performance of an individual is a conditional expectation given the individual's total score and the item parameter estimates. The item parameter estimates used at this step will be the values reported in the PISA 2015 reports. As these values are estimated from the complete set of PISA data, expected performances can be considered performances relative to international averages. Thus, deviations between individuals' observed and expected performance can be regarded as signs of their strengths (positive deviations) or weaknesses (negative deviations) on subsets of items specified with respect to some hypothesis. The subsets of items in this study will be determined with respect to the specifications defined in the PISA 2015 assessment framework: competency (explain, evaluate, interpret); content (physical, living, earth & space); type of knowledge (content, procedural, epistemic); depth of knowledge (low, medium, high); and response type (selected, constructed).
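Under the Rasch model, the conditional expectation at the heart of profile analysis can be computed exactly with elementary symmetric functions, since conditioning on the total score removes the person parameter. The sketch below is a simplified illustration with hypothetical item difficulties, not the operational PISA scaling model:

```python
import numpy as np

def esf(eps):
    """Elementary symmetric functions gamma_0 .. gamma_n of the values in eps."""
    g = np.zeros(len(eps) + 1)
    g[0] = 1.0
    for e in eps:
        g[1:] = g[1:] + e * g[:-1]
    return g

def p_item_given_total(b, r):
    """P(X_i = 1 | total score = r) under the Rasch model with difficulties b,
    for 1 <= r <= len(b) - 1; the person parameter cancels out."""
    eps = np.exp(-b)
    g = esf(eps)
    return np.array([
        eps[i] * esf(np.delete(eps, i))[r - 1] / g[r]
        for i in range(len(b))
    ])

def expected_subset_score(b, subset, r):
    """Conditional expectation of the score on an item subset given total score r."""
    return p_item_given_total(b, r)[subset].sum()

# Hypothetical difficulties; subset = indices of, say, constructed-response items.
b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
expected = expected_subset_score(b, np.array([1, 3]), r=3)
# A profile deviation is then: observed subset score minus this expectation.
```

A useful sanity check is that these conditional probabilities sum to the total score itself, since the expected total given the total must be the total.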
We expect to provide explanations that may shed light on the possible sources of nonequivalence in item functioning across countries. For example, it may be that some countries perform below expectation on some items not because their students' scientific literacy level is low, but because the students are not accustomed to constructed-response items.
Heady, R. B., & Lucas, J. L. (1997). Permap: An interactive program for making perceptual maps. Behavior Research Methods, Instruments, & Computers, 29(3), 450-455.
Verhelst, N. D. (2012). Profile analysis: A closer look at the PISA 2000 reading data. Scandinavian Journal of Educational Research, 56(3), 315-332.
Yildirim, H. H., & Berberoğlu, G. (2009). Judgmental and statistical DIF analyses of the PISA-2003 mathematics literacy items. International Journal of Testing, 9(2), 108-121.
Yildirim, H. H., & Yildirim, S. (2017). Advantages and disadvantages of low and high socioeconomic status students: A closer look at Turkey's PISA-2015 science data. Paper presented at the 43rd Annual Conference of the International Association for Educational Assessment, Batumi.