Session Information
Session 9A, Exploring Assessment Validity
Papers
Time:
2005-09-09
11:00-12:30
Room:
Arts G109
Chair:
Jouni Valijarvi
Contribution
Many countries, including Turkey, increasingly value large-scale educational survey assessments such as the Third International Mathematics and Science Study Repeat (TIMSS-R) and the OECD Program for International Student Assessment (PISA). These cross-cultural assessments provide a broad perspective for evaluating and improving education: they allow participating countries to compare their educational processes and achievement with those of other participating countries. The assessments are developed in one language, usually English, and the test items are then translated into the languages of the participating countries. This process is called adaptation of tests to different languages. If a test is adapted improperly from the source language to a target language, the translation can produce two tests with different psychometric properties.
In this context, to benefit from these assessments, the constructs to be inferred from the measurements in different countries should be identified. It is also worth analyzing whether these constructs are culturally general or include culturally specific characteristics.
Differential item functioning (DIF) analyses can be extended to evaluate the translation process, as well as whether the test items measure culturally specific or culturally general characteristics. The literature proposes several methods for detecting DIF items in a test and, consequently, for evaluating translation fidelity and the cultural relevance of item content, such as the Linn and Harnisch method, logistic regression analysis, likelihood ratio tests, weighted area indices, restricted factor analysis, and the Mantel-Haenszel method.
Some methods, such as the Mantel-Haenszel (MH) method, have shown promising results for evaluating items administered in different languages. Others, such as the Item Response Theory - Likelihood Ratio (IRT-LR) method, have not been evaluated in depth for their contribution to the analysis of cross-cultural and cross-lingual data.
The international projects mentioned above are an invaluable source for understanding how different methodologies perform in assessing translation fidelity and cultural relevance. The present study therefore aims to compare different methodologies and techniques in terms of the items they flag as having possible translation or culture-specific problems. As a secondary purpose, the analyses will provide information about the translation fidelity and cultural relevance of the items in the Turkish and English versions of the tests. DIF will be studied using the IRT-LR method and the MH method.
The IRT-LR method produces chi-square values to test whether the compact and augmented models differ significantly. The compact model constrains some item parameters to be equal in both groups. The augmented model includes all of the parameters of the compact model but allows at least one of the constrained item parameters to vary between groups. Testing whether it is worthwhile to estimate separate parameters for an item in the two groups therefore also tests whether the item functions differentially between groups.
The MH method uses contingency tables to produce an MH D-DIF index for each item, indicating whether or not the item is free of DIF. The absolute value of the MH D-DIF index also indicates whether the item shows low or high DIF.
Before DIF analyses are conducted, it must be established that the two test forms share a common construct. This is tested via factor analytic techniques with the LISREL (8.54) program.
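As a rough illustration of the MH D-DIF index described above, the following Python sketch computes it from 2x2 contingency tables, one per matched total-score level. It assumes the common formulation in which the Mantel-Haenszel common odds ratio alpha is rescaled as -2.35 ln(alpha). The function names, the input layout, and the magnitude cut-offs of 1.0 and 1.5 (a simplification of the ETS classification that ignores the accompanying significance tests) are assumptions for illustration, not the procedure implemented in EZDIF.

```python
import math


def mh_d_dif(tables):
    """Return the MH D-DIF index for one item.

    `tables` is an iterable of 4-tuples, one per matched score level:
    (reference correct, reference incorrect, focal correct, focal incorrect).
    This input layout is an assumption made for this sketch.
    """
    numerator = 0.0
    denominator = 0.0
    for ref_right, ref_wrong, foc_right, foc_wrong in tables:
        n = ref_right + ref_wrong + foc_right + foc_wrong
        if n == 0:
            continue  # an empty score level carries no information
        numerator += ref_right * foc_wrong / n
        denominator += ref_wrong * foc_right / n
    if numerator <= 0 or denominator <= 0:
        raise ValueError("common odds ratio is undefined for these tables")
    alpha_mh = numerator / denominator  # MH common odds ratio
    # Negative values indicate the item favors the reference group.
    return -2.35 * math.log(alpha_mh)


def classify_dif(d_dif):
    """Label DIF size by magnitude only (simplified; no significance test)."""
    size = abs(d_dif)
    if size < 1.0:
        return "negligible"
    if size < 1.5:
        return "moderate"
    return "large"
```

With equal odds of success in both groups at every score level, the common odds ratio is 1 and the index is 0; the further the index lies from 0, the stronger the indication of DIF.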
The MULTILOG (7.03) and EZDIF programs are used to obtain the IRT-LR and MH D-DIF statistics.
This study provides information about differences and similarities in mathematics achievement across different language and cultural settings through the use of the IRT-LR and MH methods. In addition, identifying factors associated with DIF may contribute to the development of valid assessment instruments by generating test development guidelines.
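The likelihood-ratio comparison of compact and augmented models used in the IRT-LR method can be sketched as follows. This is a minimal Python illustration, assuming the maximized log-likelihoods of the two nested models have already been obtained from an IRT program. The function names and the numbers in the usage note are hypothetical, and the code does not reproduce MULTILOG output.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term = 1.0
        total = 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: normal-tail part plus a finite correction series.
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return tail + total


def lr_dif_test(loglik_compact, loglik_augmented, n_freed_params):
    """Return (G2, p) for the compact versus augmented model comparison.

    n_freed_params is the number of item parameters that are constrained
    equal across groups in the compact model but free in the augmented one.
    """
    g2 = -2.0 * (loglik_compact - loglik_augmented)
    return g2, chi2_sf(g2, n_freed_params)
```

For example, with hypothetical log-likelihoods of -1000.0 (compact) and -997.0 (augmented) and two freed parameters, `lr_dif_test(-1000.0, -997.0, 2)` gives a statistic of 6.0 with p of about 0.0498, which would flag the item for DIF at the 0.05 level.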