Introduction
Turkey, a member of the OECD, has participated in PISA regularly since 2003. Turkey’s performance in mathematics has been below average: 423 in PISA 2003, 424 in PISA 2006, 445 in PISA 2009, 448 in PISA 2012, and 420 in PISA 2015 (MEB, 2015; MEB, 2016). Through PISA 2012, Turkey’s mathematics scores showed an increasing trend; in PISA 2015, however, the average mathematics score dropped sharply. The possible reasons for this low score in PISA 2015 need to be investigated. One reason could be the psychometric properties of the mathematics items used in the PISA 2015 assessment. PISA items are mainly developed in English first and then adapted to other languages, including Turkish (OECD, 2017). Therefore, it is necessary to evaluate whether PISA mathematics items functioned differently for Turkish- and English-speaking students, who answered the adapted and the original items, respectively. Finding evidence for the fairness of the items in terms of their psychometric properties could help to rule out one of the possible reasons for the sharp decrease in Turkish students’ mathematics performance in 2015.
Differential item functioning (DIF) detection methods are widely used to evaluate the fairness and equality of tests at the item level when investigating the comparability of translated and/or adapted measures (Zumbo, 2007). DIF occurs, and threatens the comparability of scores, when students in different groups who have a similar ability level on the underlying construct (mathematics ability in this study) do not have a similar probability of answering a specific item correctly (van de Vijver & Leung, 1997; Zumbo, 2007). Evaluating items for DIF is a necessary preliminary analysis before conducting any comparative study. If a test contains DIF items, observed differences in scores could stem from the problematic items rather than from true differences in the underlying trait or ability (He & van de Vijver, 2013).
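The definition above can be written compactly. As an illustrative sketch, let $X_i$ be the scored response to item $i$ (1 for a correct answer, 0 otherwise), $\theta$ the underlying mathematics ability, and $G$ the group membership, with $R$ and $F$ denoting the two groups being compared (for example, students who answered the original and the adapted items). This notation is an assumption introduced here for illustration and is not taken from the cited sources. Under this notation, item $i$ is free of DIF when the following holds:

```latex
% Absence of DIF for item i: equal success probability at every ability level
\[
  P(X_i = 1 \mid \theta, G = R) \;=\; P(X_i = 1 \mid \theta, G = F)
  \qquad \text{for all } \theta .
\]
```

DIF is present when this equality fails for at least some values of $\theta$, that is, when group membership still predicts the response after ability has been taken into account.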
PISA items are prepared very carefully by an international team of item developers under the guidance of experts. Translatability reviews are conducted that consider translation, adaptation, and cultural issues (OECD, 2017). However, many researchers have reported that PISA mathematics assessments contained DIF items (Demir & Kose, 2014; Kankaras & Moors, 2014; Lyons-Thomas, Sandilands, & Ercikan, 2014; Yildirim & Berberoglu, 2009). Yildirim and Berberoglu (2009) reported that 5 out of 21 mathematics items in PISA 2003 were flagged as having DIF in a comparison of Turkish and American students (3 of these items favored Turkish students). Lyons-Thomas et al. (2014) found gender DIF in PISA 2009 mathematics items for students in Canada, Finland, Shanghai, and Turkey. Demir and Kose (2014) identified many DIF items in the PISA 2009 mathematics assessment when they compared the answers of Turkish students with those of German, Finnish, and American students. Therefore, it is possible that the PISA 2015 mathematics assessment contains DIF items that contributed to the decline in Turkish students’ mathematics scores. No study has investigated whether PISA 2015 items show DIF in a comparison of Turkish- and English-speaking students.
The research questions that guided this study were:
(1) Is item bias present in PISA 2015 mathematics items when comparing Turkish and English students?
(2) Is item bias present in PISA 2015 mathematics items when comparing Turkish and American students?