Session Information
16 SES 02 B, Current and Emergent Theoretical and Ethical Perspectives in Research on ICT in K-12 Education and Teacher Education
Paper Session
Contribution
International large-scale assessments like PISA (Programme for International Student Assessment) and ICILS (International Computer and Information Literacy Study) are without doubt best known for the so-called league tables, which provide information about the relative abilities of students across countries. But for teachers, teacher educators and developers of teaching materials, they provide far more important empirically based knowledge of what characterizes tasks at different difficulty levels, and what that tells us about students at different ability levels: What can they be expected to do without being challenged, what lies within their present zone of proximal development, and which tasks are they not yet able to perform? This knowledge is summed up in so-called described proficiency scales, which are developed on the basis of analyses of items of similar difficulty and detailed studies of tasks within a given difficulty interval (Fraillon et al., 2015; OECD, 2014).
In international comparisons, one of the threats to validity is country DIF (Differential Item Functioning), also called item-by-country interaction. Large-scale assessments use the Rasch model (Rasch, 1960) or Item Response Theory (IRT) as a basis for assessing the properties and quality of the items used to measure students’ abilities. In IRT, DIF is a measure of how much harder or easier an item is for a respondent from a given group compared with respondents of equal ability from other groups. If students from one country find a specific item much harder or easier than students from other countries, this can impair the comparison of countries. Therefore, great efforts are directed towards analyzing items for DIF and removing or changing items that show DIF (e.g. Fraillon et al., 2015, p. 166ff.).
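To make the idea of country DIF concrete, the following minimal Python sketch flags items whose difficulty estimate differs by more than 0.5 logits between a country-specific calibration and a pooled international calibration. All item names and logit values below are invented for illustration only; they are not ICILS parameters, and the operational DIF procedures in ICILS are more elaborate (Fraillon et al., 2015).

    import numpy as np

    # Hypothetical item identifiers and difficulty estimates (in logits)
    items = ["task_01", "task_02", "task_03", "task_04"]
    b_international = np.array([-1.20, -0.35, 0.40, 1.10])  # pooled international calibration
    b_country = np.array([-1.15, 0.45, 0.35, 1.80])         # country-only calibration

    # Positive difference = item is harder for this country than internationally
    dif = b_country - b_international
    for item, d in zip(items, dif):
        if abs(d) > 0.5:  # illustrative flagging threshold in logits
            direction = "harder" if d > 0 else "easier"
            print(f"{item}: {d:+.2f} logits ({direction} for this country)")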
Nonetheless, DIF seems to be unavoidable in large-scale assessments like PISA and ICILS, and this has been the reason for harsh criticism, especially of PISA (Kreiner & Christensen, 2014).
DIF also has consequences for the described proficiency scales: if items have a different difficulty for students from a given country, the international described proficiency scales do not apply to students from that country.
But looking at this phenomenon from another angle, it can be seen not only as a threat to validity, but also as an insight into what distinguishes students from different countries, and possibly their education, at the level of content.
Therefore, in this paper, the data from ICILS 2013 (Fraillon, Ainley, Schulz, Friedman, & Gebhardt, 2014) are re-analyzed to gain a deeper understanding of what students from one country, in this case Denmark, find difficult or easy compared with students from other countries.
Thus the research question is: Which kinds of tasks do Danish students find difficult and/or easy in comparison with students of equal ability from other countries participating in ICILS 2013?
ICILS measures Computer and Information Literacy (CIL) according to this definition: “an individual’s ability to use computers to investigate, create, and communicate in order to participate effectively at home, at school, in the workplace, and in society” (Fraillon, Schulz, & Ainley, 2013, p. 17). ICILS divides CIL into two strands: 1) collecting and managing information, and 2) producing and exchanging information, each consisting of 3-4 aspects: 1.1 Knowing about and understanding computer use, 1.2 Accessing and evaluating information, 1.3 Managing information, 2.1 Transforming information, 2.2 Creating information, 2.3 Sharing information, and 2.4 Using information safely and securely (Fraillon et al., 2013, p. 18).
Method
The Danish assessment dataset from the International Computer and Information Literacy Study (ICILS) 2013 (Fraillon et al., 2014) is re-analyzed using the Rasch model (Rasch, 1960). The Rasch model separates item difficulties from person abilities, making it possible to talk about item difficulties independently of the persons taking the test. Using the Rasch model, individual items can be examined for how well they fit the model. Several measures are available for this examination, but in international large-scale assessments it has become standard to use infit (also called weighted mean square fit) values in an interval from around .8 to around 1.2 as thresholds for when an item should be considered for deletion or its partial-credit categories collapsed into one (Fraillon et al., 2015, p. 160). In the re-analysis of the Danish data, a single item had an infit value of 1.23, slightly above the threshold. This was taken as an indication of an overall good fit of the data to the model, and the results of the Rasch analysis were therefore used in the further analyses.

The item difficulties from the re-analysis were compared to the item difficulties reported in the international report (Fraillon et al., 2014). Under the Rasch model, these parameters should have similar values in both analyses. Setting the threshold to a difference of .5 logits, this was not the case for 28 of the 63 items in the dataset: 12 items were harder for Danish students and 16 were easier (please note that Denmark did not meet the sampling requirements in ICILS 2013). These items were analyzed for similarities in content using the project’s classification of the items in relation to the strands and aspects. The analyses showed that Danish students found items from aspect 1.2 (Accessing and evaluating information) particularly difficult (4 of 6 items in this aspect were harder for Danish students), while they found items belonging to aspect 1.1 (Knowing about and understanding computer use) easier than their international peers (5 of 12 items were easier for Danish students).
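The comparison and grouping step described above can be illustrated with a short Python sketch: national and international difficulty estimates are compared item by item, items differing by more than 0.5 logits are flagged as harder or easier, and the flagged items are tallied per CIL aspect. The item IDs, aspect codes and difficulty values below are hypothetical placeholders, not the actual ICILS item parameters.

    from collections import Counter

    THRESHOLD = 0.5  # difference in logits used for flagging

    # (item_id, aspect, difficulty_denmark, difficulty_international) -- placeholder values
    items = [
        ("item_A", "1.1", -0.90, -0.30),
        ("item_B", "1.2",  1.40,  0.70),
        ("item_C", "1.2",  0.95,  0.30),
        ("item_D", "2.2",  0.10,  0.05),
    ]

    harder, easier = Counter(), Counter()
    for item_id, aspect, b_dk, b_int in items:
        diff = b_dk - b_int
        if diff > THRESHOLD:        # markedly harder for Danish students
            harder[aspect] += 1
        elif diff < -THRESHOLD:     # markedly easier for Danish students
            easier[aspect] += 1

    print("Harder for Danish students, by aspect:", dict(harder))
    print("Easier for Danish students, by aspect:", dict(easier))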
Expected Outcomes
For Danish students, compared with their international peers, items related to knowing about and understanding computer use are relatively easier than items related to accessing and evaluating information. This result is rather surprising and alarming. Searching the Internet has been an integral part of the teaching and learning standards for several years (Undervisningsministeriet, 2009), and the use of computers for research has been promoted for decades (Bundsgaard, Pettersson, & Puck, 2014, p. 11ff.). If Danish students are struggling with accessing and evaluating information, they will face problems in further education, as citizens, and in the workplace. In the Danish national report on ICILS 2013 (Bundsgaard et al., 2014), it was concluded that Danish students struggle with the more advanced, critical aspects of Computer and Information Literacy. The present results support that conclusion, but also expand it.

This paper shows that essential insights can be gained by comparing the distributions of item difficulties in international large-scale assessments. This is a more constructive approach to the challenge of DIF, but it does not eliminate the serious threat DIF poses to the validity of country comparisons. One explanation for the DIF could be that the CIL construct is in effect two constructs, corresponding to the two strands: collecting and managing information, and producing and exchanging information. This hypothesis does not seem to be supported, though, first because a very high correlation of 0.96 between the two strands was found (Fraillon et al., 2014, p. 73), and second because most of the items with markedly different difficulties come from the first strand.
References
Bundsgaard, J., Pettersson, M., & Puck, M. R. (2014). Digitale kompetencer. It i danske skoler i et internationalt perspektiv. Aarhus: Aarhus Universitetsforlag.
Fraillon, J., Ainley, J., Schulz, W., Friedman, T., & Gebhardt, E. (2014). Preparing for Life in a Digital Age. The IEA International Computer and Information Literacy Study International Report. Cham: Springer.
Fraillon, J., Schulz, W., & Ainley, J. (2013). International Computer and Information Literacy Study: Assessment Framework. Retrieved from http://ifs-dortmund.de/assets/files/icils2013/ICILS_2013_Framework.pdf
Fraillon, J., Schulz, W., Friedman, T., Ainley, J., Gebhardt, E., … International Association for the Evaluation of Educational Achievement (IEA). (2015). ICILS 2013: Technical report.
Kreiner, S., & Christensen, K. B. (2014). Analyses of Model Fit and Robustness. A New Look at the PISA Scaling Model Underlying Ranking of Countries According to Reading Literacy. Psychometrika, 79(2), 210–231. https://doi.org/10.1007/s11336-013-9347-z
OECD. (2014). PISA 2012 Technical Report. Paris: OECD. Retrieved from http://www.oecd.org/pisa/pisaproducts/PISA-2012-technical-report-final.pdf
Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen: Danmarks pædagogiske Institut.
Undervisningsministeriet. (2009). Fælles mål 2009 - Dansk. Fælles Mål. Retrieved from http://www.uvm.dk/Service/Publikationer/Publikationer/Folkeskolen/2009/~/media/Publikationer/2009/Folke/Faelles%20Maal/Filer/Faghaefter/120326%20Faelles%20maal%202009%20dansk%2025.ashx