Session Information
09 SES 03 A, Comparing Large-Scale Assessments across Countries and Domains: Issues in Interpretation and Adaptation Procedures
Paper Session
Contribution
Large-scale assessment, in the form of national and international attainment surveys, has been a feature of educational policy making for over 50 years. The International Association for the Evaluation of Educational Achievement (IEA) carried out research-focused cross-border surveys at various points during the three decades spanning the 1960s to the 1990s. In addition, a few countries worldwide, including the USA, the United Kingdom, Canada, France and New Zealand, began to benefit from their own domestic attainment survey programmes, and in consequence built up a sound knowledge of how well their education systems were functioning. In most of the survey programmes concerned, test-based information about student attainment was gathered, supplemented by questionnaire-based information about students’ learning attitudes, interests and environments.
In the mid-1990s the IEA put its survey work onto a firmer footing, with a stronger focus on country comparisons and regular survey cycles: TIMSS[1] on a 4-year cycle and PIRLS[2] on a 5-year cycle. Item response modelling was adopted at the same time. In 2000 the OECD launched PISA[3], with its 3-year survey cycle.
With publication of the first PISA report, numerous national governments around the world learned for the first time not only how well (some aspects of) their systems were functioning in general, but also how well they were functioning relative to others that could in principle be expected to be comparable. In the wake of the ensuing phenomenon now known as “PISA shock”, the globalising influence of PISA in particular has been strong and irresistible, and one interesting outcome, the main focus of this paper, is the ongoing explosion of national system monitoring activity.
Thus, within a relatively short span of years, the old scenario in which only a handful of countries operated their own locally targeted national system evaluations has been supplanted by a present reality in which large numbers of countries, impatient to implement national system evaluations of their own, have launched large-scale domestic assessment programmes, with OECD encouragement.
System monitoring is a challenging activity, however. When the introduction of a national assessment programme is planned, a number of design choices are available and decisions must be made about the purposes, forms and scale of the future system monitoring tool. The choices made are dictated partly by the functions that the programme is expected to serve, partly by expectations of “shelf life” (degree of political commitment to programme continuity in the medium to long term), partly by the assessment expertise of the technical designers, and partly by practical and financial constraints. Decisions concern student sampling, assessment frameworks, task/item design, psychometric models, and so on.
Within and beyond Europe there is now a range of national assessment styles, resulting from the choices made by stakeholders and technical experts in programme design. The paper briefly overviews this dynamic situation, and in so doing addresses the following principal research questions:
a) To what extent has the ‘PISA model’ of large-scale assessment been adopted for national system monitoring throughout the world?
b) Is adoption of this particular model appropriate in every national context?
c) What can be anticipated to characterise the future for system monitoring design?
[1] Trends in International Mathematics and Science Study
[2] Progress in International Reading Literacy Study
[3] Programme for International Student Assessment
Method
Expected Outcomes
References
Baird, J-A., Isaacs, T., Johnson, S., Stobart, G., Yu, G., Sprague, T. & Daugherty, R. (2011). Policy effects of PISA. Oxford University Centre for Educational Assessment.
Eurydice (2009). National testing of pupils in Europe: Objectives, organisation and use of results. (http://www.eurydice.org)
Greaney, V. & Kellaghan, T. (2008). Assessing National Achievement Levels in Education. Volume 1. The World Bank.
Mons, N. (2009). Theoretical and real effects of standardised assessment. Eurydice Network.
OECD (2001). Knowledge and Skills for Life: First results from PISA. Paris: OECD.
Peaker, G.F. (1975). An empirical study of education in twenty-one countries: A technical report. New York: Wiley.
Ringarp, J. & Rothland, M. (2010). Is the grass always greener? The effect of the PISA results on debates in Sweden and Germany. European Educational Research Journal, 9(3), 422-430.