Development and adaptation of PISA tests for use in regional large-scale assessments
Conference:
ECER 2012
Format:
Paper

Session Information

09 SES 05 C, National and Regional Large-scale Assessments: Methods and Findings

Parallel Paper Session

Time:
2012-09-19
11:00-12:30
Room:
FCT - Seminario 2
Chair:
Rolf Strietholt

Contribution

Introduction
The annual assessment of pupils' learning outcomes in compulsory education is regulated by Spanish legislation at both national and regional levels. The regions, in the exercise of their competences, are responsible for developing, disseminating, administering and marking the tests given to students in their schools. In this context, this paper describes the features of the student assessment in the region of Madrid known as "Evaluación de diagnóstico", which has been designed to allow international comparison of the performance obtained by students in several subjects usually included in large-scale assessments.
The distinguishing feature of this assessment lies in the development and administration of mathematics and reading comprehension tests with well-established educational and psychometric characteristics, and in the establishment of a link for international comparison with the scale set out in the Programme for International Student Assessment (PISA). These tests have been adapted and validated for use with the Spanish population and are named the ESP-ISA tests.
The ESP-ISA tests are a contextualised adaptation to the Spanish educational system of the assessment tests used in the International Schools' Assessment (ISA) programme, designed and implemented by the Australian Council for Educational Research in 2001. The ISA programme is itself based on the PISA programme.
The design of the assessment and of the ESP-ISA tests, built from the ISA item database and linked to the PISA scale (released PISA items are included in the test), ensures that the results can be compared with PISA. It also allows all Madrid results to be expressed on that scale, since the sample of students who answer the ESP-ISA tests also completes the tests administered to the general population.
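The linking step above can be illustrated with a minimal sketch (not the authors' procedure): once ESP-ISA ability estimates are expressed in the same logit metric as the PISA link items, a linear transformation places them on the familiar PISA reporting scale (OECD mean 500, standard deviation 100). The linking constants `link_mean` and `link_sd` here are hypothetical placeholders for the values a real equating study would produce.

```python
def to_pisa_scale(theta, link_mean, link_sd):
    """Map a logit ability estimate to the PISA reporting scale.

    theta     : ability estimate in logits (same metric as the link items)
    link_mean : mean of the linking calibration (hypothetical constant)
    link_sd   : standard deviation of the linking calibration (hypothetical)
    """
    return 500.0 + 100.0 * (theta - link_mean) / link_sd

# An average student in the linking calibration lands at 500;
# a student one linking SD above lands at 600.
print(to_pisa_scale(0.0, 0.0, 1.0))  # 500.0
print(to_pisa_scale(1.0, 0.0, 1.0))  # 600.0
```

In practice the transformation constants are obtained from the common (released PISA) items administered to both populations, which is what the shared item base makes possible.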
This paper aims to describe the test construction process, which can be divided into three major phases: a) translation of the items into Spanish, b) a pilot study, and c) item selection and assembly of the final test in the PISA format. The specific features of the items and of PISA-type tests determine the kind of psychometric analysis used to judge their adequacy and functioning. Aspects such as inter-rater reliability, the point-biserial correlation, item difficulty, and the fit statistics provided by the IRT partial credit analysis were considered in building the final assessment tests.
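Two of the classical statistics named above can be sketched in a few lines. This is an illustrative example only, not the authors' analysis code (which used ACER ConQuest): the difficulty index is the proportion of correct responses, and the (corrected) point-biserial correlation relates an item score to the total on the remaining items.

```python
import math

def item_statistics(responses, item):
    """Classical item statistics for dichotomous (0/1) items.

    responses : list of per-student lists of 0/1 item scores
    item      : index of the item to analyse
    Returns (difficulty index p, corrected point-biserial correlation).
    """
    n = len(responses)
    item_scores = [row[item] for row in responses]
    # Score on the remaining items, so the item is not correlated with itself.
    rest_scores = [sum(row) - row[item] for row in responses]
    p = sum(item_scores) / n                       # difficulty index
    mean_rest = sum(rest_scores) / n
    sd_rest = math.sqrt(sum((x - mean_rest) ** 2 for x in rest_scores) / n)
    # Mean rest-score of the students who answered the item correctly.
    mean_correct = (sum(r for s, r in zip(item_scores, rest_scores) if s == 1)
                    / sum(item_scores))
    r_pbis = (mean_correct - mean_rest) / sd_rest * math.sqrt(p / (1 - p))
    return p, r_pbis

# Toy data: five students, three items.
data = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1], [0, 0, 0]]
p, r = item_statistics(data, 0)
```

Items with very extreme difficulty or a low point-biserial correlation are the typical candidates for removal in a screening of this kind.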

Method

METHODOLOGY

a. Translation into Spanish. The translation was subjected to a linguistic quality control (LQC) by the Belgian company cApStAn. The LQC is the validation of translated documents to ensure that they meet the requirements of linguistic correctness, accurate and consistent terminology, cultural appropriateness and readability. Two forms of each test were designed for each skill assessed (reading/mathematics). The pilot study was conducted on a sample of 323 students in the fourth grade of primary education and 352 students in the second grade of secondary education.

b. Training of markers.

c. Item analysis in the pilot study. IRT models suited to the nature of the items (multiple choice with dummy coding and constructed-response) were used. We employed a one-parameter logistic model (Rasch, 1960) and the partial credit model (Masters, 1982), which allows the analysis of cognitive and attitudinal items that can take two or more score levels. Parameter estimation under IRT and also under classical test theory (CTT) was carried out with the ACER ConQuest software (Wu & Adams, 2007). The item parameters (IRT and CTT) used to carry out the selection are as follows:
- Difficulty index (IRT b parameter)
- Model fit: infit and outfit indices
- Item discrimination: point-biserial correlation
- Inter-rater reliability

Expected Outcomes

Based on the results of the analyses described above, the final tests were developed, ruling out those items that did not meet the psychometric requirements set in advance. About 40% of the items were removed, and a 30-item test was assembled for each subject and grade. The construction process for this kind of test is highly complex; the methodological rigour of the design and the adequacy and depth of the data analysis presented are factors that ensure the psychometric quality of the final tests, constituting a key element in the design of large-scale assessments.

References

Masters, G. N. (1982). A Rasch Model for Partial Credit Scoring. Psychometrika, 47(2), 149-174.
Rasch, G. (1960). Probabilistic Models for Some Intelligence and Attainment Tests. Copenhagen: Danish Institute for Educational Research.
Tristán, A. (2001). Análisis de Rasch para todos. México: CENEVAL.
Wu, M. L., Adams, R. J., Wilson, M. R., & Haldane, S. A. (2007). ACER ConQuest Version 2.0: generalized item response modeling software. Victoria: Australian Council for Educational Research - ACER Press.

Author Information

Eva Expósito Casas (presenting / submitting)
Universidad Nacional de Educación a Distancia
Métodos de Investigación y Diagnóstico en Educación
Madrid
Universidad Complutense de Madrid, Spain
