Session Information
09 ONLINE 23 B, Use of LSA Data for National Evaluation Purposes
Paper Session
Contribution
At the national level, school grades and national examinations are widely accepted and used measures of student achievement. However, international large-scale assessments (ILSAs) have also gained an increasingly important role in establishing facts about education and in decision-making and reforms at different levels of society, both nationally and internationally (e.g., Grek, 2009; Lindblad, Pettersson, & Popkewitz, 2018). One of the most prominent ILSAs is the PISA study, implemented by the Organisation for Economic Co-operation and Development (OECD). PISA is conducted every three years and measures 15-year-olds' competence in three core domains: reading, mathematics, and science. Considering the impact that large-scale comparative studies like PISA have on educational debate and policy, the results need to be valid. Moreover, if the relationship between ILSAs and national school achievement measures is found to be strong, the results from international assessments will bear greater relevance for school development and for educational policy and practice at the national level. To what extent PISA scores are consistent with established and legitimate measures of achievement, however, is an open question. In the 2018 Swedish PISA test, students’ personal identification numbers were collected for the first time, making it possible to connect PISA test scores with data on the same students’ national test results and final grades, provided by Statistics Sweden (SCB). This study utilizes this unique opportunity to analyze PISA data combined with register data on school achievement measures.
Whereas some ILSAs take a curriculum approach (e.g., TIMSS, the Trends in International Mathematics and Science Study), the content of PISA is not specifically based on participating countries’ school curricula. PISA focuses on whether 15-year-olds can apply the knowledge they have developed in school in real-life situations. In spite of these conceptual differences, several studies indicate a close alignment between the PISA framework and the Swedish curriculum documents (Frändberg & Hagman, 2017; Johansson, Klapp, & Rosén, 2019; Sollerman, 2019). Few studies have linked PISA results to data on grades and national test results. One example is a study from Finland, which used students’ self-reported grades and found fairly high agreement with their PISA results (Harju-Luukkainen et al., 2016). In Denmark, Mejding, Reusch, and Yung Andersen (2006) matched the leaving examination marks of some 80% of the students who participated in PISA 2003 and found moderate correlations with their PISA scores (r = .3–.4). Similarly, in Sweden, fairly high positive correlations have been found between students’ TIMSS mathematics achievement and their final grades and national test scores, and moderately high positive correlations for TIMSS science achievement (r = .5–.7; Wiberg, 2019; Wiberg & Rolfsman, 2019). While there are a few studies on how mathematics and science results in ILSAs relate to national assessments, no study has so far focused on reading.
The current study sheds light on the connection between the knowledge and skills measured in PISA and the knowledge and skills assessed by teachers at the end of compulsory school. The main aim is to investigate the association between students’ grades and national test results in compulsory school in Sweden and their PISA scores by analyzing the degree of correspondence among the three assessments. More specifically, we examine the dimensionality of PISA, the national tests, and grades to determine whether PISA measures the same competencies as the commonly accepted measures of achievement in Sweden, or different ones.
Method
To investigate the alignment of PISA with national measures of student achievement, we combined PISA data with results from the Swedish national tests and with grades. A unique feature of Sweden’s participation in PISA 2018 is that the Swedish National Agency for Education recorded students’ personal identification numbers. This information was used to link students’ PISA results with register data from Statistics Sweden on their national test results and final grades from school year nine. In PISA 2018, achievement data are available for a sample of 5,504 Swedish students. In this study, we analyzed a total of nine performance measures taken from PISA, the national tests, and grades. From each of these sources, achievement scores were available for the three domains of reading, mathematics, and science. The ten plausible values generated for each domain were used as measures of PISA achievement, in line with the recommendations of von Davier et al. (2009). Both the current national tests and the grading system in Sweden are criterion-referenced and use a six-point grading scale, A–F, where A is the highest grade and F is fail. The letter grading scale was transformed into the numeric scale 5–0. To correspond to the PISA measures, we used grades and national test results for mathematics and Swedish, and a composite measure for science computed as the sum of the natural science subjects physics, chemistry, and biology. The dimensionality of the different measures was studied by modeling and comparing a set of bi-factor (S-1) models with a general factor along with nested domain- and test-specific factors (Eid, Geiser, Koch, & Heene, 2017). In the baseline model we specified one common factor for all nine achievement measures; thereafter, we stepwise added orthogonal specific factors for the subject domains mathematics and Swedish, using science as a reference category, as well as for PISA and the national tests, using grades as a reference category.
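As a minimal sketch (not the authors' code), the two preprocessing steps described above — mapping the A–F letter grades onto the numeric 5–0 scale and summing the three natural science subjects into one science composite — could look as follows; the function and variable names are hypothetical.

```python
# Hypothetical sketch of the grade preprocessing described in the Method section.

# Criterion-referenced six-point scale: A (highest) ... F (fail) -> 5 ... 0.
GRADE_POINTS = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1, "F": 0}

def to_points(letter: str) -> int:
    """Convert an A-F letter grade to the numeric 5-0 scale."""
    return GRADE_POINTS[letter.upper()]

def science_composite(physics: str, chemistry: str, biology: str) -> int:
    """Sum the three natural science subjects into a single science measure,
    mirroring the composite used to match the PISA science domain."""
    return to_points(physics) + to_points(chemistry) + to_points(biology)
```

For example, grades A, C, and E in the three science subjects would give a composite of 5 + 3 + 1 = 9.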
The models were estimated in the Mplus statistical software, using maximum likelihood estimation with robust standard errors (MLR) and the full-information maximum likelihood method to handle missing data (Dong & Peng, 2013). Student weights were applied to account for students' different sampling probabilities. The hierarchical structure of the PISA sample was accounted for by using the clustered-data option in Mplus. To evaluate model fit, we compared the commonly used fit indices chi-square, RMSEA, SRMR, CFI, and TLI (see Hu & Bentler, 1999).
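In sketch form (notation ours, not taken from the paper), the baseline model and the full bi-factor (S-1) specification described above can be written as:

```latex
% Baseline: one general factor G for all nine achievement measures
X_i = \lambda_i G + \varepsilon_i, \qquad i = 1, \dots, 9

% Full model: general factor plus orthogonal specific factors.
% S_{d(i)} is the domain-specific factor of measure i (mathematics or
% Swedish; science is the reference domain), and S_{t(i)} is the
% test-specific factor (PISA or national test; grades are the reference
% method). Loadings on the reference categories are fixed to zero, and
% all specific factors are uncorrelated with G and with each other.
X_i = \lambda_i G + \gamma_i S_{d(i)} + \delta_i S_{t(i)} + \varepsilon_i
```

In an (S-1) specification of this kind, omitting one reference category per facet avoids the anomalous results that can arise in fully symmetric bi-factor models (Eid et al., 2017).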
Expected Outcomes
Descriptive analyses show a clear correspondence between students’ grades and national test results and their PISA scores. However, the correlations between the nine achievement measures varied in size. We observed positive correlations between all performance measures of the national tests and grades (about .6–.8), with stronger correlations within subject domains (> .8). We did not observe this pattern when comparing PISA with the national tests and grades, indicating that the PISA test does not differentiate between the three domains in the way that national tests and grades do. In further analyses, we studied the dimensionality of the measures using factor-analytic models. The model with one common factor had poor fit, which improved consistently as nested domain- and test-specific factors were added. This finding provides further evidence that the different performance measures capture different facets of student achievement. Consistently high loadings on the general factor were found for all nine measures (> .7), indicating a general ability observed across all assessments and domains. More interestingly, however, the three achievement scores from PISA also load highly on the PISA-specific factor (all about .6), whereas the three national test scores have comparatively low loadings on the national test factor (all < .2). Thus, PISA measures something different from school grades, whereas national tests and school grades hardly differ. For the two domain-specific factors, we observe low loadings for the PISA scores (about .1), but significantly higher loadings for the national test measures (around .5) and grades (around .5). In summary, our empirical analyses show that PISA does not measure the same subject-specific competencies as the established performance measures in Sweden. In the presentation, we will discuss possible reasons for these findings and their implications for the use of PISA scores in educational discourse.
References
Dong, Y., & Peng, C. Y. J. (2013). Principled missing data methods for researchers. SpringerPlus, 2, 222. https://doi.org/10.1186/2193-1801-2-222
Eid, M., Geiser, C., Koch, T., & Heene, M. (2017). Anomalous results in G-factor models: Explanations and alternatives. Psychological Methods, 22(3), 541–562. https://doi.org/10.1037/met0000083
Frändberg, B., & Hagman, M. (2017). Med fokus på naturorienterande ämnen – En analys av samstämmighet mellan svenska styrdokument i NO och de internationella studierna TIMSS 2015 och PISA 2015 [Focus on the natural science subjects: An analysis of the agreement between Swedish curricula in science and the international assessments TIMSS 2015 and PISA 2015]. Stockholm: Skolverket.
Grek, S. (2009). Governing by numbers: The PISA ‘effect’ in Europe. Journal of Education Policy, 24(1), 23–37. https://doi.org/10.1080/02680930802412669
Harju-Luukkainen, H., Vettenranta, J., Ouakrim-Soivio, N., & Bernelius, V. (2016). Differences between students’ PISA reading literacy scores and grading for mother tongue and literature at school: A geostatistical analysis of the Finnish PISA 2009 data. Education Inquiry, 7(4), 463–479. https://doi.org/10.3402/edui.v7.29413
Hu, L.-t., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1–55. https://doi.org/10.1080/10705519909540118
Johansson, S., Klapp, A., & Rosén, M. (2019). Läsförståelse i PISA 2018 – Om relationen mellan läsförståelseuppgifterna i PISA och den svenska kursplanen [Reading literacy in PISA 2018: On the relation between the reading items in PISA and the Swedish syllabi]. Stockholm: Skolverket.
Lindblad, S., Pettersson, D., & Popkewitz, T. S. (2018). Numbers, education and the making of society: International assessments and its expertise. London: Routledge.
Sollerman, S. (2019). Kan man räkna med PISA och TIMSS?: Relevansen hos internationella storskaliga mätningar i matematik i en nationell kontext [Can one count on PISA and TIMSS? The relevance of international large-scale assessments of mathematics in a national context] (PhD diss.). Stockholms universitet.
von Davier, M., Gonzalez, E., & Mislevy, R. (2009). What are plausible values and why are they useful? In M. von Davier & D. Hastedt (Eds.), IERI monograph series: Issues and methodologies in large scale assessments (Vol. 2). IEA-ETS Research Institute.
Wiberg, M. (2019). The relationship between TIMSS mathematics achievements, grades, and national test scores. Education Inquiry, 10(4), 328–343. https://doi.org/10.1080/20004508.2019.1579626
Wiberg, M., & Rolfsman, E. (2019). The association between science achievement measures in schools and TIMSS science achievements in Sweden. International Journal of Science Education, 41(16), 2218–2232. https://doi.org/10.1080/09500693.2019.1666217