Cross-cultural Comparability of Scales on Adaptive Teaching and Differentiation: Evidence from Teaching and Learning International Survey 2018
Author(s):
Agnes Stancel-Piatak (presenting / submitting) Katrin Schulz-Heidorf (presenting)
Conference:
ECER 2017
Format:
Paper

Session Information

09 SES 12 A, Teaching and Teacher Characteristics: Findings from large-scale assessments

Paper Session

Time:
2017-08-25
09:00-10:30
Room:
W3.11
Chair:
Trude Nilsen

Contribution

The Teaching and Learning International Survey (TALIS) is the first international large-scale survey with a specific focus on the learning environment and working conditions of teachers in lower secondary schools. TALIS collects trend data on key educational issues every five years using a cross-sectional design (IEA, 2017). In this study we use multiple-group confirmatory factor analysis (MG-CFA) to test the cross-country measurement invariance of scales on differentiation and adaptive teaching in heterogeneous classrooms for the education systems participating in TALIS 2018. The presentation provides an overview of the methodological procedures and the resulting content-related implications. Recommendations for further research are derived from the analysis.

Adapting teaching to the needs of learners (e.g. by grouping students according to their learning levels or by tailoring instruction, feedback and material to the individual learner) has been found to play an essential role in advancing learning processes (Roth, 2009). Successful learning depends on the degree to which the subject taught is relevant to the individual student and whether it can be linked to existing knowledge. Differentiated, adapted settings that allow students to learn according to their interests and competencies, applying adequate learning techniques, might enhance student performance, especially in heterogeneous learning settings, and could reduce effects of social origin on achievement (Schulz-Heidorf, 2016). Although there have been attempts to implement adaptive teaching and differentiation in education systems, some countries are still struggling with challenges related to successfully integrating such methods into the daily practice of school teachers. In Germany, for example, the demand for differentiated and adapted teaching has been implemented only recently, and teachers have reported great difficulties with its realization in class (Solzbacher, 2008). An international comparison of the use of such teaching strategies, combined with further in-depth analyses, makes it possible to identify countries in which such didactical settings are more firmly established. Recommendations on how to enhance student learning can be derived from such comparisons. This is of particular interest for education policy, as it would indicate the potential of these strategies to reduce social disparities.

One major prerequisite for conducting international comparisons is the availability of a measure with specific quality properties concerning cross-country comparability (Meinck, Stancel-Piątak, Hastedt, & Sibberns, in press). Construct validity should be judged, among other criteria, with respect to measurement invariance across the groups being compared (Meredith, 1993; Rutkowski & Svetina, 2013). As previous analyses have shown (Schulz-Heidorf & Solheim, 2016), there is no common understanding of the concept of adapted teaching, which can make it difficult to compare measures collected in different countries. In this study, scales on adapted teaching and differentiation from the TALIS 2018 field trial are analyzed with respect to their cross-cultural measurement invariance. In the prior TALIS cycles (2008 and 2013), the scales were constructed using a linear measurement model. In this cycle, a categorical measurement model is used to establish partial measurement invariance (Elosua, 2011).

Method

In this research project, data from more than 45 countries in the TALIS 2018 field trial are used to analyze the measurement invariance of complex scales on differentiation and adaptive teaching in heterogeneous classrooms. The test applies categorical MG-CFA, in line with the measurement scale of the items (Elosua, 2011). Depending on how strictly the latent constructs are specified, measurement invariance is classified into four levels: the configural level, assuming an equal factor structure across groups; the metric level, additionally assuming equal factor loadings; the scalar level, additionally assuming equal item intercepts (thresholds in the categorical case); and the strict level, also assuming equivalent residuals. To conduct comparisons across groups, at least the scalar level of measurement invariance is required, meaning that the latent construct has the same meaning for survey respondents across the groups (Horn & McArdle, 1992). If measurement invariance can be established, the factor scores created from the complex scales allow for cross-country comparisons. If an identical measurement model cannot be established, the alternative is to establish partial invariance (Byrne, Shavelson, & Muthén, 1989). Traditional measurement invariance testing is based on the idea of "absolute invariance", where small deviations of the model lead to model rejection. This procedure tests the assumption that the latent construct is identical in all countries. Absolute measurement invariance can be achieved for smaller numbers of countries (Stancel-Piątak & Desa, 2014). However, according to Meinck et al. (in press), this assumption can be criticized as unrealistic and overstated when many culturally diverse countries are compared. Accordingly, studies often experience difficulties when aiming to establish absolute invariance (OECD, 2014; Schulz, 2009).
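The four levels can be summarized as increasingly restrictive constraints on a standard one-factor linear measurement model (a sketch; the categorical model used in this study constrains item thresholds rather than intercepts):

```latex
% Response of respondent i to item j in country g:
x_{ij}^{(g)} = \tau_j^{(g)} + \lambda_j^{(g)} \eta_i^{(g)} + \varepsilon_{ij}^{(g)}
% configural: same factor structure (pattern of loadings) in every group
% metric:     \lambda_j^{(g)} = \lambda_j            (equal loadings)
% scalar:     additionally \tau_j^{(g)} = \tau_j     (equal intercepts/thresholds)
% strict:     additionally \mathrm{Var}\!\big(\varepsilon_{ij}^{(g)}\big) = \theta_j  (equal residual variances)
```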
For cross-country comparisons, the overall question concerns the measurement accuracy required to treat the model as valid for such comparisons; in other words, how precise comparisons have to be in order to allow for useful practical conclusions. If the existence of an identical measurement model is questionable, the alternative is to establish partial invariance: the latent construct is expected to be similar across countries, but some differences in item parameters can be accommodated in the model (Byrne et al., 1989). Partial invariance allows selected parameters to vary across selected groups. In this study, partial invariance is applied to compute a cross-country comparable measurement instrument. However, depending on the amount of flexibility allowed in the parameter specification, the comparability of the scales might be limited.
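In practice, the level of invariance achieved is typically decided by comparing the fit of the nested models (configural, metric, scalar, strict). A minimal sketch in Python of that decision logic, using hypothetical CFI values and one common change-in-CFI convention (the cutoff and fit values are illustrative assumptions, not TALIS results; see Rutkowski & Svetina, 2013, for cutoffs adapted to many groups):

```python
# Sketch: pick the strictest invariance level supported by model fit.
# Each successive model adds equality constraints, so its CFI can only
# stay the same or drop; the level "holds" while the drop stays small.

LEVELS = ["configural", "metric", "scalar", "strict"]

def highest_invariance_level(cfi_by_level, delta_cfi_cutoff=-0.01):
    """Return the strictest level whose CFI drop relative to the
    previous (less constrained) model stays within the cutoff.
    cfi_by_level: dict mapping level name -> CFI of that model."""
    achieved = LEVELS[0]  # the configural model must fit to compare at all
    for prev, curr in zip(LEVELS, LEVELS[1:]):
        if curr not in cfi_by_level:
            break  # this model was not estimated
        if cfi_by_level[curr] - cfi_by_level[prev] < delta_cfi_cutoff:
            break  # fit deteriorates too much; stop at the previous level
        achieved = curr
    return achieved

# Hypothetical fit values for one scale across countries:
fit = {"configural": 0.975, "metric": 0.968, "scalar": 0.941}
print(highest_invariance_level(fit))  # -> metric
```

Here the scalar model loses 0.027 in CFI, so only metric invariance would be retained; partial invariance would then relax constraints on individual item parameters before re-testing.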

Expected Outcomes

We expect that scales on adaptive teaching will show lower levels of invariance across countries than those on general differentiation and forms of feedback. There are currently very heterogeneous definitions and understandings of adapted teaching, not only at the policy level but also among teachers, which might result in a lack of measurement invariance across countries. In contrast, there is greater consensus on general forms of differentiation and feedback, making them less open to individual interpretation; these scales might therefore show higher measurement conformity across countries. The scale validation method chosen for the first two rounds of TALIS (2008 and 2013) was CFA using a linear measurement model. Measurement invariance testing showed that those scales could not be assumed to be comparable across countries (Vieluf, 2010). However, with a categorical measurement model, latent traits have shown higher levels of invariance in prior research (Rutkowski & Svetina, 2013). We therefore expect the categorical measurement model to produce higher levels of invariance. If difficulties in establishing partial invariance arise, an alternative methodological approach could be considered in future research: approximate invariance (alignment modeling) (Asparouhov & Muthén, 2014). The underlying assumption is that the latent construct is very similar, but not identical, across countries. Methodologically, approximate invariance allows all parameters (factor loadings, means and residuals) to vary between all groups, but the degree to which the parameters can vary is limited to a range. Methods based on this assumption have already been incorporated into some statistical software packages (e.g. Mplus) using a Bayesian framework (alignment modeling; Asparouhov & Muthén, 2014).

References

Asparouhov, T., & Muthén, B. (2014). Multiple-group factor analysis alignment. Structural Equation Modeling: A Multidisciplinary Journal, 21(4), 495–508.
Byrne, B. M., Shavelson, R. J., & Muthén, B. (1989). Testing for the equivalence of factor covariance and mean structures: The issue of partial measurement invariance. Psychological Bulletin, 105(3), 456–466.
Elosua, P. (2011). Assessing measurement equivalence in ordered-categorical data. Psicologica: International Journal of Methodology and Experimental Psychology, 32(2), 403–421.
Fischer, C. (2014). Individuelle Förderung als schulische Herausforderung. Berlin: Friedrich-Ebert-Stiftung.
Horn, J. L., & McArdle, J. J. (1992). A practical and theoretical guide to measurement invariance in aging research. Experimental Aging Research, 18(3–4), 117–144.
IEA. (2017). Working in Partnership. Retrieved January 21, 2017.
Meinck, S., Stancel-Piątak, A., Hastedt, D., & Sibberns, H. (in press). Cross-national large-scale assessments in education: Methodological challenges, perspectives, and developments. In K. Schulz-Heidorf & J. Gerick (Eds.), Current perspectives and developments in large scale assessments. Tertium Comparationis, 23(1) [special issue]. Münster: Waxmann.
Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58(4), 525–543.
OECD. (2014). TALIS 2013 technical report. Paris: OECD.
Roth, G. (2009). Warum sind Lehren und Lernen so schwierig? In U. Herrmann (Ed.), Neurodidaktik. Grundlagen und Vorschläge für gehirngerechtes Lehren und Lernen (2nd ed., pp. 58–68). Weinheim: Beltz.
Rutkowski, L., & Svetina, D. (2013). Assessing the hypothesis of measurement invariance in the context of large-scale international surveys. Educational and Psychological Measurement.
Schulz, W. (2009). Questionnaire construct validation in the International Civic and Citizenship Education Study. IERI Monograph Series: Issues and Methodologies in Large-Scale Assessments, (2), 113–136.
Schulz-Heidorf, K. (2016). Individuelle Förderung im Unterricht: Eine Möglichkeit, soziale Herkunft und Schulerfolg zu entkoppeln? Eine Re-Analyse aus IGLU-E 2011. Berlin: epubli.
Schulz-Heidorf, K., & Solheim, O. J. (2016). Adapted teaching: A chance to reduce the effect of social origin? A comparison between Germany and Norway, using PIRLS 2011. Tertium Comparationis, 22(2), 230–259.
Solzbacher, C. (2008). Was denken Lehrerinnen und Lehrer über individuelle Förderung? Pädagogik, 60, 38–42.
Stancel-Piątak, A., & Desa, D. (2014). Methodological implementation of multi group multilevel SEM with PIRLS 2011: Improving reading achievement. In R. Strietholt, W. Bos, J.-E. Gustafsson, & M. Rosén (Eds.), Educational policy evaluation through international comparative assessments (pp. 75–93). Münster: Waxmann.
Vieluf, S. (2010). Construction and validation of scales and indices. In TALIS 2008 technical report (pp. 131–206). Paris: OECD.

Author Information

Agnes Stancel-Piatak (presenting / submitting)
IEA-DPC
Research and Analysis Unit
Hamburg
Katrin Schulz-Heidorf (presenting)
University of Hamburg
Evaluation of Educational Systems
Hamburg
