Session Information
09 ONLINE 30 B, Relating Individual Non-cognitive Factors to Student Achievement
Paper Session
MeetingID: 837 6293 3146 Code: A9Xnve
Contribution
Student surveys are commonly used to gather student perceptions of teaching quality and, subsequently, examine their association with student learning outcomes (Seidel & Shavelson, 2007). However, most studies using students’ perceptions to examine teaching quality are limited to one specific, mostly Western, country or setting (e.g., Senden et al., in press; Wisniewski et al., 2020), while the few studies that have examined student perceptions of teaching quality cross-nationally have done so primarily across a small number of countries (e.g., Scherer et al., 2016). Single-country studies are essential, as they can provide valuable insights into student perceptions of teaching quality and their relation to student outcomes. However, it is questionable whether findings from single-country studies can be used to make valid inferences about cross-national differences and similarities (van de Vijver, 2018).
International Large-scale Assessments (ILSAs) such as the Trends in International Mathematics and Science Study (TIMSS) provide a unique opportunity for cross-national comparisons of student perceptions of teaching quality, as they (1) assess multiple aspects of teaching quality; (2) assess those aspects across a considerable number of world regions, educational systems and cultures (hereinafter referred to as “countries”); (3) use weighting procedures to ensure nationally representative samples; (4) use identical items to assess constructs across all countries; and (5) provide extensive quality control procedures (Blömeke et al., 2016; van de Vijver, 2018).
However, for meaningful cross-national comparisons, the underlying measurement structure of the constructs should be stable, that is, measurement invariant (Davidov et al., 2018). A common approach to assessing measurement invariance is to conduct a multi-group confirmatory factor analysis (MGCFA), followed by systematically restricting parameters in the model to be invariant. Comparison of regression coefficients requires invariant factor loadings, referred to as metric invariance. The comparison of latent means requires the more stringent criterion of both invariant factor loadings and intercepts, referred to as scalar invariance (Millsap, 2011). When conducting an MGCFA across many groups (e.g., countries) with the goal of comparing latent means, the requirement of exact invariance is an almost unattainable ideal (Asparouhov & Muthén, 2014; Marsh et al., 2018). Applied research has consistently run into this problem when comparing teaching quality across a multitude of countries and has therefore mostly refrained from mean-score comparisons (Blömeke et al., 2016; Nilsen et al., 2016). With this severe limitation in mind, Asparouhov and Muthén (2014) have recommended another approach to measurement invariance for many groups, namely alignment optimization. This approach applies less strict constraints than the traditional MGCFA and can be used to estimate trustworthy group-specific factor means and variances (Asparouhov & Muthén, 2014; Davidov et al., 2018). The usefulness of this approach has been shown in several recent studies, including studies using International Large-scale Assessment data such as PISA or TIMSS (e.g., Glassow et al., 2021; Marsh et al., 2018; Odell et al., 2021).
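The stepwise constraint logic of the traditional MGCFA approach can be illustrated with a minimal Python sketch. The fit-index values and the change-in-CFI decision rule are generic illustrations of a common rule of thumb, not values or criteria taken from this study.

```python
# Sketch of the stepwise invariance decision: each level adds constraints,
# and a large drop in fit (here, in CFI) means the added constraints do
# not hold across groups. All numbers below are hypothetical.

def invariance_level(cfi_configural, cfi_metric, cfi_scalar, delta=0.01):
    """Return the highest invariance level supported, using the common
    rule of thumb that CFI should not drop by more than `delta` between
    successively constrained models."""
    if cfi_configural - cfi_metric > delta:
        return "configural"  # even factor loadings differ across groups
    if cfi_metric - cfi_scalar > delta:
        return "metric"      # loadings invariant, intercepts are not
    return "scalar"          # loadings and intercepts both invariant

# Example: metric invariance holds, scalar fails, so latent means
# cannot be compared with the exact-invariance approach.
print(invariance_level(0.95, 0.945, 0.92))  # prints "metric"
```

When only metric invariance holds, regression coefficients can still be compared across groups, but latent mean comparisons require either scalar invariance or an alternative such as alignment.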
Against this background, this study aims to investigate the comparability of student perceptions of two aspects of teaching quality across the 38 countries that participated with the eighth grade in the Trends in International Mathematics and Science Study (TIMSS) 2019. In addition, the study aims to conduct a cross-national mean-score comparison of the two aspects of teaching quality, followed by a cross-national evaluation of the impact of teachers’ gender on how students rate teaching quality.
Research questions:
- To what extent are student perceptions of teaching quality measurement invariant across a diverse set of countries?
- To what extent are there significant mean-score differences in student-perceived teaching quality across countries?
- To what extent does teachers’ gender influence student ratings of teaching quality?
Method
We used data from all countries that participated with the eighth grade in TIMSS 2019, except one: Singapore was excluded due to non-participation in one of the scales measuring teaching quality, leaving a total of 38 countries for this study. TIMSS 2019 assessed two aspects of teaching quality: disorderly behaviour in the classroom and instructional clarity. Using confirmatory factor analyses (CFA), we investigated the two-dimensional factor structure in every country and in the pooled sample, accounting for non-normality and non-independence of the data by using the TYPE=COMPLEX function and the MLR estimator. Next, we conducted measurement invariance analyses across countries using the traditional MGCFA approach with exact invariance constraints. We started by assessing whether the factorial structure could be applied across all 38 countries (configural invariance), after which we systematically restricted parameters to be invariant, starting with factor loadings (metric invariance), followed by intercepts (scalar invariance). Models were compared with regard to changes in goodness-of-fit indices. In cases where scalar, or even metric, invariance did not hold, we continued with the alignment method. The alignment method minimizes the total amount of non-invariance across groups and extracts trustworthy means and variances from the data by accounting for the non-invariance of parameters (Asparouhov & Muthén, 2014; Cieciuch et al., 2018). In addition, the output of the alignment analysis presents an overview of (non-)invariant factor loadings and intercepts for each item in every country, which can be used to calculate the percentage of non-invariant parameters (Muthén & Asparouhov, 2018). Muthén and Asparouhov (2014) propose a limit of 25% non-invariance as a rough rule of thumb for acquiring trustworthy means and recommend a Monte Carlo simulation study for higher percentages.
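The 25% rule of thumb amounts to a simple tally over the (non-)invariance flags in the alignment output. The sketch below uses hypothetical item and country labels, not the TIMSS scales or results.

```python
# Hypothetical sketch of the 25% rule: tally which item parameters were
# flagged as non-invariant in an alignment output and decide whether a
# Monte Carlo simulation is warranted before trusting the factor means.

def noninvariance_rate(flags):
    """`flags` maps (item, country) -> True if that parameter was flagged
    as non-invariant. Returns the share of non-invariant parameters."""
    return sum(flags.values()) / len(flags)

# Toy example: 2 items x 4 countries, 3 of 8 parameters flagged.
flags = {("item1", c): c in ("A", "B") for c in "ABCD"}
flags.update({("item2", c): c == "A" for c in "ABCD"})

rate = noninvariance_rate(flags)  # 3 / 8 = 0.375
print(rate, "run Monte Carlo" if rate > 0.25 else "means trustworthy")
```

Here 37.5% of parameters are non-invariant, exceeding the 25% limit, so the decision is to verify mean recovery by simulation rather than accept the aligned means directly.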
Finally, to investigate whether teachers’ gender affects students’ ratings of teaching quality, we extended the alignment model to an Alignment Structural Equation Model (AESEM). This allowed us to extend our analysis to a Multiple Indicators Multiple Causes (MIMIC) model by including three covariates (teachers’ gender, educational level, and years of experience) as predictors of student ratings of teaching quality. Our main interest in this model was the impact of teachers’ gender; the other covariates were primarily introduced as controls.
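As a rough, observed-score analogue of what the MIMIC model estimates for teacher gender on the latent factor, one can think of a standardized mean difference in ratings between students taught by male and female teachers. The sketch below uses simulated scores on a hypothetical rating scale, not study data, and ignores the latent-variable and clustering machinery of the actual model.

```python
import random
import statistics

# Simplified analogue of the gender effect estimated in the MIMIC model:
# a standardized mean difference (Cohen's d) in composite teaching-quality
# ratings. All scores below are simulated, not TIMSS results.
random.seed(7)
ratings_male = [random.gauss(3.0, 0.6) for _ in range(200)]
ratings_female = [random.gauss(3.1, 0.6) for _ in range(200)]

pooled_sd = statistics.pstdev(ratings_male + ratings_female)
cohens_d = (statistics.fmean(ratings_female)
            - statistics.fmean(ratings_male)) / pooled_sd
print(round(cohens_d, 2))  # positive values favour female teachers
```

The MIMIC model does this on the latent factor while partialling out the control covariates, which is why it can separate the gender effect from, for example, differences in teaching experience.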
Expected Outcomes
A two-factor model, including both the disorderly behaviour and instructional clarity scales, showed a good fit to the data. The traditional multigroup CFA with exact measurement invariance constraints showed acceptable changes in model fit indices between the configural and metric models, but not between the metric and scalar models. The subsequent alignment procedure showed that disorderly behaviour had, on average, 50.2% non-invariant parameters, whereas instructional clarity had, on average, 32% non-invariant parameters. The high amount of non-invariance, especially for the disorderly behaviour scale, prompted us to continue with a Monte Carlo simulation. The median sample size in our 38-country sample was 4,500 students, and all countries had sample sizes above 3,000 students. We therefore conducted 100 replications with sample sizes of 3,000, 4,000, 5,000, and 6,000. Simulated correlations between the estimated and generated data were sufficiently high across all sample sizes and above the recommended cut-off of 0.98 proposed by Muthén and Asparouhov (2018). Thus, we cautiously continued with a latent mean-score comparison. Finally, the effect of teachers’ gender on student ratings showed that students in Japan, Qatar, Romania, and England rated their male teachers as dealing significantly better with disorderly and disruptive behaviour in the classroom. On the other hand, students in Bahrain, Hong Kong, Iran, and the United Arab Emirates rated their female teachers as dealing significantly better with disorderly and disruptive behaviour. Overall, however, effect sizes across countries were small and non-significant. The average effect sizes for instructional clarity across countries indicated that students systematically rated their female teachers as providing more instructional clarity, with effect sizes being significant in six countries. There were no countries in which students rated their male teachers as providing significantly higher instructional clarity.
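The simulation check, correlating the factor means recovered from generated data with the generating values, can be sketched with a single toy replication; the numbers below are random draws, not TIMSS results, and a real study would repeat this across many replications and sample sizes.

```python
import random
import statistics

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x)
           * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

# One toy replication: true (generating) factor means for 38 countries
# versus noisy "estimated" means recovered from simulated data.
random.seed(1)
generated = [random.gauss(0, 1) for _ in range(38)]
estimated = [m + random.gauss(0, 0.05) for m in generated]

r = pearson(generated, estimated)
print(r >= 0.98)  # compare against the recommended cut-off
```

If the correlation stays at or above the 0.98 cut-off across replications and sample sizes, the aligned factor means are considered well recovered despite the non-invariance, which is the justification for proceeding to the mean-score comparison.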
References
André, S., Maulana, R., Helms-Lorenz, M., Telli, S., Chun, S., Fernández-García, C.-M., de Jager, T., Irnidayanti, Y., Inda-Caro, M., Lee, O., Safrina, R., Coetzee, T., & Jeon, M. (2020). Student Perceptions in Measuring Teaching Behavior Across Six Countries: A Multi-Group Confirmatory Factor Analysis Approach to Measurement Invariance. Frontiers in Psychology, 11. https://doi.org/10.3389/fpsyg.2020.00273
Asparouhov, T., & Muthén, B. (2014). Multiple-Group Factor Analysis Alignment. Structural Equation Modeling: A Multidisciplinary Journal, 21(4), 495-508. https://doi.org/10.1080/10705511.2014.919210
Avvisati, F., Le Donné, N., & Paccagnella, M. (2019). A meeting report: Cross-cultural comparability of questionnaire measures in large-scale international surveys. Measurement Instruments for the Social Sciences, 1(8), 1-10. https://doi.org/10.1186/s42409-019-0010-z
Davidov, E., Muthén, B., & Schmidt, P. (2018). Measurement Invariance in Cross-National Studies: Challenging Traditional Approaches and Evaluating New Ones. Sociological Methods & Research, 47(4), 631-636. https://doi.org/10.1177/0049124118789708
Marsh, H. W., Guo, J., Parker, P. D., Nagengast, B., Asparouhov, T., Muthén, B., & Dicke, T. (2018). What to do when scalar invariance fails: The extended alignment method for multi-group factor analysis comparison of latent means across many groups. Psychological Methods, 23(3), 524-545. https://doi.org/10.1037/met0000113
Nilsen, T., Gustafsson, J.-E., & Blömeke, S. (2016). Conceptual Framework and Methodology of This Report. In T. Nilsen & J.-E. Gustafsson (Eds.), Teacher Quality, Instructional Quality and Student Outcomes: Relationships Across Countries, Cohorts and Time (pp. 1-19). Springer Publishing & IEA. https://doi.org/10.1007/978-3-319-41252-8
Scherer, R., Nilsen, T., & Jansen, M. (2016). Evaluating Individual Students’ Perceptions of Instructional Quality: An Investigation of their Factor Structure, Measurement Invariance, and Relations to Educational Outcomes. Frontiers in Psychology, 7. https://doi.org/10.3389/fpsyg.2016.00110
Seidel, T., & Shavelson, R. J. (2007). Teaching Effectiveness Research in the Past Decade: The Role of Theory and Research Design in Disentangling Meta-Analysis Results. Review of Educational Research, 77(4), 454-499. https://doi.org/10.3102/0034654307310317
van de Vijver, F. J. R. (2018). Towards an Integrated Framework of Bias in Noncognitive Assessment in International Large-Scale Studies: Challenges and Prospects. Educational Measurement: Issues and Practice, 37(4), 49-56. https://doi.org/10.1111/emip.12227
Wisniewski, B., Zierer, K., Dresel, M., & Daumiller, M. (2020). Obtaining secondary students’ perceptions of teaching quality: Two-level structure and measurement invariance. Learning and Instruction, 66. https://doi.org/10.1016/j.learninstruc.2020.101303