Contribution
Theoretical background
This project was grounded in a theoretical approach that defines teacher effectiveness in terms of generic teaching behaviours observable in the classroom. This observational approach contrasts with definitions of teacher effectiveness based on value-added models that compare annual gains in student test scores. While classroom observation research on teacher effectiveness and teacher evaluation share the goal of assessing and improving the quality of teaching to enhance learning outcomes, international comparative research has long needed an instrument validated across countries (Teddlie et al., 2006). For classroom observation research, two issues concern measurement invariance.
The first issue is whether an observation instrument is applicable for measuring teaching behaviours across cultural contexts; this poses a real challenge for researchers because the accuracy and sensitivity of a measure may vary across national contexts (Maulana et al., 2020). Different instruments may look similar, but the data they yield are not readily comparable if their scoring and scales differ. Unfortunately, studies rarely compare the same lessons with different classroom observation instruments to examine whether the classroom characteristics they capture are similar (for exceptions see Ko, 2010; Ko & Li, 2020; Ko, Fong & Xie, 2019; Ko, Sammons, Maulana, Li & Kyriakides, 2019; Kington et al., 2014).
The second issue, which this project addressed, is measurement invariance across instruments. We compared the observational results of the Classroom Assessment Scoring System (CLASS) (Pianta et al., 2008), the key instrument used in the Measures of Effective Teaching (MET) project (Kane, 2013), with results from two newer instruments: the International Comparative Analysis of Learning and Teaching (ICALT) (Maulana et al., 2020) and the Comparisons on Effective Teaching and Inspiring Teaching (CETIT) (Ko & Li, 2020; Ko et al., 2019). CLASS is the most widely used observational tool in the USA, with some evidence of ecological validity outside the USA (e.g., Virtanen et al., 2018; Westergård, Ertesvåg & Rafaelsen, 2019). ICALT is an internationally validated instrument developed to explore generic teaching characteristics in different countries, while CETIT was developed purposively to compare effective and inspiring teaching.
By applying different instruments to the same lesson videos, we can examine measurement invariance across instruments, that is, which aspects of teaching captured by the different instruments are most closely related. For example, Positive Climate in CLASS, Safe and stimulating learning climate in ICALT, and Safe classroom climate in CETIT are likely to be similar. In contrast, because Flexibility and Teacher Reflectiveness in CETIT are theoretically associated with inspiring teaching and were found to cluster differently from the ICALT factors, we expected them to differ from the CLASS factors as well.
We also consider that the factors of a classroom observation instrument should predict learner engagement or student learning outcomes. Having data from different instruments on the same lesson videos allows us to identify their relative predictive power at both the scale and subscale levels.
Research Questions
Thus, the study had two research questions: 1) To what extent are different observational instruments comparable as indicators of teacher effectiveness? 2) Which factors of these instruments better predict student engagement?
Scientific significance
By comparing different classroom observation instruments on the same lessons, we found that conceptually similar factors are not necessarily closely related empirically. More importantly, how well factors predicted learner engagement was unrelated to their conceptual similarities. An instrument like CLASS that has shown ecological validity is not necessarily the best tool for measuring student engagement.
Method
The lessons selected in this study were drawn from the classroom observation videos of the MET project, which used multiple classroom observation measures, with CLASS as the major one, and involved three thousand teachers and approximately 10,000 students in six urban districts between 2009 and 2012. We selected lessons in proportion to the stanine distribution of the overall CLASS averages of the 14,000+ lessons in the MET project; for example, thirty-two lessons (4%) were drawn from the top or bottom stanine. Initially, 440 lessons were selected on the basis of the overall CLASS averages, but 17 lessons were excluded (used for training and calibration, or of low video quality), leaving 423 lesson videos for the new analysis with ICALT and CETIT.

Three observers rated the lessons independently after training on ICALT and CETIT. They completed three rounds of calibration with nine lessons until inter-rater reliability exceeded 90% before the secondary observations began, and the selected lessons were then assigned randomly to each observer. Regarding the measures, CLASS has 11 dimensions nested in 3 domains; ICALT has 32 indicators in 6 domains; and CETIT has 58 indicators in 13 factors. Learner engagement was assessed with three observable student-focused statements associated with ICALT.

Multiple regression analysis offered a comparative statistical approach for identifying which of the three instruments better predicts student engagement. Four multiple regression analyses were conducted with SPSS 26.0 to predict learner engagement from ICALT, CETIT, and CLASS factors: the first entered all ICALT, CETIT, and CLASS factors together as independent variables, while the latter three entered the ICALT, CETIT, and CLASS subscales or factors separately as independent variables.
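The analyses themselves were run in SPSS 26.0; purely as an illustration of the modelling logic, the sketch below shows how the four regressions could be reproduced in Python with pandas and statsmodels. The file name lesson_scores.csv, all column names, and the factor subsets are hypothetical placeholders, not the project's actual data layout.

```python
# Minimal sketch, assuming a hypothetical file "lesson_scores.csv" with one row per lesson,
# per-lesson factor means for the three instruments, and the ICALT-based engagement score.
# All column names below are illustrative; the actual analyses were run in SPSS 26.0.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("lesson_scores.csv")

icalt = ["icalt_climate", "icalt_organisation", "icalt_instruction",
         "icalt_activating", "icalt_adjusting", "icalt_strategies"]
cetit = ["cetit_enthusiasm", "cetit_purposeful", "cetit_safe_climate",
         "cetit_reflectiveness", "cetit_assessment"]              # subset of the 13 CETIT factors
class_ = ["class_positive_climate", "class_teacher_sensitivity"]  # subset of the 11 CLASS dimensions
engagement = "learner_engagement"

# Standardise all variables so the coefficients are comparable to reported beta weights.
df = (df - df.mean()) / df.std(ddof=0)

def fit_ols(predictors):
    """Fit an OLS regression of learner engagement on the given factor scores."""
    X = sm.add_constant(df[predictors])
    return sm.OLS(df[engagement], X, missing="drop").fit()

# Regression 1: all ICALT, CETIT, and CLASS factors entered together.
print(fit_ols(icalt + cetit + class_).summary())

# Regressions 2-4: each instrument's factors entered separately.
for name, block in [("ICALT", icalt), ("CETIT", cetit), ("CLASS", class_)]:
    model = fit_ols(block)
    print(name, model.params.round(2).to_dict(), "R2 =", round(model.rsquared, 2))
```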
Expected Outcomes
The overall variations between CLASS and either ICALT or CETIT were larger than those between ICALT and CETIT, suggesting greater similarity between the latter two instruments. However, variations within instruments were noticeably larger. For example, among the six ICALT factors, Adjusting instructions and learner processing to inter-learner differences was only mildly correlated with three other ICALT factors: Safe and stimulating learning climate (r=0.22), Efficient organisation (r=0.23), and Clear and structured instruction (r=0.25). Among the 13 CETIT factors, variations between factors were also considerable; for example, the association between Enthusiasm for teaching and Flexibility was weak.

When all ICALT, CETIT, and CLASS factors were included in the multiple linear regression, none of the CLASS factors significantly predicted learner engagement, despite their close correlations with it (r=0.44-0.64). Factors that did predict learner engagement included Efficient organisation (β=0.26) and Intensive and activating teaching (β=0.11) in ICALT, as well as Enthusiasm for teaching (β=0.45), Purposeful and relevant teaching (β=0.11), and Safe classroom climate (β=0.15) in CETIT. When only ICALT factors were included, Safe and stimulating learning climate, Efficient organisation, and Intensive and activating teaching predicted learner engagement. Among the CETIT factors, Enthusiasm for teaching, Purposeful and relevant teaching, Safe classroom climate, Reflectiveness, and Assessment for learning predicted learner engagement more strongly than the other factors. When only the 11 CLASS dimensions were included, only Teacher sensitivity (β=0.08) significantly predicted learner engagement.

These results indicate that ICALT and CETIT are similar instruments, and that factors showing strong associations with student engagement, like the CLASS factors, do not necessarily have strong predictive power. The strong association between teacher enthusiasm and learner engagement suggests that teachers motivated students through their enthusiasm more than through their knowledge and skills.
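The cross-instrument comparisons summarised above amount to correlating factor scores from the three instruments over the same lessons. A minimal sketch of that step, under the same assumed data layout as in the Method sketch, could look like this:

```python
# Sketch of the cross-instrument comparison; "lesson_scores.csv" and the column
# prefixes are the same hypothetical placeholders used in the Method sketch.
import pandas as pd

df = pd.read_csv("lesson_scores.csv")

icalt_cols = [c for c in df.columns if c.startswith("icalt_")]
cetit_cols = [c for c in df.columns if c.startswith("cetit_")]
class_cols = [c for c in df.columns if c.startswith("class_")]

# Cross-instrument correlation blocks: rows = ICALT factors, columns = CETIT or CLASS factors.
icalt_x_cetit = df[icalt_cols + cetit_cols].corr().loc[icalt_cols, cetit_cols]
icalt_x_class = df[icalt_cols + class_cols].corr().loc[icalt_cols, class_cols]

print(icalt_x_cetit.round(2))
print(icalt_x_class.round(2))
```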
References
Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50(1), 1-73.
Kington, A., Sammons, P., Regan, E., Brown, E., & Ko, J. (2014). Effective classroom practice. Maidenhead: Open University Press.
Ko, J. Y. O. (2010). Consistency and variation in classroom practice: A mixed-method investigation based on case studies of four EFL teachers of a disadvantaged secondary school in Hong Kong (Doctoral dissertation, University of Nottingham).
Ko, J., & Li, W. L. (2020, April). Effective teaching and inspiring teaching in different learning environments: Evidence from cluster analysis and SEM. American Educational Research Association Annual Meeting, San Francisco.
Ko, J., Fong, K. M. Y., & Xie, Q. (2019, August). Effective and inspiring teaching in math and science classrooms: Evidence from systematic classroom observation and implications on STEM education. Paper presented at the World Education Research Association 2019 Focal Meeting, Tokyo, Japan.
Ko, J., Sammons, P., Maulana, R., Li, W., & Kyriakides, L. (2019, April). Identifying inspiring versus effective teaching: How do they link and differ? The 2019 Annual Meeting of the American Educational Research Association (AERA), Toronto, Canada.
Ko, J., & Sammons, P. (2013). Effective teaching: A review of research and evidence. CfBT Education Trust.
Maulana, R., André, S., Helms-Lorenz, M., Ko, J., Chun, S., Shahzad, A., ... & Fadhilah, N. (2020). Observed teaching behaviour in secondary education across six countries: Measurement invariance and indication of cross-national variations. School Effectiveness and School Improvement, 1-32.
Pianta, R. C., La Paro, K. M., & Hamre, B. K. (2008). Classroom Assessment Scoring System™: Manual K-3. Paul H Brookes Publishing.
Teddlie, C., Creemers, B., Kyriakides, L., Muijs, D., & Yu, F. (2006). The international system for teacher observation and feedback: Evolution of an international study of teacher effectiveness constructs. Educational Research and Evaluation, 12(6), 561-582.
Virtanen, T. E., Pakarinen, E., Lerkkanen, M. K., Poikkeus, A. M., Siekkinen, M., & Nurmi, J. E. (2018). A validation study of Classroom Assessment Scoring System–Secondary in the Finnish school context. The Journal of Early Adolescence, 38(6), 849-880.
Westergård, E., Ertesvåg, S. K., & Rafaelsen, F. (2019). A preliminary validity of the Classroom Assessment Scoring System in Norwegian lower-secondary schools. Scandinavian Journal of Educational Research, 63(4), 566-584.