Session Information
09 SES 13 A, Exploring Students’ Civic Knowledge, ICT Competencies and (Further) 21st Century Skills
Paper Session
Contribution
One of the most ambitious goals in modern education is the assessment of 21st century skills, including communication and cooperation, since there is evidence that success depends largely on the ability to act and communicate effectively in various situations (Michelban, 2009; Kyllonen, 2012). However, there is a noticeable gap between how this importance is declared and how these skills actually function in educational practice (Mullis et al., 2017). Valid assessment tools are needed to determine how well these skills are developed.
A variety of communication and cooperation assessment tools exist, differing in their definitions, operationalizations, and task forms (Evans, 2020). For instance, several assessment tools are suitable for cross-cultural research (OECD, 2017; Griffin & Care, 2015). However, most systems target high school and university students, whereas assessment in primary and middle school is underrepresented (Evans, 2020). Moreover, there is no validated tool for measuring communication and cooperation in Russia.
On the basis of operational definitions and conceptual models, a measurement set was developed for two target audiences: Russian schoolchildren in the fourth and seventh grades.
The communication construct included six groups of measurable skills: conceptualizing the message at the levels of idea and perception; understanding the context and the information about the interlocutor; expressing communicative intention; and analyzing discourse and correcting it where necessary.
The cooperation construct included the formation of a common goal, the establishment of mutually binding roles in the team, the ability to provide mutual support, and knowledge of social behavior norms.
Researchers highlight that such complex skills require an assessment framework that goes beyond traditional multiple-choice items (Hao et al., 2019). The framework should be flexible and authentic to capture the complex structure of these skills. Computerized performance-based tasks (CPBTs) provide an environment in which a set of complex skills can be represented in samples of observable behavior (Liu et al., 2016). However, such a highly flexible assessment environment poses challenges for psychometricians. One issue that arises in the lifelike environment of CPBTs is differential item functioning (DIF).
DIF is a difference in item performance: students from different groups have different probabilities of responding correctly to an item even though they have equal levels of the measured ability. In this study, we examine the gender fairness of the assessment through DIF analysis, which is useful for validating an instrument (Walker, 2011). However, it is not enough to report that tasks function differently across groups; there must be a theoretical reason why this happens. Meta-analyses (Anderson & Leaper, 1998; Eagly & Crowley, 1986) suggest that the magnitude and direction of gender differences depend on the communication context. For instance, if a task calls for heroic helping, boys display it more actively (Eagly & Crowley, 1986). Thus, we hypothesize that the CPBTs will exhibit gender-related DIF; a formal statement of DIF is sketched after the research questions below. We focus on the following research questions:
Do the CPBTs for communication and cooperation assessment demonstrate gender-related DIF among fourth-grade students?
Do the CPBTs for communication and cooperation assessment demonstrate gender-related DIF among seventh-grade students?
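To make the hypothesis precise, DIF for a dichotomous indicator can be stated formally as follows (the notation is ours, added for illustration; it is not part of the original instrument description). An indicator i is free of gender DIF when, for every level of the underlying ability theta,

```latex
\Pr(X_i = 1 \mid \theta, \text{girls}) \;=\; \Pr(X_i = 1 \mid \theta, \text{boys}) \quad \text{for all } \theta .
```

DIF is present when this equality fails for some theta; a mere difference in the groups' ability distributions does not by itself constitute DIF.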
Method
The sample consists of 766 fourth-grade students (9–11 years old) and 559 seventh-grade students (13–14 years old) from two cities in Russia's central region. Students were tested in their schools with an administrator present. Each student was provided with a computer and had 45 minutes to solve the tasks.

Test data were collected through CPBTs aimed at measuring the communication and cooperation skills of fourth- and seventh-grade children. The CPBTs were developed following Evidence-Centered Design (Mislevy, Almond, & Lukas, 2003) and psychometrically tested for validity. The instrument included three tasks for fourth-grade children and three tasks for seventh-grade children, in which respondents interacted with computer-simulated agents differing in gender and communication style in order to solve a problem. Task problems were set in real-life or fantastic contexts. For example, in one task the students prepared a school play with classmates; in another, they became part of a spaceship crew and interacted with aliens. In each task, the students chose among predefined response options for communicating with the agents, so all test-takers' actions were predefined and treated as indicators of communication and cooperation components. For fourth-grade students, the CPBTs included 35 indicators of communication and 42 indicators of cooperation; for seventh-grade students, 27 indicators of communication and 42 indicators of cooperation.

To address the research questions, multigroup confirmatory factor analysis (MGCFA) was performed to test the measurement invariance of the CPBTs with respect to gender. Following Chen (2007), invariance was tested at three levels (configural, metric, and scalar) by comparing nested models on the difference in the CFI statistic; a level of invariance was considered achieved when the CFI decreased by 0.01 or less. We also examined whether gender was a significant predictor of communication and cooperation skills in the CPBTs.
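As an illustration of the invariance-testing logic described above, the following is a minimal sketch of the Chen (2007) decision rule; the function name and fit values are hypothetical and this is not the authors' analysis code, which would fit the nested MGCFA models in an SEM package:

```python
# Sketch of the delta-CFI decision rule (Chen, 2007) described above.
# The CFI values are placeholders: in practice they come from fitting
# nested multigroup CFA models (configural -> metric -> scalar).

DELTA_CFI_CUTOFF = 0.01  # invariance holds if CFI drops by at most .01

def invariance_level(cfi_configural: float, cfi_metric: float,
                     cfi_scalar: float) -> str:
    """Return the highest level of measurement invariance supported."""
    if cfi_configural - cfi_metric > DELTA_CFI_CUTOFF:
        return "configural only"  # factor loadings differ across groups
    if cfi_metric - cfi_scalar > DELTA_CFI_CUTOFF:
        return "metric"           # loadings equal, intercepts differ
    return "scalar"               # latent means are comparable across groups

# Example with made-up fit statistics:
print(invariance_level(0.957, 0.951, 0.949))  # -> scalar
```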
Expected Outcomes
Overall, the CPBTs demonstrate similar psychometric characteristics for boys and girls. The variety of task contexts and the heterogeneity of the computer-simulated agents provide the opportunity for a valid and fair assessment of different students. However, the patterns of gender-related DIF differ between fourth and seventh graders. This may be related both to differences in the tasks, since the CPBTs for seventh graders are more complex and realistic, and to differences in the nature of gender identity, since gender segregation increases in middle school (Bussey, 2011). The paper argues that psychometric studies cannot be separated from the sociocognitive frames that unavoidably influence assessment (Mislevy, 2018). We demonstrate how psychometric models can help answer questions about the varying sources affecting students' behavior.
References
Anderson, K. J., & Leaper, C. (1998). Meta-analyses of gender effects on conversational interruption: Who, what, when, where, and how. Sex Roles, 39, 225–252.
Bussey, K. (2011). Gender identity development. In Handbook of identity theory and research (pp. 603–628). New York, NY: Springer.
Chen, F. F. (2007). Sensitivity of goodness of fit indexes to lack of measurement invariance. Structural Equation Modeling: A Multidisciplinary Journal, 14(3), 464–504.
Eagly, A. H., & Crowley, M. (1986). Gender and helping behavior: A meta-analytic review of the social psychological literature. Psychological Bulletin, 100, 283–308.
Evans, C. M. (2020). Measuring student success skills: A review of the literature on collaboration. Dover, NH: National Center for the Improvement of Educational Assessment.
Griffin, P., & Care, E. (2015). The ATC21S method. In Assessment and teaching of 21st century skills (pp. 3–33). Dordrecht: Springer.
Hao, J., Liu, L., Kyllonen, P., Flor, M., & von Davier, A. A. (2019). Psychometric considerations and a general scoring strategy for assessments of collaborative problem solving. ETS Research Report Series, 2019(4), i–17.
Kyllonen, P. (2012). Measurement of 21st century skills within the Common Core State Standards.
Liu, L., Hao, J., von Davier, A. A., Kyllonen, P., & Zapata-Rivera, J. (2016). A tough nut to crack: Measuring collaborative problem solving. In Y. Rosen, S. Ferrara, & M. Mosharraf (Eds.), Handbook of research on technology tools for real-world skill development (pp. 344–359). IGI Global.
Michelban, B. (2009). Effective communication: The key to career success and great leadership. Journal of Healthcare Protection Management, 25(1), 9–13.
Mislevy, R. J. (2018). Sociocognitive foundations of educational measurement. New York, NY: Routledge.
Mislevy, R. J., Almond, R. G., & Lukas, J. F. (2003). A brief introduction to evidence-centered design. ETS Research Report Series, 2003(1), i–29.
Mullis, I. V. S., Martin, M. O., Foy, P., & Hooper, M. (2017). PIRLS 2016 international results in reading. Chestnut Hill, MA: Boston College.
OECD (2017). PISA 2015 collaborative problem-solving framework. Retrieved from https://www.oecd.org/pisa/pisaproducts/
Walker, C. M. (2011). What's the DIF? Why differential item functioning analyses are an important part of instrument development and validation. Journal of Psychoeducational Assessment, 29(4), 364–376.