Session Information
09 ONLINE 29 B, Trials of New Assessment Methods in Post-secondary Education
Paper Session
MeetingID: 859 3745 6622 Code: Z5HZrz
Contribution
There is broad consensus within and across several disciplines that the knowledge students acquire in higher education should be assessed using reliable and valid tests. Such tests should be based on the validity argument that at least part of the knowledge measured in the tests has been imparted in the course of study (Mislevy, 2018). Consequently, greater study progress should lead to better results in knowledge tests. In economics, too, several standardized tests in closed-ended formats have been developed to assess students’ knowledge in various content areas, e.g., the German WiwiKom test (Zlatkin-Troitschanskaia et al., 2019), which adapted the US Test of Economic Literacy (Walstad et al., 2013) and the US Test of Understanding in College Economics (Walstad et al., 2007). Economic knowledge comprises linguistic and visual mental representations of economic concepts, which are embedded in the curriculum areas of a study program (e.g., accounting, macroeconomics; Davies & Mangan, 2007; Dochy et al., 1991).
In addition to a traditional analysis of the correlations between test scores and indicators of study progress (e.g., study term), the use of response process data collected while students solve test tasks allows for an in-depth analysis of their mental states during task-solving (Ercikan & Pellegrino, 2017). Response processes are a substantive criterion of validity, reflecting the extent to which empirical evidence of mental states during task-solving is related to construct-relevant features (e.g., students’ study progress) and test scores (AERA et al., 2014).
Since eye tracking provides insight into the spatiotemporal structure of test processing based on visual indicators (e.g., fixation duration), it is increasingly applied in the analysis of standardized tests in various disciplines (Han et al., 2017; Lindner et al., 2014; Tsai et al., 2012). Current eye tracking-based test validations often focus on correlations of the score (correct vs. wrong) and other test-immanent features (e.g., response options in a multiple-choice item) with eye tracking metrics (e.g., fixation duration on a response option). Investigations that take further external validation criteria, such as study progress, into account are rare. Therefore, this paper presents the results of an eye tracking-based test validation that, in addition to the test scores, includes further construct-relevant validation criteria describing study progress, to allow a better understanding of students’ response behavior in frequently used economic tests.
Eye tracking research is based on two general assumptions. The immediacy assumption states that gaze behavior provides insights into learners’ cognitive activities (Just & Carpenter, 1980), i.e., cognitions that occur during an action such as task-solving. The eye-mind assumption states that the learner’s gaze, directed at a particular object at a particular time, provides an indication of the learner’s attention and, indirectly, of the information processing taking place (Holmqvist et al., 2011). For validation purposes, Lindner et al. (2014) applied the so-called “gaze bias effect” to closed-ended test formats and found that a relatively long fixation duration on a response option (distractor/attractor) is positively correlated with the students’ selection of this option. Assuming that the knowledge to be measured increases over the course of studies, advanced students should be more likely to select the correct response option (attractor), and this should be reflected in a longer fixation duration on the attractor. Conversely, less advanced students should tend to select wrong response options (distractors). Based on prior research, the following hypotheses for an economic test are examined in this paper:
H1: The greater the study progress, the higher the test scores.
H2: The greater the study progress, the longer the fixation duration on the attractor.
H3: The greater the study progress, the shorter the fixation duration on the distractors.
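Each hypothesis predicts the sign of an association between study progress and a second variable, so a first check is one correlation per hypothesis. The following Python sketch illustrates this with a hand-written Pearson coefficient. The function name `pearson_r`, the variable names and all numbers are made up for illustration; they are not data or code from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for five imaginary students (NOT data from the study).
semesters = [1, 2, 3, 5, 7]                      # indicator of study progress
test_score = [9, 12, 11, 16, 20]                 # aggregated test score
attractor_fix_s = [6.0, 7.5, 7.0, 9.0, 11.0]     # fixation time on attractors
distractor_fix_s = [14.0, 12.0, 12.5, 9.0, 7.0]  # fixation time on distractors

r_h1 = pearson_r(semesters, test_score)        # H1 predicts r > 0
r_h2 = pearson_r(semesters, attractor_fix_s)   # H2 predicts r > 0
r_h3 = pearson_r(semesters, distractor_fix_s)  # H3 predicts r < 0

print(f"H1: r={r_h1:.2f}  H2: r={r_h2:.2f}  H3: r={r_h3:.2f}")
```

Aggregated correlations of this kind ignore that items are nested within students, which is why an item-level analysis is a natural second step.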
Method
To assess students’ economic knowledge, the short version of the WiwiKom-test, consisting of 25 items focusing on verbal representations, was used (Zlatkin-Troitschanskaia et al., 2019). Moreover, a newly developed and validated graph-test consisting of 15 items was used to capture graphical representations in economics. Both closed-format tests were presented on a 22-inch computer screen (1920x1080 pixels). We analyze the test scores as well as the students’ fixation duration, which provides insights into the cognitive processing of information in specific areas of the test items. These so-called Areas of Interest include the item stem, the attractor and the distractors. Eye tracking was conducted using a stationary Tobii X3-120 eye tracker (120 Hz) placed below the monitor. Fixations were measured on a millisecond basis using the identification by velocity threshold (I-VT) filter with a threshold of 30°/s of visual angle.
Since economics represents a sub-area of partly overlapping business and economics topics in undergraduate study programs (Dochy et al., 1991), this differentiation should be taken into account when modeling study progress. The number of completed economics courses is a suitable indicator for investigating the selection of the attractor of an economics item. The number of completed semesters or accrued credit points provides an insight into the general understanding of selected concepts within the study domain. Thus, a questionnaire was used to collect multiple indicators of study progress and further personal characteristics. We also assessed students’ intelligence (IST2000R, Liepmann et al., 2007), their interest in economics and sociodemographic data (e.g., school education, school leaving grade).
A total of 53 economics education students (27 female; age: M=24.3, SD=3.02; WiwiKom-test score: M=14.15, SD=5.54, max. 25 points; graph-test score: M=6.91, SD=2.60, max. 15 points) with varying levels of study progress (semester: M=3.84, SD=2.49) participated in the study. Following correlation analyses based on aggregated test scores per participant, a single-item processing analysis was performed that takes into account the nested data structure (items within students) and uses multilevel models with crossed random effects (the item score being the dependent variable; Rabe-Hesketh & Skrondal, 2012). This corresponds to 1,325 response processes (53x25) for the WiwiKom-test and 795 response processes (53x15) for the graph-test. The average total fixation duration on the WiwiKom-test was 12.73 minutes (minimum=7.59 minutes; maximum=20.27 minutes), which results in an average processing time of 30.55 seconds per task; for the graph-test it was 14.11 minutes (minimum=7.27 minutes; maximum=23.27 minutes), with an average processing time of 56.44 seconds per task.
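The core idea of velocity-threshold identification can be sketched in a few lines of Python: consecutive gaze samples whose point-to-point velocity stays below the threshold are grouped into one fixation. This is a simplified illustration, not the filter built into the eye tracker software, which typically adds steps such as gap filling, noise reduction and merging of adjacent fixations. The function name `ivt_fixations` and the assumption that gaze positions are already given in degrees of visual angle are introduced here for illustration; only the 120 Hz sampling rate and the 30°/s threshold come from the text.

```python
import math


def ivt_fixations(x_deg, y_deg, rate_hz=120.0, threshold=30.0):
    """Group gaze samples into fixations with a plain velocity threshold.

    x_deg, y_deg: gaze position per sample in degrees of visual angle
    (assumed; raw tracker output would need to be converted first).
    Returns a list of (first_sample, last_sample, duration_ms) tuples.
    """
    if len(x_deg) != len(y_deg):
        raise ValueError("x_deg and y_deg must have the same length")
    n = len(x_deg)
    dt = 1.0 / rate_hz  # time between two samples in seconds
    fixations = []
    start = None  # index of the first sample of the current fixation
    for i in range(n):
        if i == 0:
            velocity = 0.0  # the first sample has no predecessor
        else:
            # Euclidean distance in degrees approximates the angular
            # distance for the small steps between neighbouring samples.
            step = math.hypot(x_deg[i] - x_deg[i - 1], y_deg[i] - y_deg[i - 1])
            velocity = step / dt
        if velocity < threshold:
            if start is None:
                start = i
        elif start is not None:
            # A fast sample ends the running fixation.
            fixations.append((start, i - 1, (i - start) * dt * 1000.0))
            start = None
    if start is not None:
        fixations.append((start, n - 1, (n - start) * dt * 1000.0))
    return fixations


# Synthetic trace (not study data): 12 samples at one spot, then a
# 5-degree jump followed by 6 samples at the new spot.
xs = [0.0] * 12 + [5.0] * 6
ys = [0.0] * 18
for first, last, duration_ms in ivt_fixations(xs, ys):
    print(first, last, round(duration_ms, 1))
```

Summing the durations of all fixations that fall inside one Area of Interest would then give the fixation duration on the item stem, the attractor or a distractor.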
Expected Outcomes
Regarding H1, we found significant positive correlations between the number of semesters completed and the aggregated test scores (WiwiKom-test: r=.45, p<.01; graph-test: r=.42, p<.05). Among the different business and economics courses, ANOVAs showed only for macroeconomics courses that students with more than one successfully completed course achieved higher test scores. On average, these students correctly solved about eight more items in the WiwiKom-test (0 courses=10; 1 course=13.95; >1 courses=18.22; p<.05) and almost four more items in the graph-test (0 courses=5.00; 1 course=6.81; >1 courses=8.78; p<.05) than students with no completed macroeconomics course. Regarding H2, the number of semesters completed had no significant correlation with the fixation duration on the attractor. For the WiwiKom-test, students who had successfully completed more than one macroeconomics course had, on average, a significantly longer fixation duration on the attractor (11 s) than students with fewer completed macroeconomics courses (7 s; p<.05). Regarding H3, a higher number of completed semesters correlated with a significantly shorter fixation duration on the distractors and, on average, a shorter fixation duration on the distractors correlated with a higher total score for the WiwiKom-test (r=-.41, p<.05) and the graph-test (r=-.33, p<.05). An effect reported in prior research (Klein et al., 2019), namely that a shorter fixation duration on all items is correlated with higher test scores, was not found. In the main paper, a multilevel analysis based on the single-item approach is presented that provides further insights into the identified correlations (e.g., whether a change in fixations becomes evident for certain item features as a function of study progress). These and further findings are discussed in terms of their possible explanations, study limitations (e.g., sampling) and implications for further research.
The main potentials and challenges of using eye-tracking analyses for test validation purposes in economic education are outlined, and some practical implications (e.g., pre-post-assessments) are provided.
References
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. AERA, APA & NCME.
Davies, P., & Mangan, J. (2007). Threshold concepts and the integration of understanding in economics. Studies in Higher Education, 32(6), 711–726. https://doi.org/10.1080/03075070701685148
Dochy, F. J., Valcke, M. M., & Wagemans, L. J. (1991). Learning economics in higher education: An investigation concerning the quality and impact of expertise. Higher Education in Europe, 16(4), 123–136. https://doi.org/10.1080/0379772910160413
Ercikan, K., & Pellegrino, J. W. (Eds.). (2017). NCME applications of educational measurement and assessment book series. Validation of score meaning for the next generation of assessments: The use of response processes. Routledge.
Han, J., Chen, L., Fu, Z., Fritchman, J., & Bao, L. (2017). Eye-tracking of visual attention in web-based assessment using the Force Concept Inventory. European Journal of Physics, 38(4), 45702. https://doi.org/10.1088/1361-6404/aa6c49
Holmqvist, K., Nyström, M., Andersson, R., Dewhurst, R., Jarodzka, H., & van de Weijer, J. (2011). Eye tracking: A comprehensive guide to methods and measures (1st ed.). Oxford University Press.
Just, M. A., & Carpenter, P. A. (1980). A theory of reading: From eye fixations to comprehension. Psychological Review, 87(4), 329–354.
Liepmann, D., Beauducel, A., Brocke, B., & Amthauer, R. (2007). Intelligenz-Struktur-Test 2000 R. Hogrefe.
Lindner, M. A., Eitel, A., Thoma, G.-B., Dalehefte, I. M., Ihme, J. M., & Köller, O. (2014). Tracking the decision-making process in multiple-choice assessment: Evidence from eye movements. Applied Cognitive Psychology, 28(5), 738–752. https://doi.org/10.1002/acp.3060
Mislevy, R. J. (2018). Socio-cognitive foundations of educational measurement. Routledge.
Rabe-Hesketh, S., & Skrondal, A. (2012). Multilevel and longitudinal modeling using Stata. Volume 1: Continuous responses (3rd ed.). Stata Press.
Tsai, M.-J., Hou, H.-T., Lai, M.-L., Liu, W.-Y., & Yang, F.-Y. (2012). Visual attention for solving multiple-choice science problem: An eye-tracking analysis. Computers & Education, 58(1), 375–385. https://doi.org/10.1016/j.compedu.2011.07.012
Walstad, W. B., Watts, M., & Rebeck, K. (2007). Test of understanding in college economics: Examiner's manual (4th ed.). National Council on Economic Education.
Walstad, W. B., Rebeck, K., & Butters, R. B. (2013). The test of economic literacy: Development and results. The Journal of Economic Education, 44(3), 298–309. https://doi.org/10.1080/00220485.2013.795462
Zlatkin-Troitschanskaia, O., Jitomirski, J., Happ, R., Molerov, D., Schlax, J., Kühling-Thees, C., Förster, M., & Brückner, S. (2019). Validating a test for measuring knowledge and understanding of economics among university students. Zeitschrift für Pädagogische Psychologie, 33(2), 119–133. https://doi.org/10.1024/1010-0652/a000239