Session Information
09 SES 03 A, Comparing Computer- and Paper-Based Assessment
Paper Session
Contribution
To benefit from the possibilities of technology-based testing, an existing paper-based assessment (PBA) needs to be transferred to a computer-based assessment (CBA). In longitudinal studies, such as the National Educational Panel Study (NEPS; Blossfeld, Roßbach, & von Maurice, 2011) in Germany, the comparability of ability estimates measured over time is a fundamental requirement for valid interpretations of change scores and for precise comparisons of ability distributions between cohorts. Hence, the replacement of PBA with CBA must be prepared carefully, and the consequences of the mode change need to be investigated.
Previous research has revealed heterogeneous mode effects that are not predictable without empirical investigation (e.g., Wang et al., 2008). The risk of mode effects differs between domains and increases with item complexity; the response format is therefore a possible predictor of mode effects, as its complexity may differ between modes (e.g., Heerwegh & Loosveldt, 2002). Assignment tasks, for example, are of comparatively high complexity. They are typically used in reading tests when given headings have to be assigned to paragraphs of a text. Assignment tasks can be computerized using so-called combo boxes (or drop-down boxes), and in this format they were found to be more difficult than the corresponding paper-based tasks (Heerwegh & Loosveldt, 2002). Moreover, previous findings suggest that reading tests are more susceptible to mode effects when scrolling in longer texts and navigation between tasks within a unit are required (e.g., Poggio, Glasnapp, Yang, & Poggio, 2005; Pommerich, 2004).
The ongoing transition from PBA to CBA in the NEPS is accompanied by experimental mode effect studies designed to learn more about whether it makes a difference if a reading test is taken on computer or on paper. For this presentation, we analyze data from two reading tests (for more details see Gehrer, Zimmermann, Artelt, & Weinert, 2013) for different grades (seven and twelve) that were computerized and administered in a between-subjects design in which students were randomly assigned to modes. In addition, each student completed a common PBA reading test from a lower grade as well as a test of basic computer skills (BCS); both served as external criteria to inspect construct equivalence.
To evaluate mode effects, appropriate equivalence criteria need to be derived from the intended use of test scores and test score interpretations (Buerger, Kroehne, & Goldhammer, 2016). Therefore, the following research questions were investigated for each test: Do CBA and PBA measure the same underlying construct? Is reliability equal between modes? Are the item parameters invariant between modes? Is there a homogeneous shift in item difficulty on computer? Can mode effects be explained by item properties such as the response format or navigation requirements?
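To illustrate one way such questions about item parameter invariance could be examined, the sketch below fits a Rasch model separately in each mode with the TAM package (Kiefer, Robitzsch, & Wu, 2015) listed in the references and compares the resulting item difficulties. This is only a minimal sketch under assumed data structures: the scored response matrix resp and the mode indicator mode are hypothetical names and do not refer to the NEPS data files or to the analyses actually reported.

# Minimal sketch (assumed data): compare Rasch item difficulties between modes
library(TAM)

# resp: data frame of scored item responses (0/1); mode: "PBA" or "CBA" per student
fit_pba <- TAM::tam.mml(resp[mode == "PBA", ])
fit_cba <- TAM::tam.mml(resp[mode == "CBA", ])

# Item difficulty estimates (xsi), centered to put both modes on a comparable scale
d_pba <- fit_pba$xsi$xsi - mean(fit_pba$xsi$xsi)
d_cba <- fit_cba$xsi$xsi - mean(fit_cba$xsi$xsi)

# Item-wise difficulty shift: a roughly constant shift would suggest a homogeneous
# mode effect, while large item-specific deviations point to differential item
# functioning that might be related to response format or navigation requirements
round(d_cba - d_pba, 2)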
Method
Expected Outcomes
References
American Educational Research Association (AERA), American Psychological Association (APA), & National Council on Measurement in Education (NCME). (2014). Standards for Educational and Psychological Testing. Washington, DC: AERA, APA, NCME.
Blossfeld, H.-P., Roßbach, H.-G., & von Maurice, J. (Eds.) (2011). Education as a Lifelong Process – The German National Educational Panel Study (NEPS) [Special Issue]. Zeitschrift für Erziehungswissenschaft, 14.
Buerger, S., Kroehne, U., & Goldhammer, F. (2016). The Transition to Computer-Based Testing in Large-Scale Assessments: Investigating (Partial) Measurement Invariance between Modes. Psychological Test and Assessment Modeling, 58(4), 487–606.
Gehrer, K., Zimmermann, S., Artelt, C., & Weinert, S. (2013). NEPS framework for assessing reading competence and results from an adult pilot study. Journal for Educational Research Online, 5(2), 50–79.
Heerwegh, D., & Loosveldt, G. (2002). An Evaluation of the Effect of Response Formats on Data Quality in Web Surveys. Social Science Computer Review, 20(4), 471–484.
Huff, K. L., & Sireci, S. G. (2001). Validity issues in computer-based testing. Educational Measurement: Issues and Practice, 20(3), 16–25.
International Test Commission (ITC). (2005). International Guidelines on Computer-Based and Internet Delivered Testing. Retrieved from https://www.intestcom.org/files/guideline_computer_based_testing.pdf
Kiefer, T., Robitzsch, A., & Wu, M. (2015). TAM: Test analysis modules (R package version 1.15-0).
Muthén, L. K., & Muthén, B. O. (1998–2015). Mplus User's Guide (7th ed.). Los Angeles, CA: Muthén & Muthén.
Parshall, C. G., Spray, J. A., Kalohn, J. C., & Davey, T. (2002). Practical considerations in computer-based testing. New York: Springer.
Penfield, R. D., & Camilli, G. (2007). Differential item functioning and item bias. In C. R. Rao & S. Sinharay (Eds.), Handbook of Statistics: Vol. 26. Psychometrics (pp. 125–167). New York, NY: Elsevier.
Poggio, J., Glasnapp, D. R., Yang, X., & Poggio, A. J. (2005). A Comparative Evaluation of Score Results from Computerized and Paper & Pencil Mathematics Testing in a Large Scale State Assessment Program. The Journal of Technology, Learning, and Assessment, 3(6).
Pommerich, M. (2004). Developing Computerized Versions of Paper-and-Pencil Tests: Mode Effects for Passage-Based Tests. The Journal of Technology, Learning, and Assessment, 2(6).
R Core Team (2014). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Available from http://www.R-project.org
Wang, S., Jiao, H., Young, M. J., Brooks, T., & Olson, J. (2008). Comparability of Computer-Based and Paper-and-Pencil Testing in K-12 Reading Assessments: A Meta-Analysis of Testing Mode Effects. Educational and Psychological Measurement, 68(1).