Session Information
09 SES 03 C, Methodological Issues in Tests and Assessments
Paper Session
Contribution
Educational researchers and practitioners name several advantages of computer-based testing (CBT) over classic paper-pencil procedures: CBT is more standardized and more economical in test delivery, more flexible and efficient in test assembly, and faster and less error-prone in test scoring (Kröhne & Martens, 2011). CBT can implement innovative task formats (e.g., video sequences), capture additional diagnostic information (e.g., reaction times), and enable adaptive testing (e.g., Wang & Shin, 2010). However, researchers caution against so-called mode effects: differences in test delivery and setting between computerized and paper-pencil procedures might alter psychometric properties, influence the testing experience, or bias results against certain sub-populations (e.g., Kröhne & Martens, 2011).
The first studies on mode effects were conducted in the 1980s. Since then, computers have become an everyday, familiar tool for many people, and user interfaces have improved massively. These changes in usability and familiarity are reflected in the research results: several early studies on mode effects in intelligence and achievement testing reported poorer results for the CBT format, especially in speeded testing (e.g., Mead & Drasgow, 1993). Contemporary studies, however, no longer find significant mode effects in such tests (Wang et al., 2007; Wang et al., 2008; Poggio et al., 2005). Likewise, creativity tests yield the same results irrespective of administration mode (Lau & Cheung, 2010).
Moreover, it has been hypothesized that study participants might answer questions about personal or intimate issues more openly and with less concern for social desirability in a more “anonymous” CBT setting. However, no such mode effects were found in studies on personality tests (e.g., Bartram & Brown, 2004), mental health (Gwaltney, Shields & Shiffman, 2008), and other sensitive topics such as drug use or sexual behavior (Tourangeau & Yan, 2007). Recent results show that CBT is well received by study participants: they perceive computerized achievement tests to be easier (irrespective of the actual results; Park, 2003), are more engaged and motivated in a CBT setting (Goldberg, Russell & Cook, 2003), and generally express a more positive attitude towards CBT than towards classic paper-pencil procedures (Wang, Young & Brooks, 2004). It should be noted, however, that the participants in these three studies were students, who may be particularly open to new media.
In summary, meta-analyses on mode effects report no moderating effects of computer practice in achievement tests (Wang et al., 2007) and self-report questionnaires (Gwaltney, Shields & Shiffman, 2008). Nevertheless, some authors stress that computer- and paper-based tests can never be truly equivalent (Noyes & Garland, 2008). In line with this argument, it seems important to compare CBT and paper-pencil data for testing tools that are not just questionnaires or knowledge tests but measure realistic, action-oriented skills. In our studies, we measure the ability of (pre-service) teachers and teacher-education students to recognize, correct, and handle student errors adequately in the domain of accounting and bookkeeping. For this purpose, we use sample student assignments containing typical student mistakes that have to be corrected. In a previous study (Türling, Seifried & Wuttke, 2012), this was done with a paper-pencil test. A current research project aims to use CBT because of the advantages described above, such as embedded video clips. Since correcting student assignments is a task that is typically accomplished on paper, the present study compares results from the original paper-pencil test with a slightly adapted CBT version.
Method
Expected Outcomes
References
Bartram, D. & Brown, A. (2004). Online testing: Mode of administration and the stability of OPQ 32i scores. International Journal of Selection and Assessment, 12(3), 278-284.
Goldberg, A., Russell, M. & Cook, A. (2003). The effect of computers on student writing: A meta-analysis of studies from 1992 to 2002. Journal of Technology, Learning, and Assessment, 2(1).
Gwaltney, C. J., Shields, S. L. & Shiffman, S. (2008). Equivalence of electronic and paper-and-pencil administration of patient-reported outcome measures: A meta-analytic review. Value in Health, 11(2), 322-333.
Kröhne, U. & Martens, T. (2011). Computer-based competence tests in the National Educational Panel Study: The challenge of mode effects. Zeitschrift für Erziehungswissenschaft, 14, 169-186.
Lau, S. & Cheung, P. C. (2010). Creativity assessment: Comparability of the electronic and paper-and-pencil versions of the Wallach-Kogan Creativity Tests. Thinking Skills and Creativity, 5, 101-107.
Mead, A. D. & Drasgow, F. (1993). Equivalence of computerized and paper-and-pencil cognitive ability tests: A meta-analysis. Psychological Bulletin, 114(3), 449-458.
Noyes, J. M. & Garland, K. J. (2008). Computer- vs. paper-based tasks: Are they equivalent? Ergonomics, 51(9), 1352-1375.
Park, J. (2003). A test-taker’s perspective. Education Week, 22(35), 15.
Poggio, J., Glasnapp, D. R., Yang, X. & Poggio, A. J. (2005). A comparative evaluation of score results from computerized and paper-and-pencil mathematics testing in a large scale state assessment program. Journal of Technology, Learning, and Assessment, 3(6).
Richter, T., Naumann, J. & Horz, H. (2010). Eine revidierte Fassung des Inventars zur Computerbildung (INCOBI-R). Zeitschrift für Pädagogische Psychologie, 24(1), 23-37.
Tourangeau, R. & Yan, T. (2007). Sensitive questions in surveys. Psychological Bulletin, 133(5), 859-883.
Türling, J. M., Seifried, J. & Wuttke, E. (2012). Teachers’ knowledge about domain-specific student errors. In E. Wuttke & J. Seifried (Eds.), Learning from Errors at School and at Work (pp. 95-110). Opladen & Farmington Hills: Barbara Budrich.
Wang, H. & Shin, C. D. (2010). Comparability of computerized adaptive and paper-pencil tests. Test, Measurement and Research Services Bulletin, 13, 1-7.
Wang, S., Jiao, H., Young, M. J., Brooks, T. & Olson, J. (2007). A meta-analysis of testing mode effects in grade K-12 mathematics tests. Educational and Psychological Measurement, 67, 219-238.
Wang, S., Jiao, H., Young, M. J., Brooks, T. & Olson, J. (2008). Comparability of computer-based and paper-and-pencil testing in K-12 reading assessment: A meta-analysis of testing mode effects. Educational and Psychological Measurement, 68(1), 5-24.
Wang, S., Young, M. J. & Brooks, T. E. (2004). Administration mode comparability study for Stanford Diagnostic Reading and Mathematics Tests. San Antonio, TX: Harcourt Assessment.