Innovative Items: Comparison of the Reliability of Scoring Methods of Multiple-Response Items, Matching Items, and Sequencing Items
Author(s):
Conference:
ECER 2010
Format:
Paper

Session Information

09 SES 11 C, Issues in Computer-Based Assessment

Paper Session

Time:
2010-08-27
14:45-16:15
Room:
P617, Porthania
Chair:
Wilfried Bos

Contribution

Multiple-response items, sequencing items, and matching items are three innovative item types that allow polytomous scoring and therefore make it possible to measure partial knowledge. In the present study, different scoring methods for these three item types were compared. On the assumption that different response patterns to these item types represent different knowledge levels, these knowledge levels were described. Features of different scoring methods were studied in order to select the scoring methods included in this study. Subsequently, a probability distribution of scoring results was derived and computed for each knowledge level. Based on classical test theory, a measure of the reliability of the different scoring methods at the level of a single item was derived. To compare the selected scoring methods, reliabilities were computed for several distributions of knowledge levels in a population. For a multiple-response item on which an examinee must select all the right options, the dichotomous scoring method resulted in higher reliabilities than scoring the response patterns polytomously. For matching items, and for multiple-response items on which an examinee is asked to select fewer options than the total number of right options given, polytomous scoring methods gave higher reliabilities than the dichotomous scoring method. Simple polytomous scoring, by counting the selected right options or relations, is recommended over more complex polytomous scoring methods, for instance those using a correction for wrong answers or a so-called ‘floor’. The results for scoring sequencing items were not as conclusive as those for the other two innovative item types explored.
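
To make the distinction between the scoring methods concrete, the following is a minimal sketch of three ways a single multiple-response item might be scored. The function names, the "right minus wrong" form of the correction, and the floor value of 0 are illustrative assumptions; the abstract names these methods but does not specify their exact definitions.

```python
# Hypothetical scoring rules for one multiple-response item.
# `selected` is the set of options an examinee chose; `key` is the set of right options.

def score_dichotomous(selected, key):
    """Return 1 only if the selected options match the key exactly, else 0."""
    return 1 if set(selected) == set(key) else 0


def score_count_right(selected, key):
    """Simple polytomous scoring: count the selected options that are right."""
    return len(set(selected) & set(key))


def score_corrected(selected, key, floor=0):
    """Polytomous scoring with a correction for wrong answers.

    Assumption: the correction subtracts one point per wrong option selected,
    and the score is never allowed to drop below `floor`.
    """
    selected, key = set(selected), set(key)
    right = len(selected & key)
    wrong = len(selected - key)
    return max(floor, right - wrong)


key = {"A", "C", "D"}
response = {"A", "C", "E"}  # two right options and one wrong option

print(score_dichotomous(response, key))
print(score_count_right(response, key))
print(score_corrected(response, key))
```

For this response pattern, dichotomous scoring gives no credit, while the two polytomous rules award partial credit for the two right options, with the corrected rule deducting for the wrong one.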

Method

Scoring rules were developed for automatically scored items in computer-based testing, together with a criterion for comparing them. Results were obtained through simulation studies.
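
As an illustration of what such a criterion could look like, the sketch below computes a single-item reliability in the classical test theory sense, as the ratio of true-score variance to observed-score variance. It assumes that the true score of a knowledge level is the expected item score at that level; the function name and input layout are hypothetical, and the study's own measure may be defined differently.

```python
def item_reliability(level_probs, score_dists):
    """Ratio of true-score variance to observed-score variance for one item.

    level_probs: list with the population proportion of each knowledge level.
    score_dists: list of dicts, one per level, mapping score -> P(score | level).
    Assumption: the true score of a level is its expected item score.
    """
    # Expected score per knowledge level (taken here as the true score).
    means = [sum(s * p for s, p in dist.items()) for dist in score_dists]
    grand_mean = sum(w * m for w, m in zip(level_probs, means))

    # Between-level variance: variance of the true scores in the population.
    true_var = sum(w * (m - grand_mean) ** 2 for w, m in zip(level_probs, means))

    # Within-level variance: error variance averaged over the population.
    error_var = sum(
        w * sum(p * (s - m) ** 2 for s, p in dist.items())
        for w, m, dist in zip(level_probs, means, score_dists)
    )

    observed_var = true_var + error_var
    return true_var / observed_var if observed_var > 0 else 0.0


# Two equally common knowledge levels on a dichotomously scored item:
# the lower level guesses right with probability 0.25, the higher level always scores 1.
levels = [0.5, 0.5]
dists = [{0: 0.75, 1: 0.25}, {1: 1.0}]
print(round(item_reliability(levels, dists), 3))
```

Repeating the computation for each scoring rule and for several choices of `level_probs` gives the kind of comparison across population distributions that the abstract describes.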

Expected Outcomes

Advice on optimal scoring rules for multiple-response items, matching items, and sequencing items.

References

Parshall, C.G., Davey, T., & Pashley, P.J. (2000). Innovative item types for computerized testing. In W.J. van der Linden & C.A.W. Glas (Eds.), Computerized Adaptive Testing: Theory and Practice (pp. 129-148). Dordrecht: Kluwer Academic Publishers.
Scalise, K., & Gifford, B.R. (2006). Computer-Based Assessment in E-Learning: A Framework for Constructing "Intermediate Constraint" Questions and Tasks for Technology Platforms. Journal of Technology, Learning, and Assessment, 4(6).

Author Information

Cito/ University of Twente
Arnhem
Cito
Arnhem
