Reliability of a Competence Assessment Program
Conference:
ECER 2010
Format:
Paper

Session Information

09 SES 03 B, Assessment: Methods and Applications II

Paper Session

Time:
2010-08-25
14:00-15:30
Room:
P673, Porthania
Chair:
Jaan Mikk

Contribution

One of the current trends in education is the shift towards more competence-based education (Baartman, Bastiaens, Kirschner & Van der Vleuten, 2007). In the Netherlands, for example, the ministry of education decided that all vocational education institutes must formulate their curricula according to the principles of competence-based education, which has led to concomitant changes in learning outcomes. Whereas students used to be taught knowledge and skills separately, they now acquire competences in which knowledge, skills, and attitudes are integrated. One of the implications of this change in educational emphasis is an increased use of competence assessments such as performance assessments, situational judgement tests, and portfolio assessments (Baartman, Bastiaens, Kirschner & Van der Vleuten, 2006). These competence assessments are often combined into a competence assessment program (CAP) that is aligned with the competence-based curriculum. Within a context of competence-based education, CAPs are used for high-stakes decisions about students; it is therefore important that the quality of these CAPs is ensured.

 

An important aspect of the quality of assessments in general is reliability, and for CAPs, too, it is important to quantify the precision of scores. Like test validity, reliability must be interpreted relative to particular testing purposes and contexts (Haertel, 2006). Within the context of a CAP, some elements need to be considered when estimating reliability. First, a CAP results in one decision about a student; the accuracy of the exact scores on the elements of the CAP is therefore less important than the accuracy of the decision made about the student (Nitko & Brookhart, 2007). Furthermore, when continuous scores are interpreted with respect to one or more cut scores, conventional indices of reliability may not be appropriate. Since the score interpretation of a CAP is usually standards-based and makes use of cut scores to classify students into a series of performance levels, it may be more useful to derive probabilities of misclassification from standard errors (Haertel, 2006).
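
As a rough illustration of deriving a misclassification probability from a standard error, the sketch below assumes normally distributed measurement error around a known true score. The function name, the normality assumption, and the example values are hypothetical; they are not taken from Haertel (2006) or from the CAP studied here.

```python
from statistics import NormalDist


def misclassification_probability(true_score, cut_score, sem):
    """Probability that an observed score falls on the other side of the
    cut score than the true score does.

    Assumes (for illustration only) that the observed score is normally
    distributed around the true score with standard deviation `sem`,
    the standard error of measurement.
    """
    if sem <= 0:
        raise ValueError("sem must be positive")
    observed = NormalDist(mu=true_score, sigma=sem)
    p_observed_below_cut = observed.cdf(cut_score)
    if true_score >= cut_score:
        # A true pass is misclassified when the observed score is below the cut.
        return p_observed_below_cut
    # A true fail is misclassified when the observed score reaches the cut.
    return 1.0 - p_observed_below_cut


# Hypothetical values: true score 6.5, cut score 5.5, standard error 0.5.
p = misclassification_probability(6.5, 5.5, 0.5)
```

Under these assumptions the probability shrinks as the true score moves away from the cut score and grows with the standard error, which is why students near a cut score dominate the misclassification rate.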

 

In order to evaluate the reliability of CAPs, we propose to estimate the percentage of misclassification of the decision that is made. Verstralen (2009a; 2009b) showed in his reports how this can be done for a combination of tests. In this method, the classification accuracy is estimated with regard to the cut scores and decision rules that are used to aggregate the results of different tests (Van Rijn, Béguin & Verstralen, 2009). In this study we will apply this method to the results of an actual CAP to investigate whether the estimated percentage of misclassification can be used to evaluate the reliability of a competence-based exam program.

Method

In this study the percentage of misclassification is estimated for a competence-based education program. The program is situated in Dutch vocational education at an intermediate level. It consists of four years of education, within which 22 key subjects must be completed. Every key subject consists of several exams; in total, 72 exams are administered during the four years of education. For this study, data are available for a full cohort (class of 2008), and the reliability analyses will be performed on these data. First, it will be studied whether the method of Verstralen (2009a; 2009b) can be applied to this CAP; if necessary, the method will be adapted slightly. Second, the percentage of misclassification is estimated for every key subject. Finally, the percentage of misclassification is estimated for the combination of the 22 key subjects.
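
The general idea of estimating a percentage of misclassification for a combination of exams can be illustrated with a small simulation. The sketch below is not the method of Verstralen; it assumes known true scores, independent normal measurement error with a common standard error, and a hypothetical conjunctive decision rule (a student passes only if the mean exam score of every key subject reaches the cut score). All names and values are illustrative assumptions.

```python
import random


def passes(scores, cut_score):
    """Hypothetical decision rule: pass only if the mean exam score
    of every key subject is at or above the cut score."""
    return all(sum(exams) / len(exams) >= cut_score for exams in scores)


def simulate_misclassification(students, sem, cut_score,
                               n_replications=2000, seed=0):
    """Estimate the proportion of simulated decisions that differ from
    the decision based on true scores.

    `students` is a list of students; each student is a list of key
    subjects; each key subject is a list of true exam scores.
    """
    rng = random.Random(seed)
    misclassified = 0
    total = 0
    for student in students:
        true_decision = passes(student, cut_score)
        for _ in range(n_replications):
            # Add independent normal error to every exam score.
            observed = [[rng.gauss(t, sem) for t in exams]
                        for exams in student]
            if passes(observed, cut_score) != true_decision:
                misclassified += 1
            total += 1
    return misclassified / total


# Hypothetical data: two students, two key subjects, two exams each.
students = [
    [[7.0, 7.5], [8.0, 6.5]],
    [[5.5, 6.0], [6.0, 5.0]],
]
rate = simulate_misclassification(students, sem=0.5, cut_score=5.5)
```

Multiplying `rate` by 100 gives a percentage of misclassification under the stated assumptions. A different aggregation rule, for example one that allows compensation between key subjects, can be explored by replacing `passes`.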

Expected Outcomes

This study will result in a paper on evaluating the reliability of CAPs by estimating the percentage of misclassification. The paper will report the final results of our study. We expect to find that the reliability of the decision resulting from a CAP can be expressed as a percentage of misclassification, and that this percentage can be used to evaluate the quality of a CAP or to improve the CAP as a whole or its individual assessments. We also expect that it may be possible to determine an optimal combination of exams in a CAP that reduces the percentage of misclassification. This optimal combination should fit within a set of content-driven constraints.

References

Baartman, L. K., Bastiaens, T. J., Kirschner, P. A., & Van der Vleuten, C. P. (2006). The wheel of competency assessment: Presenting quality criteria for competency assessment programs. Studies in Educational Evaluation, 32, 153-170.
Baartman, L. K., Bastiaens, T. J., Kirschner, P. A., & Van der Vleuten, C. P. (2007). Evaluating assessment quality in competence-based education: A qualitative comparison of two frameworks. Educational Research Review, 114-129.
Haertel, E. H. (2006). Reliability. In R. L. Brennan (Ed.), Educational Measurement (4th ed., pp. 65-110). Westport: American Council on Education and Praeger Publishers.
Nitko, A. J., & Brookhart, S. M. (2007). Educational Assessment of Students (5th ed.). Upper Saddle River, NJ: Pearson Education.
Van Rijn, P., Béguin, A., & Verstralen, H. (2009). Failing or Passing? Measurement Precision of Examinations in Secondary Education. Pedagogische Studiën, 86, 185-195.
Verstralen, H. (2009a). Accuracy of Exams: CTT and IRT Compared. Internal report, Cito.
Verstralen, H. (2009b). Quality of Certification Decisions. Internal report, Cito.

Author Information

Cito
Arnhem
Cito, The Netherlands
RCEC, The Netherlands
University of Twente, The Netherlands
