Session Information
09 SES 14 B, Developing and Scrutinizing Tests in the Domains of Finance, Accountancy and Economics
Paper Session
Contribution
Objectives
Austria is one of the OECD countries with the largest share of students in vocationally oriented upper secondary education (OECD, 2012). These full-time vocational schools aim to help students attain A-levels in commercial areas, especially in Accountancy. However, the few existing psychometric tests for assessing students’ competence in the domain of Accountancy were designed for use in the German dual training system (Winther, 2010) and are thus not appropriate for diagnosing Austrian students’ competence at upper secondary school. Given this lack of appropriate tests, the so-called WBB (Helm, 2015) was constructed for assessing students’ academic achievement in grades 9 through 11 (using vertical scaling methods; Kolen & Brennan, 2004). It is based on the Evidence-Centred Assessment Design framework (ECD; Mislevy & Riconscente, 2006) and on Item Response Theory (IRT; e.g. Embretson & Reise, 2000). However, like all paper-and-pencil tests, the WBB has several disadvantages, such as the need for manual scoring and manual estimation of students’ ability (theta) and a fixed test length for all students. Thus, a computer-based adaptive test (CAT) was developed in cooperation with researchers from the Institute for Informatics at JKU Linz. A first (German-language) beta version of the CAT is available at http://adaptivetesting.ce.jku.at/. The main objective of the presented paper is to show how this CAT was developed by referring to essential fundamentals from content-related didactics (domain and instructional analysis), psychometrics (IRT and CAT background) and informatics (Java implementation). Furthermore, we present findings from a validation study in order to demonstrate satisfactory reliability and validity of the CAT version of the WBB.
Conceptual framework
The theoretical framework for constructing the computer-adaptive test in Accountancy is twofold:
First, a test was constructed on substantive content-related grounds. Starting with layer 1 of the ECD framework, an analysis of the domain “Accountancy” was conducted with respect to its appearance in students’ school life (e.g. analysis of textbooks and teacher interviews). Moreover, in accordance with the Austrian educational standards for vocational training (www.berufsbildungsstandards.at), the latent construct of interest was defined: it represents the ability to apply the system of double-entry bookkeeping and to make use of the central laws of bookkeeping (Grohmann-Steiger et al., 2008). Subsequently, in line with layer 2 of the ECD approach (Domain Modeling), an assessment argument was formulated that backs our claims about students’ abilities in Accountancy. These arguments (warrants) link students’ potential responses (evidence) to our inferences about their competence. An example: Tobias has answered a number of bookkeeping problems that call for a variety of operations, such as building booking records, calculating taxes and estimating a business transaction’s impact on the profit and loss of a company. We posit that if a student is able to carry out these operations, he/she has mastered the basic concept of bookkeeping. This is the warrant, and the backing comes from both classroom experience and research such as that of Helm (2014). Layer 3, the “Conceptual Assessment Framework”, refers to the operational implementation of the assessment. It asks how to collect and evaluate data that inform the target inference about students’ ability, and which underlying statistical model is used.
Second, against the background of IRT, the underlying statistical model is the unidimensional dichotomous Rasch model. All test items were coded dichotomously, and students’ response patterns are modeled with the one-parameter logistic (1PL) model. However, in the framework of CAT (see below), the two-parameter logistic (2PL) model is used in addition in order to improve measurement efficiency. Thus, the item bank needed for the subsequent CAT version of the WBB contains item parameters for both the 1PL model (item difficulty only) and the 2PL model (item difficulty and discrimination).
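The relation between the two models and their role in adaptive item selection can be illustrated with a minimal Python sketch. It is not the Java implementation of the WBB-CAT. The item bank values and the function names are hypothetical, and the maximum-information selection rule is an assumption chosen for illustration, since the abstract does not state which selection rule the CAT uses.

```python
import math


def prob_correct(theta, b, a=1.0):
    """2PL item response function: P(correct | theta).

    theta: person ability, b: item difficulty, a: item discrimination.
    With a = 1 this reduces to the Rasch (1PL) model.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, b, a=1.0):
    """Fisher information of a dichotomous item at theta: a^2 * P * (1 - P)."""
    p = prob_correct(theta, b, a)
    return a * a * p * (1.0 - p)


def most_informative_item(theta, item_bank, administered=()):
    """Return the id of the unused item with maximum information at theta.

    Assumption: maximum-information selection, a common CAT rule.
    """
    candidates = [item for item in item_bank if item["id"] not in administered]
    best = max(candidates, key=lambda item: item_information(theta, item["b"], item["a"]))
    return best["id"]


# Hypothetical item bank holding difficulty (b) and discrimination (a) per item.
ITEM_BANK = [
    {"id": "item_1", "b": -1.0, "a": 0.8},
    {"id": "item_2", "b": 0.0, "a": 1.5},
    {"id": "item_3", "b": 1.2, "a": 1.1},
]

if __name__ == "__main__":
    current_theta = 0.0  # provisional ability estimate
    print(most_informative_item(current_theta, ITEM_BANK))
```

Under the 1PL model every item contributes at most 0.25 units of information, reached where theta equals the item difficulty. Under the 2PL model this maximum scales with the squared discrimination, which is one way to see why discrimination parameters can make item selection more efficient.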
Method
Expected Outcomes
References
Grohmann-Steiger, C., Schneider, W., & Eberhartinger, E. (2008). Einführung in die Buchhaltung im Selbststudium. Band I. Wien: Facultas.
Helm, C. (2014). Lernen in Offenen und Traditionellen UnterrichtsSettings (LOTUS). Empirische Analysen zur Kompetenzentwicklung im Fach Rechnungswesen sowie zu förderlichen Elementen kooperativen, offenen Lernens an berufsbildenden mittleren und höheren Schulen in Österreich. Unv. Dissertation. Institut für Pädagogik und Psychologie. Johannes Kepler Universität, Linz.
Helm, C. (2015). Berufsbildungsstandards und Kompetenzmodellierung im Fach Rechnungswesen. In: Bundesinstitut für Berufsbildung (Eds.), Bildungsstandards und Kompetenzorientierung. Herausforderungen und Perspektiven der Bildungs- und Berufsbildungsforschung (pp. xx-XX). xx: XX.
Helm, C., Trost, S., George, A. C., & Pocrnja, M. (2015). Potentiale kognitiver Diagnosemodelle für den berufsbildenden Unterricht. In: Stock, M., Schlögl, P., Schmid, K., & Moser, D. (Hrsg.), Kompetent – wofür? Life-Skills – Beruflichkeit – Persönlichkeitsbildung (pp. 206-224). Innsbruck: StudienVerlag.
Kolen, M. J., & Brennan, R. L. (2004). Test equating, scaling, and linking: Methods and practices. New York: Springer-Verlag.
Magis, D., & Raiche, G. (2011). catR: an R package to generate IRT adaptive tests. R package version 2.1.
Mislevy, R. J., & Riconscente, M. M. (2006). Evidence-centered assessment design. In S. M. Downing & T. M. Haladyna (Eds.), Handbook of test development (pp. 61-90). Mahwah, NJ: Erlbaum.
OECD (2012). Education at a Glance 2012: OECD Indicators. OECD Publishing. http://dx.doi.org/10.1787/eag-2012-en
R Core Team. (2014). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from http://www.R-project.org.
Wainer, H. (2000). Computerized Adaptive Testing (A Primer). New Jersey: Lawrence Erlbaum Associates.
Weeks, J. P. (2010). plink: An R Package for Linking Mixed-Format Tests Using IRT-Based Methods. Journal of Statistical Software, 35(12), 1-33. URL http://www.jstatsoft.org/v35/i12/.
Winther, E. (2010). Kompetenzmessung in der beruflichen Bildung. Bielefeld: Bertelsmann.