Investigation of Aberrant Response Patterns across Subgroups Using Person-Response Curves
Author(s):
Conference:
ECER 2010
Format:
Paper

Session Information

09 SES 04 B, Assessment: Methods and Applications III

Paper Session

Time:
2010-08-25
16:00-17:30
Room:
P673, Porthania
Chair:
Theo Eggen

Contribution

The purpose of the present study is to investigate the distribution of aberrant response patterns across gender, school type, and graduation status (e.g., in the last grade of high school, recently graduated) using person-response curves (PRCs).

In multiple-choice tests, unexpected examinee behavior is considered a threat to validity. Such response behavior – also called aberrant responding – may stem from cheating, guessing, fatigue, or carelessness, among other causes. For example, if a low-ability examinee answers the hardest items in a test correctly, this can be a trigger for suspicion of guessing or cheating. Another examinee may give wrong answers to the easiest items in a test, which may indicate carelessness.

One method for investigating individual aberrancy is the person-response curve (PRC), proposed by Trabin and Weiss (1983). The PRC is an Item Response Theory (IRT) based methodology that relates the probability of answering a group of items correctly to the items' difficulty. To construct a PRC, items are ordered by their difficulty levels and grouped into sets called “strata”. The observed PRC is obtained by calculating the proportion of correct answers in each stratum, while the expected PRC, derived from the IRT model, provides a norm against which the observed curve can be compared. A chi-square statistic is then used to test the discrepancy between the observed and expected PRCs. Although many indices exist for detecting aberrant response patterns, the principal advantage of PRCs is that they offer a visual way to interpret aberrancy.
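
As a rough illustration (not the authors' code), the observed-versus-expected PRC comparison can be sketched as follows. All item parameters, the stratum layout, and the examinee's response vector are made-up examples, and a Rasch (1PL) model is used for the expected curve for brevity, whereas the study itself used a 3PL model:

```python
import numpy as np

def prc_discrepancy(responses, difficulties, theta, n_strata):
    """Compare an examinee's observed and expected person-response curves.

    responses:    0/1 item scores for one examinee
    difficulties: IRT difficulty (b) parameter per item
    theta:        the examinee's estimated ability
    Returns the observed curve, the expected curve, and a
    chi-square-type discrepancy statistic across strata.
    """
    order = np.argsort(difficulties)          # items ordered easiest -> hardest
    strata = np.array_split(order, n_strata)  # groups of items ("strata")
    observed, expected = [], []
    for s in strata:
        # Observed PRC point: proportion correct within the stratum.
        observed.append(responses[s].mean())
        # Expected PRC point: mean model probability of success (1PL here).
        p = 1.0 / (1.0 + np.exp(-(theta - difficulties[s])))
        expected.append(p.mean())
    observed, expected = np.array(observed), np.array(expected)
    n = len(strata[0])
    # Chi-square-type discrepancy between observed and expected curves.
    chi_sq = np.sum(n * (observed - expected) ** 2 / (expected * (1 - expected)))
    return observed, expected, chi_sq

# Hypothetical examinee: roughly average ability, but unexpectedly correct on
# the hardest stratum -- the kind of pattern flagged as possible guessing.
rng = np.random.default_rng(0)
b = np.linspace(-2, 2, 60)                  # 60 items, easy to hard
resp = (rng.random(60) < 0.5).astype(int)
resp[-6:] = 1                               # all correct on the hardest 6 items
obs, exp, chi = prc_discrepancy(resp, b, theta=0.0, n_strata=10)
```

Plotting `obs` and `exp` against stratum index would give the visual comparison the text describes: the expected curve declines toward the harder strata, while this examinee's observed curve jumps back up at the hardest stratum.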

Comparative investigation of aberrant response patterns across subgroups may yield significant information about differences in examinees' individual behavior in testing environments.

Method

In the present study, the distribution of aberrant response patterns across subgroups was investigated by applying PRCs to a real data set. The data come from a high-stakes test in Turkey whose results are used for placement in higher education programs. The test includes science, mathematics, social science, and Turkish items, and examinees respond according to their program preferences. A sample of examinees who answered the quantitative part, comprising the science and mathematics items, was used to investigate aberrancy. A total of 60 items was grouped into 10 strata of 6 items each. Item parameters were estimated with the three-parameter logistic (3PL) IRT model. PRCs were obtained for a sample of approximately 50,000 examinees, and a chi-square test was conducted to identify examinees showing aberrant response patterns. Follow-up analyses were then conducted to reveal the distribution of aberrant examinees across gender, school type, and graduation status.
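
To make the design concrete, a minimal sketch of the 3PL model and the 60-item, 10-stratum layout is given below; all parameter values are invented for illustration and are not the study's estimates:

```python
import numpy as np

def p_3pl(theta, a, b, c):
    """Three-parameter logistic (3PL) IRT model: probability of a correct
    answer given ability theta, discrimination a, difficulty b, and
    pseudo-guessing c."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

# Made-up parameters for 60 items, grouped into 10 strata of 6 items each
# by ascending difficulty, mirroring the study's design.
rng = np.random.default_rng(1)
a = rng.uniform(0.8, 2.0, 60)           # discrimination
b = np.sort(rng.normal(0.0, 1.0, 60))   # difficulty, ordered easy -> hard
c = rng.uniform(0.1, 0.25, 60)          # pseudo-guessing (lower asymptote)

strata = np.array_split(np.argsort(b), 10)
# Expected PRC for an average examinee (theta = 0): mean 3PL probability per
# stratum. It should decline from the easiest to the hardest stratum.
expected_prc = np.array([p_3pl(0.0, a[s], b[s], c[s]).mean() for s in strata])
```

Note that under the 3PL model the expected curve flattens toward the guessing level `c` at the hardest strata, which is why unexpectedly high observed proportions there stand out.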

Expected Outcomes

Results of the present study indicated several distinct types of aberrant response patterns. One of the most common patterns is characterized by item selection between the science and mathematics items: while examinees answer many items in one subtest, including the hardest ones, they hardly give correct answers even to the easiest items in the other subtest. In another distinct pattern, examinees followed a regular pattern over most of the test, but the proportion correct for the hardest items showed a significant increase. Additional analyses showed significant differences across school types and graduation status, but no significant difference across gender. There can be several reasons for aberrant response patterns. At school, examinees may take some courses more intensively than others and therefore focus on particular items. Alternatively, low-ability examinees may guess on the hardest items, which they are unable to solve. PRCs can be helpful for identifying examinees' unexpected behavior on tests. They can also serve as an indicator of educational effectiveness, for example by identifying examinees who answer only part of the items.

References

Nering, M. L., & Meijer, R. R. (1998). A comparison of the person response function and the lz person-fit statistic. Applied Psychological Measurement, 22, 53-69.

Meijer, R. R., & Sijtsma, K. (2001). Methodology review: Evaluating person fit. Applied Psychological Measurement, 25, 107-135.

Meijer, R. R., & Sijtsma, K. (1995). Detection of aberrant item score patterns: A review of recent developments. Applied Measurement in Education, 8, 261-272.

Trabin, T. E., & Weiss, D. J. (1983). The person response curve: Fit of individuals to item characteristic curve models. In D. J. Weiss (Ed.), New horizons in testing: Latent trait test theory and computerized adaptive testing (pp. 83-108). New York: Academic Press.

Author Information

Bilkent University
Computer Technology and Programming
Ankara
