31 SES 14 A, Enhancing Learners’ Reading and Writing Skills via Intervention and Assessment
Reading comprehension is defined as the ability to understand the meaning of a written word, sentence or text (Perfetti, Landi, & Oakhill, 2005). This involves processes at different hierarchical levels, i.e. at word, sentence and text level (e.g. Mullis & Martin, 2015). In Austria, as in many other European countries, the ability to successfully acquire reading comprehension skills greatly impacts a student's future school career (Breit, Bruneforth, & Schreiner, 2016). Because of this importance, reading instruction needs to address individual learners' needs. In order to provide every student with adequate support, it is necessary to assess students' reading abilities and thus their individual starting points for learning. Checking their learning progress regularly, however, is another pillar of high-quality inclusive reading instruction (Ready & Wright, 2011). Nevertheless, embedding the (repeated) assessment of students' reading skills into the teaching routine is often challenging for teachers in terms of time (especially when it comes to evaluating individual assessment performances) and organization. Digital assessments can support teachers in this process, especially in preparing, conducting, evaluating and documenting one-time (status diagnostics) or repeated (learning progress diagnostics) assessments (Cheung & Slavin, 2012).
This paper presents a digital assessment instrument that was developed and piloted. More specifically, the presentation covers the development phases as well as the piloting for reliability and validity measures. The instrument focuses on digitally assessing the reading comprehension skills of students in Grades 3 and 4.
This digital reading test covers three domains of reading comprehension, namely reading comprehension at word-, sentence- and text-level. Text-level consists of two subtests (text-level I and text-level II); the other levels consist of one subtest each.
At word-level, students are presented with six written words and need to match the three fitting words to three presented pictures. The pictures show nouns, verbs or adjectives, each relevant for this age group according to the childLex database (childLex – German Children's Book Corpus; Schroeder, Würzner, Heister, Geyken, & Kliegl, 2015).
At sentence-level, students are asked to pick the one out of four presented sentences that matches a picture. For each item at sentence-level, a semantic distractor, a phonological distractor and a combination of both are used.
At text-level I, students are presented with nonsense stories about non-existing things, animals or activities. Afterwards, they need to answer two questions on these stories – one that requires extracting a single piece of information from the story and one that requires drawing a conclusion. Each question has four answer options, only one of which is correct. We used nonsense stories to minimize the influence of general knowledge and to focus on reading comprehension.
At text-level II, students are presented with four texts that use the task format of the Maze procedure. In these texts, every seventh word is replaced by a drop-down field, and students need to identify the one out of three words that fits the text. Distractors are either syntactic/semantic or graphemic/phonological. The four texts have a similar difficulty index, calculated using an online tool (RATTE – Regensburger Analysetool für Texte; Wild & Pissarek, 2018), namely a gSMOG between 3 and 4 (Bamberger & Vanecek, 1984).
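The Maze construction rule described above – replacing every seventh word with a three-option drop-down – can be sketched as follows. This is a generic illustration, not the instrument's actual implementation; the distractor pool, placeholder string and item dictionary are assumptions introduced for the example.

```python
import random

def build_maze_items(text, distractor_pool, step=7, n_options=3, seed=0):
    """Replace every `step`-th word of `text` with a selection item:
    the original word plus randomly drawn distractors from a pool."""
    rng = random.Random(seed)
    words = text.split()
    items = []
    for i in range(step - 1, len(words), step):
        target = words[i]
        candidates = [w for w in distractor_pool if w != target]
        options = [target] + rng.sample(candidates, n_options - 1)
        rng.shuffle(options)
        items.append({"position": i, "answer": target, "options": options})
        words[i] = "____"  # placeholder where the drop-down field appears
    return " ".join(words), items
```

In practice the distractors would not be drawn from a single flat pool but chosen per gap as syntactic/semantic or graphemic/phonological foils, as the abstract describes.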
The present paper focuses on two studies. Study 1 covered the development of the digital reading test and a first item analysis (N = 273 students; Grade 3: N = 117; Grade 4: N = 156). Study 2 analysed the reliability and validity of the digital reading assessment (N = 550 students; Grade 3: N = 333; Grade 4: N = 217; including second language learners).
The paper consists of two studies, each presenting steps in the development and piloting of the digital reading test. The test versions used in the two studies differ in detail because the test was adapted based on the findings of Study 1; its basic structure, however, is the one described above.

In Study 1 (data collection: 10/2019-12/2019), we describe two development steps of the test's first versions. In Development Step 1, we used a version consisting of 134 items (word: 38, sentence: 24, text I: 16, text II: 56 Maze selections in four texts). We tested seven students in individual settings to gain information about (a) the duration of the subtests, (b) potential shortcomings of the items and (c) their digital representation. We asked the students to think aloud while solving the items and revised several items afterwards. In Development Step 2, we collected data in 13 classrooms (N = 273 students; Grade 3: N = 117; Grade 4: N = 156) to gain information on (a) item difficulty, (b) item discriminatory power and (c) time limits for future speed testing. Students worked on a subtest's items until 80% of the students in the classroom had finished that subtest. Because this version of the test was still time-consuming, we visited each classroom twice (word-level plus text-level I; sentence-level plus text-level II). After this step, the subtests were revised by excluding items that did not meet the criteria; in addition, the data gave insight into the appropriate duration of each subtest.

In Study 2 (data collection: 09/2020-10/2020), we administered the final test version to 550 students to gain information about the test's (a) validity (convergent and divergent) and (b) reliability (retest reliability and internal consistency). Convergent validity measures comprised teachers' assessment of the corresponding ability and students' performance on ELFE II (Lenhard, Lenhard, & Schneider, 2020).
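The item statistics gathered in Development Step 2 – difficulty as the proportion of correct responses and discriminatory power as the corrected item-total correlation – can be sketched like this. This is a generic illustration of the standard classical-test-theory statistics, not the project's actual analysis code:

```python
from statistics import mean, stdev

def item_difficulty(responses):
    """Proportion of correct answers per item; `responses` is a matrix of
    0/1 scores (rows = students, columns = items)."""
    n_items = len(responses[0])
    return [mean(row[j] for row in responses) for j in range(n_items)]

def item_discrimination(responses, j):
    """Corrected item-total correlation for item j: Pearson r between the
    item score and the total score of the remaining items."""
    x = [row[j] for row in responses]
    y = [sum(row) - row[j] for row in responses]  # total without item j
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))
```

Items with extreme difficulty values or low discrimination would be flagged for revision or exclusion, which matches the selection criteria the abstract mentions.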
This reading test also measures reading ability at word-, sentence- and text-level, though in a different manner. Divergent validity is calculated with reference to teachers' assessment of students' mathematical abilities; additionally, we used a newly developed digital mathematics speed test. Due to COVID-19 measures and a second lockdown in Austria in autumn 2020, we could not test all students twice with the digital reading test to determine retest reliability. However, we managed to collect retest data from 299 students. These data are currently being analysed and the results will be presented at the conference.
After Development Step 1, 19 items were changed (word: 9; sentence: 4; text I: 3; text II: 3) and 4 were excluded (word: 3; text I: 1). Changes ranged from replacing distractor words to more in-depth revisions like re-constructing stories. This led to a test version with 130 items (word: 35, sentence: 24, text I: 15, text II: 56). The digital presentation of the test was also adapted in response to students' feedback from the think-aloud sessions. In Development Step 2, several items did not meet the criteria for difficulty and discriminatory power and were excluded. Moreover, decisions on the subtests' duration for speed testing were made: the subtests word, sentence and text I were set to 3 minutes each; text II contained two texts, with a duration of 100 seconds decided for both texts. Thus, a total duration of less than fifteen minutes was reached.

Results of Study 2 will reveal more information about the test's reliability and validity. For reliability, correlations between two measurement points will be presented (n = 299); additionally, the subtests' internal consistency in Studies 1 and 2 will be discussed. In terms of validity, we assume a relatively high correlation of the subtests of the digital reading test with the respective subtests of the ELFE II (Lenhard, Lenhard, & Schneider, 2020) and with teachers' assessment of reading skills, and a lower correlation with teachers' assessment of mathematics skills and with the mathematics test. Results of second language learners' performance as well as an in-depth error analysis will be presented. The results will be discussed in the light of teachers' needs for standardized digital assessments to identify students who need support in reading. Next steps for the digital reading test will be discussed, too.
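The two reliability measures named above have standard formulations: retest reliability as the Pearson correlation between total scores at two measurement points, and internal consistency as Cronbach's alpha. A minimal sketch of both (again a generic illustration, not the project's analysis code):

```python
from statistics import mean, pvariance

def cronbach_alpha(responses):
    """Cronbach's alpha from a students x items score matrix."""
    k = len(responses[0])
    item_vars = [pvariance([row[j] for row in responses]) for j in range(k)]
    total_var = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def retest_correlation(t1, t2):
    """Pearson r between total scores at two measurement points."""
    m1, m2 = mean(t1), mean(t2)
    num = sum((a - m1) * (b - m2) for a, b in zip(t1, t2))
    den = (sum((a - m1) ** 2 for a in t1) *
           sum((b - m2) ** 2 for b in t2)) ** 0.5
    return num / den
```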
Bamberger, R., & Vanecek, E. (1984). Lesen-Verstehen-Lernen-Schreiben. Die Schwierigkeitsstufen von Texten in deutscher Sprache. Wien: Jugend und Volk.
Breit, S., Bruneforth, M., & Schreiner, C. (Eds.) (2016). Standardüberprüfung 2015 Deutsch/Lesen/Schreiben, 4. Schulstufe [Educational standards testing 2015 German/reading/writing, 4th grade]. Bundesergebnisbericht. Salzburg: BIFIE.
Cheung, A. C. K., & Slavin, R. E. (2012). How features of educational technology applications affect student reading outcomes: A meta-analysis. Educational Research Review, 7, 198-215.
Lenhard, A., Lenhard, W., & Schneider, W. (2020). ELFE II – Ein Leseverständnistest für Erst- bis Siebtklässler – Version II. Göttingen: Hogrefe.
Mullis, I. V. S., Martin, M. O., Kennedy, A. M., Trong, K. L., & Sainsbury, M. (2009). PIRLS 2011 assessment framework. Chestnut Hill, MA: TIMSS & PIRLS International Study Center, Boston College.
Perfetti, C. A., Landi, N., & Oakhill, J. (2005). The acquisition of reading comprehension skill. In M. J. Snowling & C. Hulme (Eds.), The science of reading: A handbook (pp. 227-247). Oxford: Blackwell.
Ready, D. D., & Wright, D. L. (2011). Accuracy and inaccuracy in teachers' perceptions of young children's cognitive abilities: The role of child background and classroom context. American Educational Research Journal, 48(2), 335-360. doi:10.3102/0002831210374874
Schroeder, S., Würzner, K. M., Heister, J., Geyken, A., & Kliegl, R. (2015). childLex: A lexical database of German read by children. Behavior Research Methods, 47(4), 1085-1094. https://doi.org/10.3758/s13428-014-0528-1
Wild, J., & Pissarek, M. (2018). Ratte. Regensburger Analysetool für Texte. Dokumentation. https://www.uni-regensburg.de/sprache-literatur-kultur/germanistik-did/medien/ratte_dokumentation.pdf