Session Information
12 SES 09 A, Paper Session - Research Data and Open Science
Paper Session
Contribution
Nowadays, many studies in empirical educational research assess competences in a technology-based way or collect data with the intent to inform about individual processes of learning and test-taking. Log data are one kind of data obtained in such studies. They comprise time-stamped interaction events, such as starting and ending a session, moving the mouse or clicking a radio button, which are recorded and stored in log files. Based on log data, a number of indicators reflecting individual behaviour (e.g., response times, number of interactions with a digital environment) can be derived that augment the analysis of decisions made during a computer-based assessment and potentially provide a deeper understanding of problem-solving strategies beyond test scores. For example, log data can provide information about test-taking effort (Wise & Kong, 2005), aberrant response behaviour (van der Linden & Guo, 2008), inter-individual differences in the speed-ability compromise (Goldhammer & Kroehne, 2014), or the time invested in a task (Goldhammer et al., 2014). Moreover, the evaluation and further development of the assessment tool in use can benefit from analysing log data.
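To illustrate how such behavioural indicators can be derived from time-stamped events, the following minimal Python sketch computes the time on task and the number of interactions for a single test taker on a single item; the event and field names are purely illustrative assumptions and do not correspond to any particular assessment platform.

```python
from datetime import datetime

# Hypothetical raw log events for one test taker on one item; the field names
# (person_id, element, event_type, timestamp) are illustrative assumptions.
events = [
    {"person_id": "P01", "element": "item_3", "event_type": "itemStart",
     "timestamp": "2024-03-01T10:00:02.150"},
    {"person_id": "P01", "element": "item_3", "event_type": "radioButtonClick",
     "timestamp": "2024-03-01T10:00:41.800"},
    {"person_id": "P01", "element": "item_3", "event_type": "itemEnd",
     "timestamp": "2024-03-01T10:00:43.020"},
]

def parse(ts: str) -> datetime:
    return datetime.fromisoformat(ts)

# Time on task: span between the first and the last event recorded for the item.
time_on_task = (parse(events[-1]["timestamp"]) - parse(events[0]["timestamp"])).total_seconds()

# Number of interactions: all events except the start/end markers.
n_interactions = sum(e["event_type"] not in ("itemStart", "itemEnd") for e in events)

print(time_on_task, n_interactions)  # 40.87 1
```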
Taken together, log data have the potential to address manifold substantive research questions concerning individual solution behaviour and related cognitive processes, and can be used to enhance the validation of score interpretations (Goldhammer & Zehner, 2017; Kane & Mislevy, 2017). From this perspective, log data can no longer be considered collateral data or collateral information; rather, they become an object of (educational) research in their own right. However, dealing with log data is still a new and challenging task for researchers and for the research data infrastructure in the educational sciences. Applying the FAIR principles, that is, making (log) data findable, accessible, interoperable and reusable, reveals a lack of standardisation in current practice. An early approach to the documentation of log data was realised for the PIAAC study by the OECD (see https://piaac-logdata.tba-hosting.de/). Nevertheless, this approach follows the specific needs of internal logics, available infrastructure and resources. Although an IEEE standard for event stream data exists (XES), it has not yet been adopted by the community for storing log data. The present contribution addresses this deficiency and reflects on challenges and solutions for archiving and providing log data for re-use in the context of educational research.
The understanding and reusability of log data heavily depend on the documentation of metadata and, even more importantly, on the availability of the assessment tools used, which is needed to interpret the interactions stored in the log data. Knowing how a button click can be interpreted requires knowledge about the structure of the assessment instrument and about the included tasks. In the case of PIAAC, the documentation of log data and the corresponding assessment instruments is limited by restrictions on the item content. Only released items are accessible without limitations and open for the interpretation of log events. The strong link between the documentation of the log data and the corresponding assessment tool has to be taken into account when log data are provided. This paper seeks to fill this gap by taking up approaches for documenting (1) the technical implementation of the assessment platform and the test procedure, (2) the items and the item structure, (3) the log events and (4) the paradata, in order to enhance the scientific use of log data.
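As a hedged illustration of the kind of information a secondary analyst would need to interpret a single event type, the following sketch outlines one possible documentation entry; all field names and values are assumptions for illustration and do not reproduce an existing documentation standard.

```python
# Hypothetical documentation entry for one event type; every field shown here
# is an illustrative assumption, not part of a published specification.
radio_button_click_doc = {
    "event_type": "radioButtonClick",
    "description": "Test taker selects a response option in a single-choice item.",
    "event_specific_data": {
        "button_id": "identifier of the selected response option",
        "position":  "screen coordinates (x, y) of the click in pixels",
    },
    "occurs_in": ["single-choice items"],   # hypothetical item types
}
```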
Method
The test procedure has to be described in detail, including a list of elements, the navigation between and within entities, the test assembly and the booklet definitions. Furthermore, the technical implementation of the assessment platform is of interest, requiring a list of events (describing all expected event types) and so-called event-specific data, which describe the specific structure of each event (Kroehne & Goldhammer, 2018). Depending on the degree of interactivity and complexity of the tasks, static hard copies such as screenshots are not sufficient to enable the interpretation of the log data. To deal with this problem, mock items can be used as an alternative: they allow interactions to be interpreted by replacing the sensitive content of a task with placeholders without touching the functionality of the interactive items.

As mentioned above, a comprehensive, standardised and accepted format for facilitating the archiving of log data in research data repositories is still missing. For this reason, the universal log format was developed. By transforming log data into a basic data structure, the universal log format separates log data into multiple rectangular datasets, mainly according to their event type. The person identifier, the timestamp and the name of the element are repeated in all resulting data tables to simplify further processing. To achieve a lossless conversion of existing log data into the universal log format and to preserve the internal structure, the datasets split by event type contain references to their parent, and an additional root dataset essentially contains an identifier for each event (and for each person). A rectangular dataset is created for each level of the hierarchy, and the internal relationships are stored as references to the superior dataset. In short, the nested data structure created by the event-specific data is mapped to rectangular datasets using techniques from relational databases (see the sketch below).

A further remaining challenge is the documentation of paradata. Paradata describe the conditions of administration, such as screen size or device type, that might be relevant for secondary analyses or for comparing data collections that use the identical instrument, and should therefore be included in the documentation.
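The following Python sketch illustrates, under simplifying assumptions, how nested event-specific data can be mapped to rectangular datasets with relational references: a root table with one row per event, one table per event type, and sub-tables for nested structures. The column names and splitting rules are our own illustrative choices and do not reproduce the universal log format specification.

```python
from collections import defaultdict

# Hypothetical nested log events; the mapping below only sketches the idea of
# splitting events into rectangular tables and is not the actual specification.
events = [
    {"person_id": "P01", "timestamp": "2024-03-01T10:00:02.150",
     "element": "item_3", "type": "itemStart", "data": {}},
    {"person_id": "P01", "timestamp": "2024-03-01T10:00:41.800",
     "element": "item_3", "type": "radioButtonClick",
     "data": {"button_id": "opt_b", "position": {"x": 412, "y": 388}}},
]

def flatten(event_id, common, data, table_name, tables, parent=None):
    """Map one (possibly nested) event-specific data dict to rectangular rows."""
    row = {"event_id": event_id, "parent_table": parent, **common}
    for key, value in data.items():
        if isinstance(value, dict):
            # Nested structure: becomes its own table, referencing the superior one.
            flatten(event_id, common, value, f"{table_name}_{key}", tables,
                    parent=table_name)
        else:
            row[key] = value
    tables[table_name].append(row)

tables = defaultdict(list)  # one rectangular dataset per (sub)table
for event_id, ev in enumerate(events):
    # Person identifier, timestamp and element name are repeated in every table.
    common = {k: ev[k] for k in ("person_id", "timestamp", "element")}
    # Root dataset: one identifier per event, plus the event type.
    tables["root"].append({"event_id": event_id, **common, "type": ev["type"]})
    flatten(event_id, common, ev["data"], ev["type"], tables)

for name, rows in tables.items():
    print(name, rows)
```

Because the identifying columns are copied into every table, each resulting dataset can be analysed on its own, while the event identifiers and parent references allow the original nested structure to be reconstructed without loss.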
Expected Outcomes
As outlined above, archiving and providing log data is an emerging field in the educational sciences. In our talk, we will present solutions for the documentation of log data and the corresponding assessment tools. These first approaches are a starting point for increasing the findability, accessibility, interoperability and reusability of log data for the educational research community. The development of an appropriate metadata schema is necessary to describe scientific use files of log data adequately and to make them findable and accessible when they are stored and provided by a research data repository. Interoperability is ensured by using the universal log format to store log data. To foster the reusability of log data for secondary analyses, the universal log format opens up the perspective of applying new indicators to losslessly archived log data files. Additionally, identifying the potential of log data analyses is closely related to the documentation of derived indicators and touches on theoretical and psychometric issues of log data. An appropriate documentation of indicators will foster the exchange of indicators within the community and expand the value of FAIRly archived log data. Finally, we conclude that log data speak an international language and hold great potential for further analyses and for the discussion of open data in the educational sciences.
References
Goldhammer, F., Naumann, J., Stelter, A., Tóth, K., Rölke, H., & Klieme, E. (2014). The time on task effect in reading and problem solving is moderated by task difficulty and skill: Insights from a computer-based large-scale assessment. Journal of Educational Psychology, 106(3), 608–626. doi:10.1037/a0034716
Goldhammer, F., & Kroehne, U. (2014). Controlling individuals' time spent on task in speeded performance measures: Experimental time limits, posterior time limits, and response time modeling. Applied Psychological Measurement, 38(4), 255–267. doi:10.1177/0146621613517164
Goldhammer, F., & Zehner, F. (2017). What to make of and how to interpret process data. Measurement: Interdisciplinary Research and Perspectives, 15(3–4), 128–132. doi:10.1080/15366367.2017.1411651
Kane, M. T., & Mislevy, R. (2017). Validating score interpretations based on response processes. In K. Ercikan & J. W. Pellegrino (Eds.), Validation of score meaning for the next generation of assessments (pp. 11–24). New York, NY: Routledge.
Kroehne, U., & Goldhammer, F. (2018). How to conceptualize, represent, and analyze log data from technology-based assessments? A generic framework and an application to questionnaire items. Behaviormetrika. doi:10.1007/s41237-018-0063-y
van der Linden, W. J., & Guo, F. (2008). Bayesian procedures for identifying aberrant response-time patterns in adaptive testing. Psychometrika, 73(3), 365–384.
Wise, S. L., & Kong, X. (2005). Response time effort: A new measure of examinee motivation in computer-based tests. Applied Measurement in Education, 18(2), 163–183. doi:10.1207/s15324818ame1802_2