RALSA: The R Analyzer for Large-Scale Assessments

Author(s):

Plamen Mirazchiyski (presenting / submitting)

Conference:

ECER 2021

Network:

09. Assessment, Evaluation, Testing and Measurement

Format:

Paper

Session Information

09 SES 06 A, Tackling Methodological Challenges in Analyzing International-comparative Large-scale Assessment Data

Paper Session

Time:

2021-09-07

11:00-12:30

Room:

n/a

Chair:

Rolf Strietholt

Contribution

International large-scale assessments and surveys (ILSAs) have complex sampling and assessment designs. ILSAs use a multistage stratified cluster sampling design in which schools are sampled at the first stage and students at the second. At the first stage, the primary units are selected with probability proportional to their size (PPS) (OECD, 2017a, 2017b; Tieck, 2020a, 2020b). ILSAs also use a complex assessment design with multiple matrix rotation of blocks of items (and complex tasks) across multiple booklets/combinations, linking consecutive booklets through common blocks (Fraillon, 2020; OECD, 2017c). These design issues have to be taken into account when analyzing ILSAs’ data.

The goal of this paper is to present a new software tool, the R Analyzer for Large-Scale Assessments (RALSA), for analyzing data from various ILSAs while taking into account the studies’ complex sampling and assessment designs. Other software packages for analyzing data from large-scale assessments and surveys exist as well, such as the IEA’s IDB Analyzer, WesVar and R packages like BIFIEsurvey, intsvy and EdSurvey. RALSA, however, has some unique features which distinguish it from other software solutions. RALSA converts the originally provided SPSS (or text) data into native R data sets. The converted data sets also contain the user-defined missing values for the variables, which differs from the typical way R handles missing data. RALSA can also recognize the study, its cycle and the available respondent types in order to select the appropriate design variables and apply the pertinent computational routines for the study in scope. Further, the package has a graphical user interface which eases the analysis for users with limited technical skills. Last, but not least, RALSA has a comprehensive output system which exports the results to an MS Excel workbook with multiple sheets (estimates, analysis information, model statistics, and the analysis syntax used). The entire package, including the graphical user interface and the output system, was built in R without relying on any other platform or programming language. The package was built with user experience in mind, and its flexible design and architecture permit quick addition of new studies and functionality. RALSA is useful for educational researchers and analysts in Europe and worldwide.
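To illustrate the difference between user-defined and system missing values, here is a minimal base R sketch. It is not RALSA's internal code: the attribute name "missings", the toy variable and the helper function are assumptions made only for this example.

```r
# Illustrative sketch only; this is not RALSA's internal code.
# Assumption: the user-defined missing codes are kept in an attribute
# named "missings" (a hypothetical name chosen for this example).

# A toy questionnaire item: 1-3 are valid answers, 9 means "Omitted",
# and NA is R's single system-missing value.
item <- c(1, 2, 9, 3, NA, 9)
attr(item, "missings") <- 9

# Return only the valid responses, dropping both kinds of missing values.
valid_values <- function(v) {
  user_missing <- attr(v, "missings")
  v[!is.na(v) & !(v %in% user_missing)]
}

# Count each kind of missing value separately.
n_user_missing <- sum(item %in% attr(item, "missings"))
n_system_missing <- sum(is.na(item))

mean(valid_values(item))  # 2
```

Keeping the codes as an attribute, rather than recoding them to NA at conversion time, leaves the analyst free to decide later whether a code such as "Omitted" should count as missing or be analyzed as a category of its own.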

Currently, the package can process and analyze data from all cycles of the following studies:

CivED;
ICCS;
ICILS;
RLII;
PIRLS (including PIRLS Literacy and ePIRLS);
TIMSS (including TIMSS Numeracy; eTIMSS will be added with the upcoming release of TIMSS 2019);
TiPi (TIMSS and PIRLS joint study);
TIMSS Advanced;
SITES;
TEDS-M;
PISA;
TALIS; and
TALIS Starting Strong Survey (a.k.a. TALIS 3S).

The following data preparation and analysis functionality is supported:

Prepare data for analysis
- Convert data (SPSS, or text in the case of PISA prior to 2015)
- Merge study data files from different countries and/or respondents
- View variable properties (name, class, variable label, response categories/unique values, user-defined missing values)
- Recode variables
Perform analyses (more analysis types will be added in the future)
- Percentages of respondents in certain groups and averages on variables of interest, per group
- Percentiles of variables within groups of respondents
- Percentages of respondents reaching or surpassing benchmarks of achievement
- Correlations (Pearson or Spearman)
- Linear regression
- Binary logistic regression

RALSA also introduces a graphical user interface for users with limited technical skills. It is written entirely in R without relying on any external platform or programming language. RALSA can work on any operating system where R can be installed (e.g. Linux, macOS and Windows).

Method

RALSA has a flexible design in which multiple common functions and objects are shared by all data preparation and analysis functions in the package. These common functions take care of the consistency of the computations, compute variance terms and standard errors, reshape and assemble the outputs, and export the final outputs into an MS Excel workbook with multiple sheets. This has a number of advantages:

- A function for a certain operation is developed once and can be called from any analysis function;
- If a computational routine used by all analysis functions needs to be updated, this is done just once and all analysis functions take advantage of it;
- The computations that different functions have in common are consistent;
- Code repetition is avoided and the risk of mistakes and inconsistencies is minimized; and
- The time needed for new developments is much shorter.

The last point is especially important. Thanks to its design, RALSA will continue growing in the future, adding more analysis types quickly and consistently.

RALSA converts the SPSS data sets originally provided by the studies into R’s native .RData files. For OECD’s PISA cycles prior to 2015, it uses the text data files and the accompanying SPSS control syntaxes to convert the data. The conversion also adds the user-defined missing values as an attribute to each variable, unlike native R, where only one type of missing value (system missing) exists. In addition, it adds the study, cycle and respondent type as attributes to the data set. Later, these attributes are used to select the appropriate weighting variable, resampling technique (JRR or BRR) and the pertinent estimation procedures for the study in scope.

Executing the following syntax will convert the data from ICILS 2018 for Denmark, France and Italy:

lsa.convert.data(inp.folder = "C:/ICILS_2018_IDB", ISO = c("DNK", "FRA", "ITA"), out.folder = "C:/Converted")

The following syntax will merge the converted student and school background data files for the countries above:

lsa.merge.data(inp.folder = "C:/Converted", file.types = list(bcg = NULL, bsg = NULL), out.file = "C:/Merged/Merged.RData")

The following syntax will compute the percentages of students grouped by the principals’ estimates of the percentage of students coming from economically disadvantaged homes, together with the average computer and information literacy scores in each group:

lsa.pcts.means(data.file = "C:/Merged/Merged.RData", split.vars = "IP2G08BB", PV.root.avg = "PV#CIL")

Note how parsimonious the syntax is, especially in the last call. The MS Excel output file is automatically written as “Analysis.xlsx” under the R working directory and opened automatically.
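To give a sense of what the shared variance routines have to do for a study that uses JRR and plausible values, here is a simplified base R sketch. It is not taken from RALSA and may differ from its actual procedures: the function names, the column names and the choice to average the sampling variance over all plausible values are assumptions made for illustration.

```r
# Illustrative sketch only; not RALSA's internal code. All names are hypothetical.

# Weighted mean helper.
wmean <- function(x, w) sum(x * w) / sum(w)

# JRR sampling variance of a weighted mean.
# zone: jackknife zone ID; rep: 0/1 indicator of the unit within its zone.
jrr_var <- function(x, w, zone, rep) {
  full <- wmean(x, w)
  zones <- sort(unique(zone))
  replicates <- vapply(zones, function(h) {
    # Within zone h, double the weights where rep == 1 and zero them
    # where rep == 0; weights in all other zones stay unchanged.
    w_h <- ifelse(zone == h, w * 2 * rep, w)
    wmean(x, w_h)
  }, numeric(1))
  sum((replicates - full)^2)
}

# Mean of a set of plausible values with its standard error.
# Assumption: sampling variance is averaged over all plausible values,
# and imputation variance follows the usual (1 + 1/M) * var(estimates) rule.
pv_mean_se <- function(data, pv_cols, w, zone, rep) {
  M <- length(pv_cols)
  est <- vapply(pv_cols, function(p) wmean(data[[p]], data[[w]]), numeric(1))
  samp <- vapply(pv_cols, function(p) {
    jrr_var(data[[p]], data[[w]], data[[zone]], data[[rep]])
  }, numeric(1))
  sampling_var <- mean(samp)
  imputation_var <- (1 + 1 / M) * var(est)
  c(mean = mean(est), se = sqrt(sampling_var + imputation_var))
}

# Toy data: four students in two jackknife zones, two plausible values.
toy <- data.frame(
  PV1 = c(1, 2, 3, 4), PV2 = c(1, 2, 3, 4),
  W = 1, ZONE = c(1, 1, 2, 2), REP = c(1, 0, 1, 0)
)
pv_mean_se(toy, c("PV1", "PV2"), "W", "ZONE", "REP")
```

Because every analysis type needs this same replication loop, writing it once and sharing it, as the design described above does, keeps standard errors consistent across percentages, means, correlations and regression coefficients.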

Expected Outcomes

ILSAs have become one of the main drivers of policy making in education. Many European educational systems use the findings from ILSAs’ data to initiate new reforms in education or to guide existing ones. Due to the studies’ complex sampling and assessment designs, however, the analysis requires special techniques to obtain correct population estimates. This, in turn, requires software that can perform the computations. This paper presents a newly developed analysis tool which automatically handles all analysis issues of ILSAs by study (and, in some cases, even by cycle) and respondent type to ensure correct computations. This automatic handling benefits analysts by preventing them from making common mistakes when analyzing ILSAs’ data. RALSA will continue growing in the future, adding more functionality. RALSA is, and will remain, free of charge and open-source.

References

Fraillon, J. (2020). ICILS 2018 test development. In J. Fraillon, J. Ainley, W. Schulz, T. Friedman, & D. Duckworth (Eds.), IEA International Computer and Information Literacy Study 2018: Technical report (pp. 11–29). IEA.
OECD. (2017a). Sample design. In PISA 2015 Technical Report (pp. 65–87). OECD.
OECD. (2017b). Survey weighting and the calculation of sampling variance. In PISA 2015 Technical Report (pp. 115–126). OECD.
OECD. (2017c). Test design and test development. In PISA 2015 Technical Report (pp. 29–55). OECD.
Tieck, S. (2020a). Sampling design and implementation. In J. Fraillon, J. Ainley, W. Schulz, T. Friedman, & D. Duckworth (Eds.), IEA International Computer and Information Literacy Study 2018: Technical report (pp. 59–78). IEA.

Author Information

Plamen Mirazchiyski (presenting / submitting)

Educational Research Institute

Center for Applied Epistemology

Ljubljana
