Session Information
99 ERC SES 06 M, Research in Higher Education
Paper Session
Contribution
The ‘Big Data’ era has dramatically increased the availability of documents written and stored digitally. While these empirical materials represent a formidable object of analysis, they remain underused in education research. In this paper, I will show that text mining, a set of computerized methods for extracting and quantifying information from large textual databases, offers promising prospects for understanding inequalities of access to higher education. More specifically, I will highlight how this methodology can be used to complement results from more traditional research designs.
Using a corpus of 16,000 college admission essays from French applicants to pre-medical studies at a Parisian university in 2020, I will argue that text mining makes it possible to reveal differentiated socializations and to help explain unequal educational and professional outcomes according to gender, social class, academic achievement and school environment. Despite the growing popularity of “holistic” admissions processes in higher education, I will thus argue that qualitative application materials like personal statements are not exempt from the biases produced by social inequalities.
First, I will underline the strategic differences in how students introduce themselves: writing style, the motivations and qualities they put forward as making a good medical student, and narratives about personality and past experiences. I will explain that these are “traces” of unequal access to information and guidance about higher education and of variations in socializations and representations. Second, I will analyze disparities in how students project themselves into the future: the degree of precision of their professional project and the health specializations they expect to pursue. In particular, I will highlight that tastes and preferences for academic and career fields are already strongly predetermined before enrollment in postsecondary education.
Since this research is part of a larger mixed-methods project on pre-medical studies, I will finally discuss the contributions, complementarities and limits of text mining compared to the more traditional materials I used (surveys, ethnographic observations, interviews) when studying students’ higher education choices. Notably, I will address the importance of interdisciplinarity in educational research and how ‘Big Data’ technologies might offer new research pathways in the future.
Method
This paper is based on a larger mixed-methods research project that seeks to understand inequalities of access to pre-medical studies in France in the context of a 2020 national health studies reform. Through student surveys, ethnographic observations of university open days and higher education fairs, and interviews with students and parents, this research explores what motivates individuals to apply for pre-medical studies, what type of information they seek and how they obtain it, and how they prepare their college applications. Within this research, I am conducting a case study of a Parisian university that granted me access to the data on the entire applicant pool of its pre-medical course. The dataset consists of sociodemographic information, high school grades and high school teachers’ remarks, and admissions essays for all 16,000 applicants. I also know the ranking of each applicant based on the evaluation of the university admissions officers. Researchers rarely obtain access to this type of data. For the present proposal, I will explore the admissions essays using text mining. Since not everyone in the audience may be familiar with this methodology, I will present a variety of the techniques I used, namely text extraction, word clouds, bag-of-words and document classification.
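To give readers unfamiliar with text mining a concrete idea of the bag-of-words step mentioned above, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not the pipeline used in this research: the function names (`tokenize`, `bag_of_words`, `relative_frequencies`), the group labels and the three example sentences are all hypothetical and invented for the purpose of the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic tokens (accents kept)."""
    # [^\W\d_]+ matches runs of letters only: no digits, no underscore.
    return re.findall(r"[^\W\d_]+", text.lower())


def bag_of_words(documents):
    """Return one Counter of term counts per document (word order is ignored)."""
    return [Counter(tokenize(doc)) for doc in documents]


def relative_frequencies(bags):
    """Pool the bags of one group and convert counts to shares of all tokens."""
    pooled = Counter()
    for bag in bags:
        pooled.update(bag)
    total = sum(pooled.values())
    if total == 0:
        return {}
    return {term: count / total for term, count in pooled.items()}


# Invented placeholder sentences and labels, NOT taken from the study's corpus.
essays = [
    ("group_a", "I want to become a surgeon and help patients."),
    ("group_b", "I like science and I want to help people."),
    ("group_a", "My internship in a hospital confirmed my project."),
]

bags = bag_of_words([text for _, text in essays])

# Collect the bags by (hypothetical) applicant group so they can be compared.
by_group = {}
for (group, _), bag in zip(essays, bags):
    by_group.setdefault(group, []).append(bag)

freqs = {group: relative_frequencies(group_bags)
         for group, group_bags in by_group.items()}
```

Comparing `freqs["group_a"]` with `freqs["group_b"]` term by term is the simplest way to see which words one group of applicants uses relatively more often than another. Word clouds and document classifiers typically start from the same kind of term counts.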
Expected Outcomes
This paper first contributes to the literature on inequalities of access to higher education. “Holistic” admissions processes and the inclusion of more qualitative application materials like personal statements and interviews have often been presented as solutions for fighting social inequalities. However, I will show that, in the case of personal statements, writing skills are not equally distributed across the population: it is easier for some students than for others to express themselves according to expected standards. In addition, students do not all have access to the same quantity and quality of information. Second, my proposal aims to shed light on the research opportunities offered by ‘Big Data’ methodologies like text mining. Although people leave more and more “traces” that can be numerically investigated, I will argue that the contextualization of such data is of critical importance. In this way, I will show that traditional research methods remain decisive when interpreting text mining results. Finally, I will emphasize that text mining is an innovative tool for comparative qualitative research, and hence that qualitative researchers interested in comparative education at the European level can benefit from the large sample sizes that text mining makes possible.
References
Alvero, A. J., Arthurs, N., Antonio, A. L., Domingue, B. W., Gebre-Medhin, B., Giebel, S., & Stevens, M. L. (2020, February). AI and Holistic Review: Informing Human Reading in College Admissions. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (pp. 200-206).
Cointet, J. P., & Parasie, S. (2018). Ce que le big data fait à l’analyse sociologique des textes. Revue française de sociologie, 59(3), 533-557.
Demazière, D., et al. (2006). Analyses textuelles en sociologie – Logiciels, méthodes, usages. Presses universitaires de Rennes, coll. « Didact Méthodes ».
Mützel, S. (2015). Facing Big Data: Making sociology relevant. Big Data & Society. https://doi.org/10.1177/2053951715599179.