Why Are School Self-Evaluation Instruments Not Capturing What They Intend To? A Problem Analysis Of Respondents’ Cognitive Processes.
Author(s):
Jerich Faddar (presenting / submitting), Jan Vanhoof, Sven De Maeyer
Conference:
ECER 2016
Format:
Paper

Session Information

09 SES 05 C, External School Evaluations and School Self-evaluations

Paper Session

Time:
2016-08-24
13:30-15:00
Room:
NM-F107
Chair:
Jana Poláchová Vaštatková

Contribution

Over the past decades, school self-evaluation (SSE) has gained a prominent position in many educational systems in Europe and beyond (e.g. O’Brien, McNamara, & O’Hara, 2015). SSE can be described as a process whereby well-chosen participants systematically describe and judge the school’s functioning in order to make decisions or adopt initiatives within the framework of school development (Vanhoof & Van Petegem, 2010).

In order to describe organisational characteristics at the process level, staff members are often asked to fill in a questionnaire (MacBeath, Schratz, Meuret, & Jakobsen, 2000). This method, in which staff members (i.e. lower-level units) provide information on the school (i.e. a higher-level unit), requires multi-level thinking. Questionnaire items can be formulated in different designs to accommodate such a multi-level context (Chen, Mathieu, & Bliese, 2004); a consensus design and a referent-shift design are commonly used. The consensus design starts from the perspective of an individual making statements on collective properties (e.g. “I have a clear view on the job descriptions of others in the school”), and the responses are aggregated onto the organisational level. The referent-shift design, by contrast, captures organisational characteristics by asking respondents to make statements on the organisation as a whole (e.g. “In this school one has a clear view on the job descriptions of others in the school”). Notwithstanding the frequent use of questionnaires, the literature has already pointed to several problems that may lurk beneath the surface (Groves et al., 2009). This raises a fundamental concern about the validity of the results of SSE questionnaires.
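To make the consensus design concrete, the sketch below illustrates how individual-level responses might be aggregated onto the school level. It is an illustration only, not part of the original study: the schools, scores, 5-point scale, and the choice of a simple mean as aggregation rule are all hypothetical.

```python
# Illustrative sketch only: aggregating consensus-design item responses
# (individual statements) to the organisational level. All data and the
# mean-based aggregation rule are hypothetical, not the authors' method.
from statistics import mean

# Hypothetical responses to "I have a clear view on the job descriptions
# of others in the school" (1 = strongly disagree ... 5 = strongly agree).
responses = {
    "school_A": [4, 5, 3, 4],
    "school_B": [2, 3, 2],
}

# Consensus design: individual responses are aggregated per school,
# here by taking the mean.
school_scores = {school: mean(scores) for school, scores in responses.items()}
print(school_scores)  # e.g. {'school_A': 4.0, 'school_B': 2.33...}
```

In a referent-shift design, by contrast, each respondent already answers about the school as a whole, so the item itself (not the aggregation step) carries the shift to the organisational level.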

One crucial element in obtaining valid SSE results is how items are cognitively processed by respondents (O'Muircheartaigh, 1999). Cognitive theories distinguish different stages in the processing of items, which involve an interplay between the items and the respondents’ memory (Karabenick et al., 2007). First, respondents have to be able to read and interpret the item; several aspects are important in that regard: semantics, syntax and pragmatics (Lenzner, Kaczmirek, & Lenzner, 2010; Tourangeau, Rips, & Rasinski, 2000). Second, respondents have to retrieve relevant information from their memory (Karabenick et al., 2007). Finally, respondents are expected to generate a response based on the preceding cognitive stages. The extent to which a respondent performs the cognitive processes of interpretation, elaboration and response in line with how the instrument developer intended them is referred to as cognitive validity (Karabenick et al., 2007).

Despite these methodological concerns, users of SSE questionnaires seem to pass over the issue of cognitive validity rather readily, leading to a collective glossing over of the problem. To date, it is unknown which problems trigger cognitive processes that lead to (partially) cognitively invalid results of SSE questionnaires. Such insights can help to identify crucial points for improving existing instruments, as well as for developing new ones. Altogether, by identifying these possible flaws, this study aims to increase the chance that items are processed in a cognitively valid way, which is an important lever for valid SSE results. To gain more insight into all this, the study focuses on the following research questions:

  1. What problems can be identified during the cognitive stages of interpretation, elaboration and response in the answering process that hamper the cognitive validity of SSE results?
  2. What problems can be identified, for the particular case of referent-shift design items, during the cognitive stages of respondents’ answering process that hamper the cognitive validity of SSE results?

Method

A qualitative approach was appropriate given the exploratory nature of the research questions. Cognitive interviews, which are suitable and commonly used to uncover underlying cognitive processes (Ericsson & Simon, 1993; Presser et al., 2004; Willis, 2005), were conducted with 20 participants from 4 primary schools in Flanders (Belgium). A hybrid model of cognitive interviewing was applied, meaning that both a think-aloud protocol and a systematic probing technique were used (Collins, 2003). Items from two exemplary scales of an SSE instrument, formulated in both a consensus and a referent-shift design, were used during the cognitive interviews. The data consist of 400 observations: 20 participants verbalised their cognitive processes on 20 items each. The data were analysed in two coding stages. First, all observations were coded for their degree of cognitive validity at each of the three cognitive stages, resulting in 1,200 coding units. This was done by means of cognitive validity criteria developed to express the instrument developers’ intentions. Each coding unit was assigned a cognitive validity rating ranging from ‘cognitively valid’, through ‘partially cognitively valid’, to ‘cognitively invalid’. Next, a content analysis combining deductive and inductive coding was performed on the data that obtained a ‘partially cognitively valid’ or ‘cognitively invalid’ rating (Krippendorff, 2012).
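For illustration, the sketch below reproduces the arithmetic behind the coding scheme: 20 participants verbalising 20 items yields 400 observations, each rated at three cognitive stages, giving 1,200 coding units, of which only the problematic ones enter the second coding stage. The ratings shown are randomly faked and all names are ours, not the authors’ analysis pipeline.

```python
# Illustrative sketch only: the structure of the coding units in the study.
# Ratings are randomly generated here purely to demonstrate the filtering
# step; they are not the study's data.
from itertools import product
import random

N_PARTICIPANTS = 20
N_ITEMS = 20
STAGES = ("interpretation", "elaboration", "response")
RATINGS = ("cognitively valid", "partially cognitively valid", "cognitively invalid")

# One observation per participant-item pair: 20 x 20 = 400 observations.
observations = list(product(range(N_PARTICIPANTS), range(N_ITEMS)))
assert len(observations) == 400

# Each observation is rated once per cognitive stage: 400 x 3 = 1,200 coding units.
coding_units = [(p, i, stage) for (p, i) in observations for stage in STAGES]
assert len(coding_units) == 1200

# Stage 2 of the analysis retains only the problematic units for content analysis.
random.seed(0)
rated = {unit: random.choice(RATINGS) for unit in coding_units}
problematic = [u for u, r in rated.items() if r != "cognitively valid"]
print(f"{len(problematic)} coding units would enter the content analysis")
```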

Expected Outcomes

Different problems arise during the different cognitive stages of the answering process. On a semantic level, respondents are confronted with unfamiliar or unknown concepts; on a syntactic level, they have problems with difficult phrases; and on a pragmatic level, they make a restricted or a divergent interpretation. Furthermore, respondents have problems interpreting items at the appropriate referent level: they tend to refer to another level than the one asked for.

During the elaboration stage, the data indicate several problems at the content level. Respondents retrieve information that falls wholly or partly outside the scope of what the instrument developers are looking for. Moreover, the retrieved information does not always fully cover all aspects of the concepts related to the collective property in the item. Some respondents also rely on non-factual information. Furthermore, respondents may elaborate on items while mistaking the appropriate referent level.

While formulating responses, respondents make inadequate use of the predefined answer options. They use the ‘don’t know’ option, for example, even when they do have relevant information on the topic under review. The same option is also used when respondents fail to interpret the item, although it is intended only for respondents who lack relevant information on the topic.

The findings of this study align with results from earlier research in the broader field of survey methodology (Tourangeau et al., 2000). The aspect of multi-level thinking is an important addition to this field, and in particular to SSE. In addition to suggestions for the development of new SSE instruments, the paper makes recommendations for current SSE practices and for how SSE results should be interpreted. Furthermore, suggestions are made for future research.

References

Chen, G., Mathieu, J. E., & Bliese, P. D. (2004). A framework for conducting multi-level construct validation. In F. J. Yammarino & F. Dansereau (Eds.), Multi-level issues in organizational behavior and processes (Vol. 3, pp. 273-303). The Netherlands: Elsevier.
Collins, D. (2003). Pretesting survey instruments: An overview of cognitive methods. Quality of Life Research, 12(3), 229-238. doi: 10.1023/a:1023254226592
Ericsson, K. A., & Simon, H. A. (1993). Protocol analysis: Verbal reports as data (Rev. ed.). Cambridge, MA: MIT Press.
Groves, R. M., Fowler, F. J. J., Couper, M. P., Lepkowski, J. M., Singer, E., & Tourangeau, R. (2009). Survey methodology. New Jersey: John Wiley & Sons.
Karabenick, S. A., Woolley, M. E., Friedel, J. M., Ammon, B. V., Blazevski, J., Bonney, C. R., . . . Kelly, K. L. (2007). Cognitive processing of self-report items in educational research: Do they think what we mean? Educational Psychologist, 42(3), 139-151. doi: 10.1080/00461520701416231
Krippendorff, K. (2012). Content analysis: An introduction to its methodology. Los Angeles, CA: Sage.
Lenzner, T., Kaczmirek, L., & Lenzner, A. (2010). Cognitive burden of survey questions and response times: A psycholinguistic experiment. Applied Cognitive Psychology, 24(7), 1003-1020. doi: 10.1002/acp.1602
MacBeath, J., Schratz, M., Meuret, D., & Jakobsen, L. (2000). Self-evaluation in European schools: A story of change. London: RoutledgeFalmer.
O'Muircheartaigh, C. (1999). CASM: Successes, failures, and potential. In M. G. Sirken, D. Herrmann, S. Schechter, N. Schwarz, J. M. Tanur & R. Tourangeau (Eds.), Cognition and survey research (pp. 39-63). New York: Wiley & Sons.
O’Brien, S., McNamara, G., & O’Hara, J. (2015). Supporting the consistent implementation of self-evaluation in Irish post-primary schools. Educational Assessment, Evaluation and Accountability, 1-17. doi: 10.1007/s11092-015-9218-5
Presser, S., Couper, M. P., Lessler, J. T., Martin, E., Martin, J., Rothgeb, J. M., & Singer, E. (2004). Methods for testing and evaluating survey questions. The Public Opinion Quarterly, 68(1), 109-130. doi: 10.2307/3521540
Tourangeau, R., Rips, L. J., & Rasinski, K. (2000). The psychology of survey response. Cambridge, UK: Cambridge University Press.
Vanhoof, J., & Van Petegem, P. (2010). Evaluating the quality of self-evaluations: The (mis)match between internal and external meta-evaluation. Studies in Educational Evaluation, 36(1-2), 20-26. doi: 10.1016/j.stueduc.2010.10.001
Willis, G. B. (2005). Cognitive interviewing: A tool for improving questionnaire design. London: Sage.

Author Information

Jerich Faddar (presenting / submitting)
University of Antwerp
Training and Education Sciences
Antwerpen
University of Antwerp, Faculty of Social Sciences, Dept. of Training and Education Sciences, Belgium
