How Do Different Standards Lead To Different Conclusions? A Comparison Between Meta-Analyses Of Two Research Centers
Author(s):
Marta Pellegrini (presenting / submitting)
Conference:
ECER 2017
Format:
Paper

Session Information

27 SES 09 B, Theoretical Explorations in Didactics

Paper Session

Time:
2017-08-24
13:30-15:00
Room:
K3.05
Chair:
Gérard Sensevy

Contribution

It has been approximately twenty years since researchers began to argue that education should be more evidence-based (Davies, 1999; Hargreaves, 1996). Since then, the need to evaluate the effectiveness of teaching methods and programs by applying rigorous standards has grown. The aim is to provide policy makers and educators with valid and reliable information (such as proven programs) about the efficacy of educational programs, to help them make decisions (Slavin, 2002; 2008).

To carry out reviews of research – meta-analyses in particular – different research centers use different standards. Research centers that produce meta-analyses include the U.S. Department of Education's What Works Clearinghouse (WWC) and the Best Evidence Encyclopedia of Johns Hopkins University in the U.S., and the Education Endowment Foundation and the Campbell Collaboration in the U.K.

Today, we know from the work of different scholars that some standards are more reliable than others. If evidence-based reform is to have the desired impact on policy and practice, it is necessary to take into account the issues identified by previous research, since methodological features can affect the effect sizes and statistical significance of the results.

The problem:

If teachers or policy makers want to know which educational programs are most effective for learning, they encounter meta-analyses with different conclusions. Comparing the results from different research centers, we notice that the effect size for a given program is not the same. Furthermore, some studies were included in some meta-analyses but excluded from others.

Why do the same programs show different efficacy in the meta-analyses of different research centers? It depends on the inclusion standards that were used. Some researchers have already studied the elements that influence effect sizes and the statistical significance of effects. Slavin & Madden (2011) and de Boer, Donker & van der Werf (2014) showed that achievement measures inherent to the experimental treatment, rather than equally available to the control group, produce higher effect sizes than standardized tests. Glass, McGaw & Smith (1981) and Lipsey & Wilson (1993) noted that unpublished studies have lower effect sizes than published ones. Cheung & Slavin (2016) studied how certain methodological features affect the effect sizes of an experiment, showing that effect sizes are roughly twice as large for small-scale trials (fewer than 250 students), published studies, quasi-experiments and experimenter-made measures than for large-scale studies, unpublished papers, randomized experiments and independent measures, respectively. Similar conclusions were reached by Kjaergard, Villumsen & Gluud (2001) regarding randomized experiments and matched studies.

This study, therefore, develops from two elements that emerged from the literature in this field: different standards affect effect sizes, and teachers need clear and reliable information about the efficacy of programs. Furthermore, the study arises from a practical issue: how can a teacher choose between the results of one research center and another?

This study aims to compare the meta-analyses conducted by two different research centers on programs for teaching elementary reading and math. The two research centers are the What Works Clearinghouse and the Best Evidence Encyclopedia of the Center for Research and Reform in Education.

The research questions are:

  • How much do the different standards used in meta-analyses lead to different conclusions about the efficacy of programs in terms of effect size (ES)?
  • Which suggestions can we give to a teacher who wants to read the results of a meta-analysis?

Method

In order to analyze how different standards lead to different conclusions, a protocol of analysis for meta-analyses was developed. It is based on the literature on the methodological features that affect effect sizes, and on the standards of the two research centers (WWC, 2015; Baye, Lake, Inns & Slavin, 2016).

The first section of the protocol analyzes the primary studies included in the meta-analyses. Its elements are:

  • Outcome measure
  • Sample size
  • Duration
  • Design

For each element, we will study whether the methodological feature affects effect sizes. The second section of the protocol analyzes elements of the meta-analyses themselves:

  • Date of last update
  • Calculation of the effect size of the program

The rules of Slavin & Madden (2011) will be used to classify measures as treatment-inherent or independent, and the effect size will be calculated for the two measure categories. Sample size information will be extracted from the studies included in the reviews, and the studies will be divided into four categories based on sample size: up to 30, 31-100, 101-250, and more than 250 students. We will examine the difference in effect sizes between the sample-size categories. The duration of the study is analyzed because in brief experiments researchers can create non-replicable conditions, and because teachers need evaluations of the whole program rather than of a small unit. We will examine whether duration is a methodological feature that affects effect sizes in the meta-analyses of the two research centers. We then analyze the design of the studies, distinguishing randomized controlled trials from quasi-experiments, and examine the difference in effect sizes between the two designs. The date of the last update of the meta-analyses is an important element for this study because the study aims to determine which reviews are more reliable and valid for informing educators.
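The sample-size categorization described above can be sketched as a small grouping routine. This is an illustrative sketch with hypothetical data, not the protocol's actual tooling; the bin boundaries follow the four categories named in the text.

```python
# Illustrative sketch (hypothetical data): binning studies into the four
# sample-size categories used in the protocol and averaging effect sizes
# per bin, to compare effect sizes across sample-size categories.

BINS = [(0, 30, "up to 30"), (31, 100, "31-100"),
        (101, 250, "101-250"), (251, float("inf"), "more than 250")]

def bin_label(n):
    """Return the sample-size category for a study with n students."""
    for lo, hi, label in BINS:
        if lo <= n <= hi:
            return label

def mean_es_by_bin(studies):
    """studies: list of (effect_size, sample_size) tuples."""
    groups = {}
    for es, n in studies:
        groups.setdefault(bin_label(n), []).append(es)
    return {label: sum(v) / len(v) for label, v in groups.items()}

# Hypothetical studies: smaller samples paired with larger effects,
# the pattern reported by Cheung & Slavin (2016).
studies = [(0.55, 25), (0.40, 80), (0.30, 200), (0.12, 600)]
by_bin = mean_es_by_bin(studies)
# by_bin maps each category to its mean effect size,
# e.g. "up to 30" -> 0.55 and "more than 250" -> 0.12.
```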
For each program, we analyze how the two research centers calculate the program's effect size: whether they compute a simple average effect size, or an average weighted by sample size using an inverse variance procedure.
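The difference between the two pooling approaches can be illustrated with a minimal sketch on hypothetical data. The weighting here uses sample size as a rough proxy for inverse variance (for a standardized mean difference, the sampling variance shrinks approximately as 1/n); the studies and numbers are invented for illustration only.

```python
# Illustrative sketch (hypothetical data): comparing a simple average of
# study effect sizes with a sample-size-weighted average, the two pooling
# approaches the protocol distinguishes between.

def pooled_effect_sizes(studies):
    """studies: list of (effect_size, sample_size) tuples.

    Returns (simple_average, weighted_average), where the weighted
    average uses sample size as an approximate inverse-variance weight,
    so larger studies receive proportionally more weight.
    """
    simple = sum(es for es, _ in studies) / len(studies)
    total_n = sum(n for _, n in studies)
    weighted = sum(es * n for es, n in studies) / total_n
    return simple, weighted

# Hypothetical program: one small study with a large effect,
# one large study with a modest effect.
studies = [(0.60, 40), (0.15, 400)]
simple, weighted = pooled_effect_sizes(studies)
# simple = 0.375, weighted ~ 0.19: the unweighted average is inflated
# by the small study, which is why the pooling choice matters when
# comparing conclusions across research centers.
```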

Expected Outcomes

At this time, the protocol of analysis is complete and the analysis of the meta-analyses has already begun. Based on the literature, the expected outcomes are:

  • effect sizes are roughly twice as large for small-scale trials and experimenter-made measures than for large-scale studies and independent measures, respectively;
  • most studies lasting less than 12 weeks are quasi-experimental studies with small sample sizes, and the average effect size for these brief studies is higher than for studies lasting more than 12 weeks;
  • effect sizes are significantly higher in quasi-experiments than in randomized experiments;
  • if the effect size of each program is calculated as a simple average, it is higher than the effect size weighted by sample size.

Each of these expected conclusions will be discussed by comparing the meta-analyses and effect sizes of the two research centers. From the conclusions of this study, recommendations will be drawn up for educators who must choose among information from different organizations.

References

Baye, A., Lake, C., Inns, A., & Slavin, R.E. (2016). Effective Reading Programs for Secondary Students. Best Evidence Encyclopedia, Johns Hopkins University, Baltimore, MD.
Cheung, A., & Slavin, R.E. (2016). How Methodological Features Affect Effect Sizes in Education. Educational Researcher, 45(5), 283-292.
Davies, P. (1999). What is Evidence-Based Education? British Journal of Educational Studies, 47(2), 108-121.
de Boer, H., Donker, A.S., & van der Werf, M.P.C. (2014). Effects of the Attributes of Educational Interventions on Students’ Academic Performance: A Meta-Analysis. Review of Educational Research, 84(4), 509-545.
Glass, G.V., McGaw, B., & Smith, M.L. (1981). Meta-Analysis in Social Research. Beverly Hills, CA: Sage.
Hargreaves, D.H. (1996). Teaching as a Research-Based Profession: Possibilities and Prospects. London: Teacher Training Agency.
Kjaergard, L.L., Villumsen, J., & Gluud, C. (2001). Reported Methodological Quality and Discrepancies Between Large and Small Randomized Trials in Meta-Analyses. Annals of Internal Medicine, 135(11), 982-989.
Lipsey, M.W., & Wilson, D.B. (1993). The Efficacy of Psychological, Educational, and Behavioral Treatment: Confirmation from Meta-Analysis. American Psychologist, 48, 1181-1209.
Rothstein, H.R., Sutton, A.J., & Borenstein, M. (Eds.). (2006). Publication Bias in Meta-Analysis: Prevention, Assessment and Adjustments. Chichester: John Wiley & Sons.
Slavin, R.E. (2002). Evidence-Based Education Policies: Transforming Educational Practice and Research. Educational Researcher, 31(7), 15-21.
Slavin, R.E. (2008). Evidence-Based Reform in Education: What Will It Take? European Educational Research Journal, 7(1), 124-128.
Slavin, R.E. (2013). Overcoming the Four Barriers to Evidence-Based Education. Education Week, 32(29), 24.
Slavin, R.E., & Madden, N.A. (2011). Measures Inherent to Treatments in Program Effectiveness Reviews. Journal of Research on Educational Effectiveness, 4, 370-380.
Slavin, R.E., & Smith, D. (2009). The Relationship Between Sample Sizes and Effect Sizes in Systematic Reviews in Education. Educational Evaluation and Policy Analysis, 31(4), 500-506.
What Works Clearinghouse (2013). Procedures and Standards Handbook (Version 3.0). Washington, DC: Author.

Author Information

Marta Pellegrini (presenting / submitting)
University of Florence
Educational sciences and psychology
Siena
