Essential Benefits and Disadvantages of Using Discrete Bayesian Methods in Educational Research

Author(s):

Petri Nokelainen

Conference:

ECER 2009

Network:

9. Assessment, Evaluation, Testing and Measurement

Format:

Paper

Session Information

09 SES 03 C, Testing Theory and Methodology

Paper Session

Time:

2009-09-28

14:00-15:30

Room:

HG, Elise Richter

Chair:

Tobias C. Stubbe

Contribution

In this paper, I first discuss the typical problems of using parametric frequentist statistical techniques, such as the t-test, to answer research questions in educational science. I then present the Bayesian modeling approach and discuss its advantages and limitations from an educational researcher's point of view. The discussion is based on practical experience from empirical studies that I have carried out with Bayesian methods during the past ten years.
The problem of rational inference under uncertainty has received considerable attention since the systematic study of probability theory began in the eighteenth century. Many different theories of inference have been proposed, and there has hardly been a time when inference under uncertainty was not a matter of real controversy. It seems that the educational research community (like many other fields using applied statistics) has been largely unaware of such controversies and has used what is known as classical, frequentist or Gaussian inference (Hastings, 1997). Since the 1960s there has been a steady revival of interest in an alternative way of reasoning with probabilities called Bayesian inference (Berger, 1985; Bernardo & Smith, 2000). Many applied fields, including astrophysics (Loredo, 1990), medicine (Smith, Spiegelhalter & Parmar, 1996), econometrics (Zellner, 1971), archaeology (Buck, Cavanagh & Litton, 1996) and political science, have adopted Bayesian techniques, but Bayesian inference has had little influence on educational research so far. According to Tirri (1999), this is somewhat surprising, as quantitative analysis in education exhibits all the features where Bayesian approaches excel: small data sets with many measured variables, emphasis on hierarchical models, models involving latent structures, and data sets with discrete (nominal) values.
When an educational researcher wants to study dependencies between observed and/or latent variables, the assumptions that traditional frequentist statistical analysis places on the data may become quite challenging. Examples of such assumptions are a continuous measurement level, multivariate normality, and linearity of both the data and the phenomena under investigation. The Bayesian modeling approach, named after the English reverend Thomas Bayes (1701-1761), is a viable alternative to frequentist statistical techniques that addresses all of the abovementioned modeling problems. The Bayesian theory of probability (e.g., Bernardo & Smith, 2000) treats probability as the degree of certainty that a given fact or proposition is true. The Bayesian approach is often labeled “subjective probability”, as its probability values depend on how much weight we are willing to place on both the evidence and the prior information available.
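To illustrate how a Bayesian probability value depends on both the evidence and the prior information, the following minimal Python sketch applies Bayes' rule to two competing hypotheses. The scenario (a student who has or has not mastered a skill), the function name `posterior`, and all numerical values are hypothetical and chosen only for illustration; they are not taken from the studies discussed here.

```python
def posterior(prior, likelihood):
    """Apply Bayes' rule over a finite set of hypotheses.

    prior:      dict mapping hypothesis -> P(hypothesis)
    likelihood: dict mapping hypothesis -> P(evidence | hypothesis)
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())  # P(evidence)
    return {h: value / evidence for h, value in unnormalized.items()}


# Hypothetical evidence: the student answers one test item correctly.
likelihood = {"mastered": 0.9, "not_mastered": 0.3}

# Two researchers with different prior information see the same evidence.
neutral = posterior({"mastered": 0.5, "not_mastered": 0.5}, likelihood)
skeptical = posterior({"mastered": 0.2, "not_mastered": 0.8}, likelihood)

print(neutral["mastered"], skeptical["mastered"])
```

With the same evidence, the neutral prior yields a posterior probability of 0.75 for mastery and the skeptical prior about 0.43, which is the sense in which the result reflects the weight placed on prior information.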

Method

As this is a theoretical paper, the information sources are the body of earlier research and my own empirical studies (and the reports based on them). For example, I have written several comparative methodological articles and chapters on frequentist and Bayesian methods.

Expected Outcomes

The essential benefits of discrete Bayesian methods can be summarized as follows: 1) The theoretical minimum sample size is zero; 2) They allow prediction from the data; 3) They answer research questions directly, as the model is constructed from the data and evaluated as P(M|D), the probability of the model given the data; 4) The researcher is able to enter a priori (expert) knowledge into the model; 5) They are designed to analyze categorical variables; 6) They are able to analyze both linear and non-linear dependencies between variables. There are also issues in applying discrete Bayesian methods that researchers should be aware of: 1) All data are categorized for the analysis. In practice this means that no matter how 'quantitative' the data originally are (e.g., a continuous indicator measured on a ratio scale), they are categorized and the order of the classes is destroyed; 2) Relying on a small sample size is a fallacy, as it leads to reduced power, making a Type II error more probable.
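Several of the points above can be made concrete with a small sketch. The Python code below compares two models of a contingency table, one in which Y is independent of X and one in which the distribution of Y differs across the categories of X, and returns P(M|D) for the dependence model. The symmetric Dirichlet prior with alpha = 1, the equal prior model probabilities, the function names, and the toy data are all assumptions made for illustration; this is not the scoring used by any particular software.

```python
from math import exp, lgamma, log, sqrt


def log_marginal(counts, alpha=1.0):
    """Log marginal likelihood of categorical counts under a
    symmetric Dirichlet(alpha) prior (assumed here for simplicity)."""
    k, n = len(counts), sum(counts)
    result = lgamma(k * alpha) - lgamma(k * alpha + n)
    for c in counts:
        result += lgamma(alpha + c) - lgamma(alpha)
    return result


def posterior_dependence(table, prior_dep=0.5, alpha=1.0):
    """P(M_dep | D) for a table with rows = categories of X and
    columns = categories of Y. The P(X) term is shared by both
    models and cancels, so only the Y counts are scored."""
    column_totals = [sum(col) for col in zip(*table)]
    log_indep = log_marginal(column_totals, alpha)            # one P(Y)
    log_dep = sum(log_marginal(row, alpha) for row in table)  # P(Y | X)
    log_odds = log_dep - log_indep + log(prior_dep) - log(1.0 - prior_dep)
    return 1.0 / (1.0 + exp(-log_odds))


def pearson(table):
    """Pearson correlation of the category codes, for comparison."""
    cells = [(x, y, n) for x, row in enumerate(table) for y, n in enumerate(row)]
    total = sum(n for _, _, n in cells)
    mean_x = sum(x * n for x, _, n in cells) / total
    mean_y = sum(y * n for _, y, n in cells) / total
    cov = sum(n * (x - mean_x) * (y - mean_y) for x, y, n in cells)
    var_x = sum(n * (x - mean_x) ** 2 for x, _, n in cells)
    var_y = sum(n * (y - mean_y) ** 2 for _, y, n in cells)
    return cov / sqrt(var_x * var_y)


# Hypothetical U-shaped data: Y = 1 for low and high X, Y = 0 for middle X.
u_shaped = [[0, 10], [10, 0], [0, 10]]
empty = [[0, 0], [0, 0], [0, 0]]

print(posterior_dependence(u_shaped), pearson(u_shaped))
print(posterior_dependence(empty))  # no data: the posterior equals the prior
```

In this toy table the relationship is U-shaped, so the Pearson correlation of the category codes is zero while the posterior probability of dependence is close to one (benefit 6). With an empty table the posterior equals the prior (benefit 1). Reordering the categories of X leaves the result unchanged, which reflects the loss of class order noted in issue 1.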

References

Abelson, R. P. (1995). Statistics as Principled Argument. Hillsdale, NJ: Lawrence Erlbaum Associates.
Albaum, G. (1997). The Likert scale revisited: an alternate version. Journal of the Market Research Society, 39(2), 331-342.
Bayes, T. (1763). An essay towards solving a problem in the doctrine of chances. Philosophical Transactions of the Royal Society, 53, 370-418.
Berger, J. O. (1985). Statistical Decision Theory and Bayesian Analysis. New York: Springer-Verlag.
Berger, J. O., & Wolpert, R. L. (1988). The likelihood principle. Second edition. Hayward, CA: Institute of Mathematical Statistics.
Bernardo, J., & Smith, A. (2000). Bayesian Theory. New York: John Wiley & Sons.
Berry, D. (1996). Statistics - A Bayesian perspective. Pacific Grove, CA: Duxbury Press.
Bradley, W. J., & Schaefer, K. C. (1998). The Uses and Misuses of Data and Models: The mathematization of the human sciences. Thousand Oaks: Sage.
Brannen, J. (2004). Working qualitatively and quantitatively. In C. Seale, G. Gobo, J. Gubrium, & D. Silverman (Eds.), Qualitative Research Practice (pp. 312-326). London: Sage.
Browne, M. W. (1984). Asymptotically distribution-free methods for the analysis of covariance structures. British Journal of Mathematical and Statistical Psychology, 37, 1-21.
Buck, C., Cavanagh, W., & Litton, C. (1996). Bayesian Approach to Interpreting Archaeological Data. New York: John Wiley & Sons.
Caskie, G. I. L., & Willis, S. L. (2006). Research Design and Methodological Issues for Adult Development and Learning. In C. Hoare (Ed.), Handbook of Adult Development and Learning (pp. 52-70). New York: Oxford University Press.
Champoux, J. E. (1991). A multivariate test of job characteristics theory of work motivation. Journal of Organizational Behavior, 12, 431-446.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Second edition. Hillsdale, NJ: Lawrence Erlbaum Associates.
Congdon, P. (2001). Bayesian Statistical Modelling. Chichester: John Wiley & Sons.
Filzmoser, P. (2002). Robust factor analysis: methods and applications. In G. A. Marcoulides & I. Moustaki (Eds.), Latent Variable and Latent Structure Models (pp. 153-194). Mahwah, NJ: Lawrence Erlbaum Associates.
Fischer, H. (2001). Pierre-Simon Laplace. In C. C. Heyde & E. Seneta (Eds.), Statisticians of the Centuries (pp. 95-100). New York: Springer.
Fisher, R. A. (1935/1971). The design of experiments. Eighth edition. New York: Hafner.
Fisher, R. A. (1956/1973). Statistical Methods and Scientific Inference. Third edition. New York: Hafner.
Gigerenzer, G. (2000). Adaptive thinking. New York: Oxford University Press.
Gigerenzer, G., Krauss, S., & Vitouch, O. (2004). The null ritual: What you always wanted to know about significance testing but were afraid to ask. In D. Kaplan (Ed.), The SAGE handbook of quantitative methodology for the social sciences (pp. 391-408). Thousand Oaks: Sage.
Gill, J. (2002). Bayesian methods. A Social and Behavioral Sciences Approach. Boca Raton: Chapman & Hall/CRC.
Gobo, G. (2004). Sampling, representativeness and generalizability. In C. Seale, J. F. Gubrium, G. Gobo, & D. Silverman (Eds.), Qualitative Research Practice (pp. 435-456). London: Sage.
Haller, H., & Krauss, S. (2002). Misinterpretations of significance: A problem students share with their teachers? Methods of Psychological Research Online, 7(1), 1-20. Retrieved July 3, 2007, from http://www.mpr-online.de/issue16/art1/haller.pdf
Hastings, K. J. (1997). Probability and Statistics. Reading, MA: Addison-Wesley.
Heckerman, D., Geiger, D., & Chickering, D. (1995). Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning, 20(3), 197-243.
Hilario, M., Kalousis, A., Prados, J., & Binz, P.-A. (2004). Data mining for mass-spectra based diagnosis and biomarker discovery. Drug Discovery Today: BIOSILICO, 2(5), 214-222.
Hoijtink, H., & Klugkist, I. (2007). Comparison of Hypothesis Testing and Bayesian Model Selection. Quality & Quantity, 41, 73-91.
Hsu, W. H. (2004). Genetic wrappers for feature selection in decision tree induction and variable ordering in Bayesian network structure learning. Information Sciences, 163(1-3), 103-122.
Hu, L., & Bentler, P. (1995). Evaluating model fit. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues and applications (pp. 76-99). Thousand Oaks: Sage.
Huberty, C. (1994). Applied Discriminant Analysis. New York: John Wiley & Sons.
Hyland, T. (1993). Meta-competence, metaphysics and vocational expertise. Competence Assessment, 20, 22-25.
Hyvärinen, A., & Oja, E. (2000). Independent Component Analysis: Algorithms and Applications. Neural Networks, 13(4-5), 411-430.
Jackson, S. (2006). Research Methods and Statistics. A Critical Thinking Approach. Second edition. Belmont, CA: Thomson.
Johnson, B., & Christensen, L. (2004). Educational research: Quantitative, qualitative, and mixed approaches. Second edition. Boston, MA: Pearson Education Inc.
Johnson, D. H. (1995). Statistical Sirens: The Allure of Nonparametrics. Ecology, 76(6), 1998-2000.
Johnson, D. R., & Creech, J. C. (1983). Ordinal Measures in Multiple Indicator Models: A Simulation Study of Categorization Error. American Sociological Review, 48, 398-407.
Jöreskog, K. G. (2003). Structural Equation Modeling with Ordinal Variables using LISREL. Retrieved February 13, 2005, from http://www.ssicentral.com/lisrel/ordinal.htm
Lindley, D. V. (1971). Making Decisions. London: Wiley.
Lindley, D. V. (2001). Harold Jeffreys. In C. C. Heyde & E. Seneta (Eds.), Statisticians of the Centuries (pp. 402-405). New York: Springer.
Marini, M., Li, X., & Fan, P. (1996). Characterizing Latent Structure: Factor Analytic and Grade of Membership Models. Sociological Methodology, 1, 133-164.
Miettinen, M., Kurhila, J., Nokelainen, P., & Tirri, H. (2006). Supporting Open-Ended Discourse with Transparent Groupware. International Journal of Web Based Communities, 2(1), 17-30.
Murphy, K. R., & Myors, B. (1998). Statistical Power Analysis. A Simple and General Model for Traditional and Modern Hypothesis Tests. Mahwah, NJ: Lawrence Erlbaum Associates.
Muthén, B. O. (1983). Latent variable structural equation modeling with categorical data. Journal of Econometrics, 22, 48-65.
Muthén, B. O. (1993). Goodness of fit with categorical and other non-normal variables. In K. A. Bollen & J. S. Long (Eds.), Testing Structural Equation Models (pp. 205-243). Newbury Park, CA: Sage.
Muthén, B. O., & Kaplan, D. (1985). A comparison of some methodologies for the factor analysis of non-normal Likert variables. British Journal of Mathematical and Statistical Psychology, 38, 171-189.
Muthén, L. K., & Muthén, B. O. (2001). Mplus user's guide. Second edition. Los Angeles, CA: Muthén & Muthén.
Myllymäki, P., Silander, T., Tirri, H., & Uronen, P. (2001). Bayesian Data Mining on the Web with B-Course. In N. Cercone, T. Lin, & X. Wu (Eds.), Proceedings of The 2001 IEEE International Conference on Data Mining (pp. 626-629). IEEE Computer Society Press.
Myllymäki, P., Silander, T., Tirri, H., & Uronen, P. (2002). B-Course: A Web-Based Tool for Bayesian and Causal Data Analysis. International Journal on Artificial Intelligence Tools, 11(3), 369-387.
Myllymäki, P., & Tirri, H. (1998). Bayes-verkkojen mahdollisuudet [Possibilities of Bayesian Networks]. Teknologiakatsaus 58/98. Helsinki: TEKES.
Neapolitan, R. E., & Morris, S. (2004). Probabilistic Modeling Using Bayesian Networks. In D. Kaplan (Ed.), The SAGE handbook of quantitative methodology for the social sciences (pp. 371-390). Thousand Oaks, CA: Sage.
Nokelainen, P., Miettinen, M., Kurhila, J., Silander, T., & Tirri, H. (2002). Optimizing and profiling users online with Bayesian probabilistic modeling. In Proceedings of the International Networked Learning Conference of Natural and Artificial Intelligence Systems Organization, Berlin: ICSC-NAISO Academic Press.
Nokelainen, P., Ruohotie, P., & Tirri, H. (1999). Professional Growth Determinants: Comparing Bayesian and Linear Approaches to Classification. In P. Ruohotie, H. Tirri, P. Nokelainen, & T. Silander (Eds.), Modern Modeling of Professional Growth, vol. 1 (pp. 85-120). Hämeenlinna: RCVE.
Nokelainen, P., Silander, T., Ruohotie, P., & Tirri, H. (2003, August). Investigating Non-linearities with Bayesian Networks. Paper presented at the meeting of the American Psychological Association, Toronto, Canada.
Nokelainen, P., Silander, T., Ruohotie, P., & Tirri, H. (2007). Investigating the Number of Non-linear and Multi-modal Relationships Between Observed Variables Measuring Growth-oriented Atmosphere. Quality & Quantity, 41(6), 869-890.
Nokelainen, P., & Tirri, H. (2004). Bayesian Methods that Optimize Cross-cultural Data Analysis. In J. R. Campbell, K. Tirri, P. Ruohotie, & H. Walberg (Eds.), Cross-cultural Research: Basic Issues, Dilemmas, and Strategies (pp. 141-158). Hämeenlinna: RCVE.
Nokelainen, P., Tirri, K., Campbell, J. R., & Walberg, H. (2004). Cross-cultural Factors that Account for Adult Productivity. In J. R. Campbell, K. Tirri, P. Ruohotie, & H. Walberg (Eds.), Cross-cultural Research: Basic Issues, Dilemmas, and Strategies (pp. 119-139). Hämeenlinna: RCVE.
Nokelainen, P., Tirri, K., & Merenti-Välimäki, H.-L. (2007). The Influence of Self-attributions and Parental Attitude to the Development of Mathematical Talent. Gifted Child Quarterly, 51(1), 64-81.
Pearl, J. (2000b). Causality. Models, Reasoning, and Inference. Cambridge: Cambridge University Press.
Silander, T., & Tirri, H. (1999). Bayesian Classification. In P. Ruohotie, H. Tirri, P. Nokelainen, & T. Silander (Eds.), Modern Modeling of Professional Growth, vol. 1 (pp. 61-84). Hämeenlinna: RCVE.
Silander, T., & Tirri, H. (In press). B-Course: Issues in designing a Web Service for Bayesian Data Analysis. Manuscript submitted for publication.
Tirri, H. (1997). Plausible Prediction by Bayesian Inference. Department of Computer Science. Series of Publications A. Report A-1997-1. University of Helsinki.
Tirri, H. (1999). What the heritage of Thomas Bayes has to offer for modern educational research? In P. Ruohotie, H. Tirri, P. Nokelainen, & T. Silander (Eds.), Modern Modeling of Professional Growth, vol. 1 (pp. 37-59). Hämeenlinna: RCVE.
de Vaus, D. A. (2004). Research Design in Social Research. Third edition. London: Sage.
de Vellis, R. F. (2003). Scale Development. Theory and Applications. Second edition. Thousand Oaks, CA: Sage.
Zumbo, B. D., & Rupp, A. A. (2004). Responsible modeling of measurement data for appropriate inferences: Important advances in reliability and validity theory. In D. Kaplan (Ed.), Handbook of quantitative methodology for the social sciences (pp. 73-92). Newbury Park, CA: Sage Press.

Author Information

Petri Nokelainen

University of Tampere

Department of Education

Tuulos
