Five flavors of Bayesian data analysis for K-12 learners
An NSF-funded project now in its second year, CREDIBLE aims to design for and support teachers and learners analyzing data in transformative ways. Part of this approach is an emphasis on Bayesian reasoning and Bayesian data analysis: start with what you know (a prior) and update what you know probabilistically as new data arrive. In that way, it is conceptually simple, far more so than traditional statistics built around null hypothesis significance testing (NHST).
As a detour: NHST uses characteristics of a sample of data to estimate the probability of observing data at least as extreme as that sample, assuming infinite repeated samples from a population in which the null hypothesis is true (typically, that there is no relationship or difference with respect to some parameter, like a correlation between two variables or a mean-level difference between two groups). That is, a p-value < .05 tells us that, if the null hypothesis were true, a sample like this one, or one even more different from the null, would occur less than 5% of the time. Yeah, it's a bit confusing!
In contrast, a Bayesian approach starts with a prior belief about a hypothesis (the prior) and then updates that belief given the data (the likelihood), resulting in a posterior distribution that represents the probability of the hypothesis given the data. One benefit is that Bayesian hypotheses are not limited to null hypotheses; instead, they can reflect our specific ideas about what we think is happening (e.g., a particular relationship between x and y or mean-level differences at two different sites). Another advantage is that the probabilities in a Bayesian framework directly represent how likely the hypothesis is, given the data—in other words, how strongly the data support it. Lastly, the Bayesian approach naturally incorporates any prior ideas or knowledge we bring to the analysis, whether from past investigations, textbooks, or personal experience.
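To make the prior-to-posterior update concrete, here is a minimal sketch of Bayes' theorem for a single hypothesis. This is my own illustration with made-up numbers, not anything from the project:

```python
def posterior(prior, likelihood_h, likelihood_not_h):
    """P(H | D) = P(D | H) P(H) / [P(D | H) P(H) + P(D | not-H) P(not-H)]."""
    numerator = likelihood_h * prior
    return numerator / (numerator + likelihood_not_h * (1 - prior))

# Start 50% confident in a hypothesis H; if the data are four times as
# likely under H (0.8) as under not-H (0.2), confidence rises to 0.8.
print(round(posterior(0.5, 0.8, 0.2), 2))  # 0.8
```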
A problem is that doing Bayesian data analysis is not so easy. Probably the best starting point for college (and graduate school) students is JASP, but that's not a very accessible or appropriate choice for most middle and high schoolers. So, I've been working with project team members and some great collaborators with similar interests to figure out how to bring the core parts of Bayesian data analysis into K-12 classrooms, especially middle and high school, in accessible ways. Our original proposal for CREDIBLE mentioned two such approaches, but I now think there are quite a few more options. Specifically, maybe there are five “flavors” of Bayesian data analysis for K-12 learners.
A conceptual approach — no screens or technology needed. We created an approach that brings together the core parts of Bayesian data analysis; it's really more like Bayesian “reasoning.” The three parts are (A) account for what you already know, (B) be open to new evidence, and (C) consider your confidence afterward. So we called it the ABCs of Data!
Confidence updating — using Bayes Factors. This is the approach my colleagues Marcus Kubsch, Stefan Sorge, and others have been exploring; it builds on an approach a physics educator developed around 5-10 years ago. It is basically the mathematical version of the first approach, with the same three steps. It uses a bit of a shortcut: instead of carrying out a full Bayesian data analysis, we ask learners to decide for themselves how informative the data are with respect to their hypothesis, expressed as a Bayes Factor. We created an app for this approach called the Confidence Updater. It's simple, but it can help students use mathematics to express their ideas. We also wrote a paper that details this approach.
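The arithmetic behind this kind of updating fits in a few lines. This is my reading of the odds form of Bayes' theorem (posterior odds = prior odds × Bayes Factor), not the Confidence Updater app's actual code:

```python
def update_confidence(prior_prob, bayes_factor):
    """Turn a prior probability into a posterior probability using a
    learner-chosen Bayes Factor: posterior odds = prior odds * BF."""
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1 + posterior_odds)

# A learner who is 50% confident and judges the data moderately
# supportive of their hypothesis (BF = 3) ends up 75% confident.
print(update_confidence(0.5, 3))  # 0.75
```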
Confidence updating — for a particular parameter. Here's kind of a new one, which came out of discussions with a teacher yesterday. What if the Confidence Updater were less about a belief or hypothesis that can be general (I think x and y are related) and more about a particular relationship? Learners would need to be a bit more specific about their hypothesis: I think x and y have a correlation of r = .65, for example, or I think a one-unit increase in x is associated with a .5 increase in y, or I think there is a mean-level difference of 2.0 between sites A and B. Then students visualize the data (perhaps scatter plots for the first two, and histograms or box plots for the third) and run specific analyses: calculating a correlation coefficient, estimating a linear regression model, or conducting a t-test (for those three examples). Students then use the output of those analyses (the parameter estimates and, heck, maybe even the results of their respective NHSTs) and update their confidence using the same shortcut we built into the Confidence Updater. The difference here is that the hypothesis has to be much more specific; the benefit is that the conclusion can be more incisive.
Bayesian data analysis — with conjugate priors. I learned about this approach from my colleagues Mine Dogucu and Sibel Kazak. We wrote about it here, in an article that came out in an issue just this week. The gist is that we use a shortcut of a very different kind than in the Confidence Updater: here, we estimate the probability of a hypothesis given a prior and the data, and learners do not decide how informative the data are; that is calculated through the use of a model. The shortcut is to use a conjugate prior: Normal–Normal (for a continuous, or “normal”, outcome); Gamma–Poisson (for a count, or “Poisson”, outcome); and Beta–Binomial (for a dichotomous, or “binomial”, outcome). The paper with Mine and Sibel used the Beta–Binomial, since the teachers we designed this with were analyzing data with a yes-no (i.e., dichotomous) outcome. We haven't tried this for the other outcomes. This approach has a lot of potential; one possible limitation, as I understand it, is that these models are for single outcome variables, and I don't know how we could include variables that relate to those outcomes (like if we were interested in how variable a relates to variable b, or whether there is a difference in y depending on whether an observation is from site A or site B).
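For the Beta–Binomial case, the conjugate shortcut reduces to simple addition. This is a generic sketch of that update with made-up numbers, not the materials from the paper:

```python
def beta_binomial_update(a, b, successes, failures):
    """Beta(a, b) prior + binomial data -> Beta(a + successes, b + failures)."""
    return a + successes, b + failures

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution, a natural point estimate."""
    return a / (a + b)

# Roughly uninformed prior Beta(1, 1); observe 7 "yes" and 3 "no" responses.
a, b = beta_binomial_update(1, 1, 7, 3)
print(a, b, round(beta_mean(a, b), 2))  # 8 4 0.67
```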
Bayesian data analysis — using numerical approximation. Lastly, this is what I think of as “full-blown” Bayesian data analysis, probably using a tool like JASP or R. This is absolutely possible for advanced high school students, but it would probably be a challenge; these tools are mostly used by undergraduate and graduate students in statistics classes, and by researchers and professionals looking to conduct rigorous Bayesian data analyses. They can work for a wide range of analyses, and the software is designed to support good decision-making about the output. But, again, these tools may be pretty challenging for many of us to use!
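Tools like JASP and R typically rely on simulation methods (e.g., Markov chain Monte Carlo) under the hood. As a taste of what “numerical approximation” means, here is a simple grid approximation for a proportion; this is my own toy illustration, not what any of these tools actually do:

```python
from math import comb

def grid_posterior(successes, n, grid_size=101):
    """Posterior over a proportion p: uniform prior, binomial likelihood,
    scored on a grid of candidate values and then normalized."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    unnormalized = [comb(n, successes) * p**successes * (1 - p)**(n - successes)
                    for p in grid]
    total = sum(unnormalized)
    return grid, [u / total for u in unnormalized]

# 7 "yes" responses out of 10: the posterior mass centers near 0.7.
grid, post = grid_posterior(7, 10)
mean = sum(p * w for p, w in zip(grid, post))  # about 0.67
```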