My understanding of the Bayesian vs. frequentist debate is that frequentist statistics:
- is (or claims to be) objective
- or at least unbiased
- so different researchers, using different assumptions, can still get quantitatively comparable results
while Bayesian statistics:
- claims to make “better” predictions (i.e., lower expected loss), because it can use prior knowledge (among other reasons)
- needs fewer ad hoc choices, replacing them with prior/model choices that (at least in principle) have a real-world interpretation.
Given that, I would have expected Bayesian statistics to be very popular in statistical process control (SPC): if I were a factory owner trying to control my process quality, I would primarily care about expected loss. If I could reduce that because I have more or better prior knowledge than my competitors, even better.
But practically everything I have read about SPC seems to be firmly frequentist (i.e., no prior distributions, point estimates of all parameters, many ad hoc choices about sample size, p-values, etc.).
Why is that? I can see why frequentist statistics were a better choice in the 1960s, when SPC was done using pen and paper. But why hasn’t anyone tried different methods since then?
WARNING: I wrote this answer a long time ago with very little idea what I was talking about. I can’t delete it because it’s been accepted, but I can’t stand behind most of the content.
This is a very long answer and I hope it’ll be helpful in some way. SPC isn’t my area, but I think these comments are general enough that they apply here.
I’d argue that the most often cited advantage, the ability to incorporate prior beliefs, is a weak advantage in applied/empirical fields. That’s because you need to quantify your prior. Even if I can say “well, level z is definitely implausible,” I can’t for the life of me tell you what should happen below z. Unless authors start publishing their raw data in droves, my best guesses for priors are conditional moments taken from previous work that may or may not have been fitted under conditions similar to the ones I’m facing.
Basically, Bayesian techniques (at least on a conceptual level) are excellent when you have a strong assumption/idea/model and want to take it to data, then see how wrong or not wrong you turn out to be. But often you are not looking to see whether you’re right about one particular model for your business process; more likely you have no model, and are looking to see what your process is going to do. You do not want to push your conclusions around; you want your data to push your conclusions. If you have enough data, that’s what will happen anyway, but in that case why bother with the prior? Perhaps that’s overly skeptical and risk-averse, but I’ve never heard of an optimistic businessman who was also successful. There is no way to quantify your uncertainty about your own beliefs, and you would rather not run the risk of being overconfident in the wrong thing. So you set an uninformative prior and the advantage disappears.
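To make the “enough data swamps the prior” point concrete, here is a minimal sketch using a conjugate Beta-Binomial model for a defect rate. The model, the two priors, and the 5% observed defect rate are all assumptions picked for illustration, not anything taken from an actual SPC setup.

```python
def posterior_mean(prior_a, prior_b, defects, n):
    """Posterior mean of a defect rate under a Beta(prior_a, prior_b) prior
    and a binomial likelihood (standard conjugate update)."""
    return (prior_a + defects) / (prior_a + prior_b + n)


# Two hypothetical priors: a flat one, and one that strongly expects ~1% defects.
PRIORS = {
    "flat Beta(1, 1)": (1.0, 1.0),
    "opinionated Beta(2, 198)": (2.0, 198.0),
}


def prior_gap(n, rate=0.05):
    """Absolute difference between the two posterior means after n inspected
    units, assuming the observed defect rate is `rate` (hypothetical)."""
    defects = round(rate * n)
    means = [posterior_mean(a, b, defects, n) for a, b in PRIORS.values()]
    return abs(means[0] - means[1])


for n in (20, 200, 2000, 20000):
    print(f"n={n:>6}  gap between posterior means = {prior_gap(n):.5f}")
```

The gap between the two posterior means shrinks steadily as n grows, so with a long production history the choice of prior matters very little; with 20 units it matters a great deal.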
This is interesting in the SPC case because unlike in, say, digital marketing, your business processes aren’t forever in an unpredictable state of flux. My impression is that business processes tend to change deliberately and incrementally. That is, you have a long time to build up good, safe priors. But recall that priors are all about propagating uncertainty. Subjectivity aside, Bayesianism has the advantage that it objectively propagates uncertainty across deeply-nested data generating processes. That, to me, is really what Bayesian statistics is good for. And if you’re looking for reliability of your process well beyond the 1-in-20 “significance” cutoff, it seems like you would want to account for as much uncertainty as possible.
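As a rough illustration of what propagating uncertainty can mean for control limits, the sketch below compares the nominal false-alarm rate of plug-in 3-sigma limits with the rate implied by a posterior predictive distribution. It assumes a normal model with the standard noninformative prior p(mu, sigma^2) ∝ 1/sigma^2 and a made-up sample of ten measurements; neither comes from the discussion above, and the function name is hypothetical.

```python
import random
import statistics


def predictive_exceedance(sample, k=3.0, draws=200_000, seed=0):
    """Monte Carlo estimate of the posterior predictive probability that a new
    observation falls outside the plug-in limits xbar +/- k*s.

    Assumes a normal model with prior p(mu, sigma^2) proportional to 1/sigma^2.
    """
    rng = random.Random(seed)
    n = len(sample)
    xbar = statistics.fmean(sample)
    s2 = statistics.variance(sample)  # sample variance, n - 1 denominator
    lo, hi = xbar - k * s2 ** 0.5, xbar + k * s2 ** 0.5
    outside = 0
    for _ in range(draws):
        # sigma^2 | data ~ (n - 1) * s^2 / chi-square(n - 1);
        # chi-square(v) is Gamma(shape=v/2, scale=2).
        sigma2 = (n - 1) * s2 / rng.gammavariate((n - 1) / 2, 2.0)
        # mu | sigma^2, data ~ Normal(xbar, sigma^2 / n)
        mu = rng.gauss(xbar, (sigma2 / n) ** 0.5)
        # New observation given the drawn parameters
        x_new = rng.gauss(mu, sigma2 ** 0.5)
        if not lo <= x_new <= hi:
            outside += 1
    return outside / draws


# Hypothetical measurements from an in-control process.
sample = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 9.7, 10.1, 10.0, 9.9]
NOMINAL = 0.0027  # two-sided 3-sigma tail area when mu and sigma are known
rate = predictive_exceedance(sample)
print(f"nominal: {NOMINAL:.4f}  posterior predictive: {rate:.4f}")
```

With only ten points, the predictive rate comes out several times larger than the nominal 0.27%, because the limits themselves are uncertain. That difference fades as the sample grows, which is the same “enough data” caveat as before.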
So where are the Bayesian models? First, they’re hard to implement. To put it bluntly, I can teach OLS to a mechanical engineer in 15 minutes and have him cranking out regressions and t-tests in Matlab in another 5. To use Bayes, I first need to decide what kind of model I’m fitting, and then see if there’s a ready-made library for it in a language someone at my company knows. If not, I have to use BUGS or Stan. And then I have to run simulations to get even a basic answer, and that takes about 15 minutes on an 8-core i7 machine. So much for rapid prototyping. Second, by the time you get an answer, you’ve spent two hours coding and waiting, only to get the same result you could have gotten with frequentist random effects and clustered standard errors. Maybe this is all presumptuous and wrongheaded and I don’t understand SPC at all. But I see it in academia and in for-profit social science constantly, and I’d be surprised if things were different in other fields.
I liken Bayesianism to a very high-quality chef’s knife, a stockpot, and a sauté pan; frequentism is like a kitchen full of As-Seen-On-TV tools like banana slicers and pasta pots with holes in the lid for easy draining. If you’re a practiced cook with lots of experience in the kitchen (indeed, in your own kitchen of substantive knowledge, which is clean and organized and where you know where everything is located), you can do amazing things with your small selection of elegant, high-quality tools. Or you can use a bunch of different little ad-hoc* tools that require zero skill to use, to make a meal that’s simple, really not half bad, and has a couple of basic flavors that get the point across. You just got home from the data mines and you’re hungry for results; which cook are you?
*Bayes is just as ad-hoc, but less transparently so. How much wine goes in your coq au vin? No idea, you eyeball it because you’re a pro. Or, you can’t tell the difference between a Pinot Grigio and a Pinot Noir but the first recipe on Epicurious said to use 2 cups of the red one so that’s what you’re going to do. Which one is more “ad-hoc?”