I was sent a link to “The Bayesian Kitchen” (http://www.bayesiancook.blogspot.fr/2014/02/blending-p-values-and-posterior.html), and while I cannot tell for sure from the one post, I’m afraid the kitchen might be open for cookbook statistics. It is suggested (in this post) that real science is all about “science wise” error rates (as opposed to such rates capturing some early exploratory efforts to weed out associations possibly worth following up on, as in genomics). Here were my comments:

False discovery rates are frequentist, but they have very little to do with how well warranted a given hypothesis or model is by data. Imagine the particle physicists trying to estimate the relative frequency with which discoveries in science are false, and using that to evaluate the evidence they had for a Standard Model Higgs on July 4, 2012. What number would they use? What reference class? And why would such a relative frequency be the slightest bit relevant to evaluating the evidential warrant for the Higgs particle, to estimating its various properties, or to the further testing that is now ongoing? Instead physicists use sigma levels (and associated p-values)! They show that the probability is .9999999… that they would have discerned the fact that background alone was responsible for generating the pattern of bumps they repeatedly found (in two labs), were that the case. This is an error probability. It was the basis for inferring that the SM Higgs hypothesis had passed with high severity, and they then moved on to determining what magnitudes had passed with severity. That’s what science is about! Not cookbooks, not mindless screening (which might be fine for early explorations of gene associations, but don’t use that as your model for science in general).
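For readers who want to see how a sigma level translates into a p-value, here is a minimal sketch. It rests on assumptions that are not in the comment above: a Gaussian test statistic, a one-sided tail, and the conventional 5-sigma discovery threshold. The function name `one_sided_p_value` is made up for illustration.

```python
import math


def one_sided_p_value(sigma: float) -> float:
    """Upper-tail area of a standard normal beyond `sigma` standard deviations.

    Assumption: the test statistic is (approximately) Gaussian under the
    background-only hypothesis, and only the upper tail counts as evidence.
    """
    return 0.5 * math.erfc(sigma / math.sqrt(2.0))


# Assumption: 5 sigma, the conventional discovery threshold in particle physics.
p = one_sided_p_value(5.0)

print(f"p-value at 5 sigma: {p:.3e}")
print(f"probability of NOT seeing so large a bump from background alone: {1.0 - p!r}")
```

Under these assumptions the tail area comes out at roughly 3 in 10 million. The exact digits of its complement depend on the sigma level and on the one-sided convention, neither of which is fixed by the comment.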

The newly popular attempt to apply false discovery rates to “science wise error rates” is a hybrid fashion that (inadvertently) institutionalizes cookbook statistics: dichotomous “up-down” tests, the highly artificial point against point hypotheses (a null and some alternative of interest, never mind everything else), running together statistical and substantive hypotheses, and the supposition that alpha and power can be used as a quasi-Bayesian likelihood ratio. And finally, to top it all off, by plucking from thin air the assignments of “priors” to the null and alternative (on the order of .9 and .1), this hybrid animal reports that more than 50% of results in science are false! I talk about this more on my blog, errorstatistics.com.
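The arithmetic behind the hybrid computation can be sketched in a few lines. This is only an illustration of the screening formula being criticized, not an endorsement of it. The “priors” of .9 and .1 come from the paragraph above; the alpha and power values, and the function name `false_discovery_proportion`, are assumptions chosen for the example.

```python
def false_discovery_proportion(prior_null: float, alpha: float, power: float) -> float:
    """Share of 'rejections' that are false under the dichotomous screening model.

    Treats alpha and power as if they were likelihoods and weights them by
    the assumed 'priors' on the null and the alternative.
    """
    false_rejections = prior_null * alpha          # nulls "true" yet rejected
    true_rejections = (1.0 - prior_null) * power   # alternatives "true" and detected
    return false_rejections / (false_rejections + true_rejections)


# Priors of .9 / .1 as in the text; alpha = 0.05 is an assumed convention.
for assumed_power in (0.8, 0.45, 0.2):
    share = false_discovery_proportion(prior_null=0.9, alpha=0.05, power=assumed_power)
    print(f"power = {assumed_power:.2f} -> share of false rejections = {share:.2f}")
```

With these inputs the reported share of false results crosses 50% only when the assumed power drops below 0.45, so the headline number is driven entirely by the plucked priors and the assumed power.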

(for just one example:


Corey

“I see some glints of successful communication, but I don’t get the iterated probabilities in your claim, maybe it was unintended?”

Out of curiosity, what price would you pay for a contract that pays out $100 if it was in fact unintended? Vg jnf havagraqrq; vg’f n pbcl-a-cnfgr reebe.

“But that would not be a criticism. Maybe you mean Pr(H fails T; H is flawed) is near 0? (or “whether or not H holds”?)”

On its own it’s not a criticism. I only need to state that one of the relevant probabilities is near zero; that plus the “likelihood ratio = 1 in some set of flaws F” statement then imply that Pr(H fails T; H has a flaw in F) is near 0. I just picked the probability at the start of the chain of equalities.

“Hopefully you/readers can see* the advantage of an account that goes simply and directly to a critique of an inference (that a flaw is absent) relative to a test T.”

So far, we’ve treated dichotomizing the outcome as a fait accompli, thereby avoiding the sticky issue of tail areas. As a result, we are in apparent agreement about what to communicate, if not how to communicate it. Two things: first, in the dichotomous test scenario, “likelihood ratio = 1” covers both what you’ve called “poor tests” and “not a test at all”. Second, I dispute that tail areas are either direct or, for the uninitiated, simple.

As I’ve mentioned previously, I’m quite impressed with severity as a lens for making sense of frequentist practice. This is because if dichotomizing the outcome of an experiment is treated as a fait accompli, then severity reasoning plus the Suppean hierarchy of models roughly replicates Jaynes’s notion of probability theory as the logic of science. (Yes, really!) But I continue to execrate the practice of computing tail areas of sampling distributions…

Kar: How does Sev plus Suppes replicate Jaynes again?
