Monthly Archives: February 2014

Msc Kvetch: Is “The Bayesian Kitchen” open for cookbook statistics?

I was sent a link to “The Bayesian Kitchen” and while I cannot tell for sure from the one post, I’m afraid the kitchen might be open for cookbook statistics. It is suggested (in this post) that real science is all about “science wise” error rates (as opposed to it capturing some early exploratory efforts to weed out associations possibly worth following up on, as in genomics). Here were my comments:

False discovery rates are frequentist but they have very little to do with how well warranted a given hypothesis or model is with data. Imagine the particle physicists trying to estimate the relative frequency with which discoveries in science are false, and using that to evaluate the evidence they had for a Standard Model Higgs on July 4, 2012. What number would they use? What reference class? And why would such a relative frequency be the slightest bit relevant to evaluating the evidential warrant for the Higgs particle, or to estimating its various properties, or to the further testing that is now ongoing? Instead physicists use sigma levels (and associated p-values)! They show that the probability is .9999999… that they would have discerned the fact that background alone was responsible for generating the pattern of bumps they repeatedly found (in two labs). This is an error probability. It was the basis for inferring that the SM Higgs hypothesis had passed with high severity, and they then moved on to determining what magnitudes had passed with severity. That’s what science is about! Not cookbooks, not mindless screening (which might be fine for early explorations of gene associations, but don’t use that as your model for science in general).

The newly popular attempt to apply false discovery rates to “science wise error rates” is a hybrid fashion that (inadvertently) institutionalizes cookbook statistics: dichotomous “up-down” tests, the highly artificial point against point hypotheses (a null and some alternative of interest—never mind everything else), identifying statistical and substantive hypotheses, and the supposition that alpha and power can be used as a quasi-Bayesian likelihood ratio. And finally, to top it all off, by plucking from thin air the assignments of “priors” to the null and alternative—on the order of .9 and .1—this hybrid animal reports that more than 50% of results in science are false! I talk about this more on my blog

(for just one example:

Categories: Misc Kvetching, Uncategorized | 4 Comments

Msc Kvetch: comment to Kristof at 5a.m.

My comment follows his article

Bridging the Moat Around Universities


My Sunday column is about the unfortunate way America has marginalized university professors–and, perhaps sadder still, the way they have marginalized themselves from public debate. When I was a kid, the Kennedy administration had its “brain trust” of Harvard faculty members, and university professors were often vital public intellectuals who served off and on in government. That’s still true to some degree of economists, but not of most other Ph.D programs. And we’re all the losers for that.

I’ve noticed this particularly with social media. Some professors are terrific on Twitter, but they’re the exceptions. Most have terrific insights that they then proceed to bury in obscure journals or turgid books. And when professors do lead the way in trying to engage the public, their colleagues sometimes regard them with suspicion. Academia has also become inflexible about credentials, disdaining real-world experience. So McGeorge Bundy became professor of government at Harvard and then dean of the faculty (at age 34!) despite having only a B.A.–something that would be impossible today. Indeed, some professors would oppose Bill Clinton getting a tenured professorship in government today because of his lack of a Ph.D, even though he arguably understands government today better than any other American.

In criticizing the drift toward unintelligible academic writing, my column notes that some professors have submitted meaningless articles to academic journals, as experiments, only to see them published. If I’d had more space, I would have gone through the example of Alan Sokal of NYU, who in 1996 published an article in “Social Text” that he described as: “a pastiche of left-wing cant, fawning references, grandiose quotations, and outright nonsense.” Not only was it published, but after the article was unveiled as gibberish, Social Text’s editors said it didn’t much matter: “Its status as parody does not alter, substantially, our interest in the piece, itself, as a symptomatic document.”

I hope people don’t think my column is a denunciation of academia. On the contrary, I think universities are an incredible national resource, with really smart thinking on vital national issues. I want the world to get the benefit of that thinking, not see it hidden in academic cloisters. Your thoughts on this issue?


Deborah Mayo Virginia 12 hours ago

In my own field of philosophy, the truth is that the serious work, the work that advances the ideas and research, takes place in “obscure journals or turgid books”. There are plenty of areas where this research can be directly relevant to public issues–it’s the public who should be a bit more prepared to engage with the real scholarship. Take my specialization of philosophy of statistical inference in science. Science writers appear to be only interested in repeating the popular, sexy, alarmist themes (e.g., most research is wrong, statistical significance is bogus, science fails to self-correct). Rather than research what some more careful thinkers have shown, or engage the arguments behind contrasting statistical philosophies–those semi-turgid books–, these science writers call around to obtain superficial dramatic quips from the same cast of characters. They have a one-two recipe for producing apparently radical and popular articles this way. None of the issues ever get clarified this way. I suggest the public move closer to the professional work rather than the other way around. Popular is generally pablum, at least in the U.S.

Categories: Misc Kvetching | Leave a comment

Notes (from Feb 6*) on the irrelevant conjunction problem for Bayesian epistemologists (i)



* refers to our seminar: Phil6334

I’m putting these notes under “rejected posts” awaiting feedback and corrections.

Contemporary Bayesian epistemology in philosophy appeals to formal probability to capture informal notions of “confirmation”, “support”, “evidence”, and the like, but it seems to create problems for itself by not being scrupulous about identifying the probability space, set of events, etc. and not distinguishing between events and statistical hypotheses. There is usually a presumed reference to games of chance, but even there things can be altered greatly depending on the partition of events. Still, we try to keep to that. The goal is just to give a sense of that research program. (Previous posts on the tacking paradox: Oct. 25, 2013: “Bayesian Confirmation Philosophy and the Tacking Paradox (iv)*” & Oct 25.)

(0) Simple Bayes Boost R: 

H is “confirmed” or supported by x if P(H|x) > P(H) (equivalently, P(x|H) > P(x)).

H is disconfirmed (or undermined) by x if P(H|x) < P(H), (else x is confirmationally irrelevant to H).

Mayo: The error statistician would already get off at (0): probabilistic affirming the consequent is maximally unreliable, violating the minimal requirement for evidence. That could be altered with context-dependent information about how the data and hypotheses are arrived at, but this is not made explicit.

(a) Paradox of irrelevant conjunctions (‘tacking paradox’)

If x confirms H, then x also confirms (H & J), even if hypothesis J is just “tacked on” to H.[1]

Hawthorne and Fitelson (2004) define:

J is an irrelevant conjunct to H, with respect to evidence x just in case

P(x|H) = P(x|H & J).

(b) Example from earlier: For instance, x might be radioastronomic data in support of:

H: “the GTR deflection of light effect is 1.75” and

J: “the radioactivity of the Fukushima water dumped in the Pacific ocean is within acceptable levels”.

(1) Bayesian (Confirmation) Conjunction: If x Bayesian confirms H, then x Bayesian-confirms:

(H & J), where P(x|H & J) = P(x|H) for any J consistent with H

(where J is an irrelevant conjunct to H, with respect to evidence x).

If you accept R, (1) goes through.

Mayo: We got off at (0) already. Frankly I don’t know why Bayesian epistemologists would allow adding an arbitrary statement or hypothesis not amongst those used in setting out priors. Maybe it’s assumed J is in there somehow (in the background K), but it seems open-ended, and they have not objected.

But let’s just talk about well-defined events in a probability experiment; and limit ourselves to talking about an event providing evidence of another event (e.g., making it more or less expected) in some sense. In one of Fitelson’s examples, P(black|ace of spades) > P(black), so “black” confirms it’s an ace of spades (presumably in random drawings of card color from an ordinary deck)–despite “ace” being an “irrelevant conjunct” of sorts. Even so, if someone says data x (it’s a stock trader) is evidence it’s an inside trader in a hedge firm, I think it would be assumed that something had been done to probe the added conjuncts.

(2) Using simple B-boost R: (H & J) gets just as much of a boost by x as does H—measuring confirmation as a simple B-boost: R.

CR(H, x) = CR((H & J), x) for irrelevant conjunct J.

R: P(H|x)/P(H) (or equivalently, P(x |H)/P(x))
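A short derivation of why the two boosts coincide under R may help here. It is only a sketch, assuming P(x) > 0 and P(H & J) > 0, and it writes the measure CR as C_R:

```latex
\begin{align*}
C_R(H \,\&\, J,\, x)
  &= \frac{P(H \,\&\, J \mid x)}{P(H \,\&\, J)}
   = \frac{P(x \mid H \,\&\, J)}{P(x)}
   && \text{(Bayes' theorem)} \\
  &= \frac{P(x \mid H)}{P(x)}
   && \text{($J$ is an irrelevant conjunct)} \\
  &= \frac{P(H \mid x)}{P(H)}
   = C_R(H,\, x).
\end{align*}
```

Nothing about J enters except the irrelevance condition P(x|H & J) = P(x|H), which is why any J consistent with H rides along with the full boost.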

(a) They accept (1) but (2) is found counterintuitive (by many or most Bayesian epistemologists). But if you’ve defined confirmation as a B-boost, why run away from the implications? (A point Maher makes.) It seems they implicitly slide into thinking of what many of us want:

some kind of an assessment of how warranted or well-tested H is (with x and background).

(not merely a ratio which, even if we can get it, won’t mean anything in particular. It might be any old thing, 2, 22, even with H scarcely making x expected.).

(b) The intuitive objection according to Crupi and Tentori (2010) is this (e.g., p. 3): “In order to claim the same amount of positive support from x to a more committal theory “H and J” as from x to H alone, …adding J should contribute by raising further how strongly x is expected assuming H by itself. Otherwise, what would be the specific relevance of J?” (using my letters, emphasis added)

But the point is that it’s given that J is irrelevant. Now if one reports all the relevant information for the inference, one might report something like (H & J) makes x just as expected as H alone. Why not object to the “too easy” confirmation (H & J) is getting when nothing has been done to probe J? I think the objection is, or should be, that nothing has been done to show J is the case rather than not: P(x |(H & J)) = P(x |(H & ~J)).

(c) Switch from R to LR: What Hawthorne and Fitelson (2004) do is employ, as a measure of the B-boost, what some call the likelihood ratio (LR).

CLR(H, x) = P(x | H)/P(x | ~H).

(3) Let x confirm H, then

(*) CLR(H, x) > CLR((H & J), x)

For J an irrelevant conjunct to H.

So even though x confirms (H & J) it doesn’t get as much as H does, at least if one uses LR. (It does get as much using R).

They see (*) as solving the irrelevant conjunct problem.
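To see (*) and the contrast with R in numbers, here is a minimal Python sketch on an ordinary 52-card deck. The choice of H (spade), J (ace) and x (black) is my illustrative assumption in the spirit of the card examples, and the helper names (prob, boost_R, boost_LR) are hypothetical, not anyone's published code.

```python
from fractions import Fraction

RANKS = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
SUITS = ["spades", "clubs", "hearts", "diamonds"]
DECK = [(rank, suit) for rank in RANKS for suit in SUITS]

def prob(event, given=lambda card: True):
    """Exact P(event | given) for one random draw from the deck."""
    pool = [card for card in DECK if given(card)]
    return Fraction(sum(event(card) for card in pool), len(pool))

def x_black(card):          # evidence x: the card is black
    return card[1] in ("spades", "clubs")

def H_spade(card):          # hypothesis H: the card is a spade
    return card[1] == "spades"

def HJ_ace_of_spades(card): # H & J, with J: the card is an ace
    return H_spade(card) and card[0] == "A"

def boost_R(h):
    """Ratio measure R: P(h|x) / P(h)."""
    return prob(h, x_black) / prob(h)

def boost_LR(h):
    """Likelihood-ratio measure: P(x|h) / P(x|~h)."""
    return prob(x_black, h) / prob(x_black, lambda card: not h(card))

r_H, r_HJ = boost_R(H_spade), boost_R(HJ_ace_of_spades)
lr_H, lr_HJ = boost_LR(H_spade), boost_LR(HJ_ace_of_spades)

print("R :", r_H, r_HJ)     # equal boosts under R
print("LR:", lr_H, lr_HJ)   # the conjunction gets less under LR
```

Under these assumptions R gives 2 to both H and (H & J), while LR gives 3 to H and 51/25 to (H & J), so the conjunction is still confirmed but by less.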

(4) Now let x disconfirm Q, and x confirm ~Q, then

(*) CLR(~Q, x) > CLR((~Q & J), x)

For J an irrelevant conjunct to Q: P(x|Q) = P(x|J & Q).

Crupi and Tentori (2010) notice an untoward consequence of using LR confirmation in the case of disconfirmation (substituting their Q for H above): If x disconfirms Q, then (Q & J) isn’t as badly disconfirmed as Q is, for J an irrelevant conjunct to Q. But this just follows from (*), doesn’t it? That is, from (*), we’d get (**) (possibly an equality goes somewhere).

(**) CLR(Q, x) < CLR((Q & J), x).

This says that if x disconfirms Q, (Q & J) isn’t as badly disconfirmed as Q is. This they find counterintuitive.

But if (**) is counterintuitive, then so is (*).

(5) Why (**) makes sense if you wish to use LR:

The numerators in the LR calculations are the same:

P(x|Q & J) = P(x|Q) and P(x|H & J) = P(x|H) since in both cases J is an irrelevant conjunct.

But P(x|~(Q & J)) < P(x|~Q)

Since x disconfirms Q, x is more probable given ~Q than it is given (~Q v ~J). This explains why

(**) CLR(Q, x) < CLR((Q & J), x)

So if (**) is counterintuitive then so is (*).
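The denominator inequality can be spelled out as a mixture argument. This is a sketch under assumptions not stated above: 0 < P(Q) < 1 and P(Q & ~J) > 0, with w a weight introduced just for the derivation.

```latex
\begin{align*}
&\text{Since } \lnot(Q \,\&\, J) \text{ is the disjoint union of } \lnot Q \text{ and } (Q \,\&\, \lnot J),
  \text{ with } w = \frac{P(\lnot Q)}{P(\lnot(Q \,\&\, J))} \in (0,1): \\
&\qquad P(x \mid \lnot(Q \,\&\, J)) = w\, P(x \mid \lnot Q) + (1 - w)\, P(x \mid Q \,\&\, \lnot J). \\
&\text{Irrelevance, } P(x \mid Q \,\&\, J) = P(x \mid Q), \text{ forces } P(x \mid Q \,\&\, \lnot J) = P(x \mid Q), \text{ so} \\
&\qquad P(x \mid \lnot(Q \,\&\, J)) = w\, P(x \mid \lnot Q) + (1 - w)\, P(x \mid Q). \\
&\text{If } x \text{ disconfirms } Q, \text{ then } P(x \mid Q) < P(x \mid \lnot Q), \text{ hence }
  P(x \mid \lnot(Q \,\&\, J)) < P(x \mid \lnot Q).
\end{align*}
```

With equal numerators, the smaller denominator gives (**). Replacing Q by a confirmed H flips the middle inequality, P(x|H) > P(x|~H), and the same steps give (*).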

(a) Example Q: unready for college.

If x = high scores on a battery of college readiness tests, then x disconfirms Q and confirms ~Q.

What should J be? Suppose having one’s favorite number be an even number (rather than an odd number) is found irrelevant to scores.

(i) P(x|~(Q & J)) = P(high scores| either college ready or ~J)

(ii) P(x|~Q ) = P(high scores| college ready)

(ii) might be ~1 (as in the earlier discussion), while (i) is considerably less.

The high scores can occur even among those whose favorite number is odd. This explains why

(**) CLR(Q, x) < CLR((Q & J), x)

In the case where x confirms H, it’s reversed

P(x |~(H & J)) > P(x |~H)

(b) Using one of Fitelson’s examples, but for ~Q:

e.g., Q: not-spades, x: black, J: ace

P(x |~Q) = 1.

P(x | Q)= 1/3

P(x |~(Q & J)) = 25/49

i.e., P(black|spade or not ace)=25/49

Note: CLR((Q & J), x) = P(x |(Q & J))/P(x |~(Q & J))
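The numbers in the card example (b) can be checked by brute-force counting. The following Python sketch assumes a standard 52-card deck with one card drawn at random. The function and variable names are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction

RANKS = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
SUITS = ["spades", "clubs", "hearts", "diamonds"]
DECK = [(rank, suit) for rank in RANKS for suit in SUITS]

def prob(event, given):
    """Exact P(event | given) by counting cards."""
    pool = [card for card in DECK if given(card)]
    return Fraction(sum(event(card) for card in pool), len(pool))

def black(card):            # x: the card is black
    return card[1] in ("spades", "clubs")

def Q(card):                # Q: not-spades
    return card[1] != "spades"

def Q_and_J(card):          # Q & J, with J: ace
    return Q(card) and card[0] == "A"

def negate(h):
    """Return the complement of hypothesis h."""
    return lambda card: not h(card)

p_x_given_not_Q = prob(black, negate(Q))         # P(black | spade)
p_x_given_Q = prob(black, Q)                     # P(black | not spade)
p_x_given_QJ = prob(black, Q_and_J)              # P(black | non-spade ace)
p_x_given_not_QJ = prob(black, negate(Q_and_J))  # P(black | spade or not ace)

clr_Q = p_x_given_Q / p_x_given_not_Q
clr_QJ = p_x_given_QJ / p_x_given_not_QJ

print(p_x_given_not_Q, p_x_given_Q, p_x_given_not_QJ)
print(clr_Q, clr_QJ)
```

The counts reproduce 1, 1/3 and 25/49, and confirm that J is an irrelevant conjunct here since P(x|Q & J) is also 1/3. The resulting ratios are 1/3 for Q and 49/75 for (Q & J), so (Q & J) is less badly disconfirmed than Q, as (**) says.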

Please share corrections, questions.

Chalmers (1999). What Is This Thing Called Science?, 3rd ed. Indianapolis; Cambridge: Hackett.

Crupi & Tentori (2010). Irrelevant Conjunction: Statement and Solution of a New Paradox, Phil Sci, 77, 1–13.

Hawthorne & Fitelson (2004). Re-Solving Irrelevant Conjunction with Probabilistic Independence, Phil Sci 71: 505–514.

Maher (2004). Bayesianism and Irrelevant Conjunction, Phil Sci 71: 515–520.

Musgrave (2010). “Critical Rationalism, Explanation, and Severe Tests,” in Error and Inference (D. Mayo & A. Spanos eds.). CUP: 88–112.

[1] Chalmers and Musgrave say I should make more of how simply severity solves it, notably for distinguishing which pieces of a larger theory rightfully receive evidence, and a variety of “idle wheels” (Musgrave, p. 110.)

Categories: phil6334 rough drafts | 3 Comments
