Msc Kvetch: “Stat Fact?”: Scientists do and do not want subjective probability reports


stat fact?

Facts are true claims, in contrast with mere opinions, or normative claims. John Cook supplies this “stat fact”: scientists do and do not want subjective posterior probabilities. Which is it? And are these descriptions of the different methods widely accepted “facts”? I’ve placed this in my “rejected posts” (under “msc kvetching”) simply because I don’t take this seriously enough to place on my regular blog.

Putting the methods you use into context

It may come as a surprise, but the way you were probably taught statistics during your undergraduate years is not the way statistics is done. …It might be disconcerting to learn that there is no consensus amongst statisticians about what a probability is for example (a subjective degree of belief, or an objective long-run frequency?). …..


Bayesian methods are arguably the oldest; the Rev. Thomas Bayes published his theorem (posthumously) in 1764. …The controversial part of Bayesian methodology is the prior information (which is expressed as a distribution and often just referred to as “a prior”), because two people may have different prior knowledge or information and thus end up with a different result. In other words, individual knowledge (or subjective belief) can be combined with experimental data to produce the final result. Probabilities are therefore viewed as personal beliefs, which naturally differ between individuals. This doesn’t sit well with many scientists because they want the data “to speak for themselves”, and the data should give only one answer, not as many answers as there are people analysing it! It should be noted that there are also methods where the prior information is taken from the data itself, and are referred to as empirical Bayesian methods, which do not have the problem of subjectivity. ….The advantage of Bayesian methods is that they can include other relevant information and are therefore useful for integrating the results of many experiments. In addition, the results of a Bayesian analysis are usually what scientists want, in terms of what p-values and confidence intervals represent.


Note that the references include only critics of standard statistical methods; not even Cox is included. Stat fact: this is not a statistically representative list?

Msc Kvetch: Why isn’t the excuse for male cheating open to women?



In an op-ed in the NYT Sunday Review (May 24, 2015), “Infidelity Lurks in Your Genes,” Richard Friedman states that:

We have long known that men have a genetic, evolutionary impulse to cheat, because that increases the odds of having more of their offspring in the world.

But now there is intriguing new research showing that some women, too, are biologically inclined to wander, although not for clear evolutionary benefits.

I’ve never been sold on this evolutionary explanation for male cheating, but I wonder why it’s assumed women wouldn’t be entitled to it as well. For the male’s odds of having more offspring to increase, the woman has to have the baby, so why wouldn’t the woman also get the increased odds of more offspring? It’s the woman’s offspring too. Moreover, the desire to have babies tends to be greater among women than men.

Msc kvetch: What really defies scientific sense


Texas sharpshooter

“two problems that plague frequentist inference: multiple comparisons and multiple looks, or, as they are more commonly called, data dredging and peeking at the data. The frequentist solution to both problems involves adjusting the P-value. .. But adjusting the measure of evidence because of considerations that have nothing to do with the data defies scientific sense, belies the claim of ‘objectivity’ that is often made for the P-value.” (S. Goodman 1999, p. 1010, Annals of Internal Medicine 130 (12))

What defies scientific sense, as I see it, are accounts of evidence that regard biasing techniques, such as data dredging, as “having nothing to do” with the evidence. Since these gambits open the door to verification bias and to a high probability of erroneously finding impressive-looking patterns, they are at the heart of today’s criticisms of unreplicable statistics. The point of registration, admitting multiple testing, multiple modeling, cherry-picking, p-hacking, etc. is to combat the illicit inferences so readily enabled by ignoring them. Yet, we have epidemiologists like Goodman, and many others, touting notions of inference (likelihood ratios and Bayes factors) that proudly declare these ways of cheating irrelevant to evidence. Remember, by declaring them irrelevant to evidence, there is also no onus to mention that one found one’s hypothesis by scouring the data. Whenever defendants need to justify their data-dependent hypotheses (a practice that can even run up against legal statutes for evidence), they know whom to call.[i]

In this connection, consider the “replication crisis in psychology”. They often blame significance tests for permitting small p-values too readily. But then, why are they only able to reproduce something like 30% of the previously claimed effects? What’s the answer? The implicit answer is that those earlier studies engaged in p-hacking and data dredging. All the more reason to want an account that picks up on such shenanigans rather than ignoring them as irrelevant to evidence. Data dredging got you down? Here’s the cure: Use methods that regard such QRPs as irrelevant to evidence. Of course the replication project goes by the board: what are they going to do, check if they can get as good a likelihood ratio by replicating the data dredging?

[i] There would be no rationale for Joe Simmons, Leif Nelson and Uri Simonsohn’s suggestion for “A 21-word solution”: “Many support our call for transparency, and agree that researchers should fully disclose details of data collection and analysis. Many do not agree. What follows is a message for the former; we begin by preaching to the choir. Choir: There is no need to wait for everyone to catch up with your desire for a more transparent science. If you did not p-hack a finding, say it, and your results will be evaluated with the greater confidence they deserve. If you determined sample size in advance, say it. If you did not drop any variables, say it. If you did not drop any conditions, say it.” (The Fall 2012 Newsletter for the Society for Personality and Social Psychology.) See my Statistical Dirty Laundry post on my regular blog.

Too swamped to read about ‘the swamping problem’ in epistemology, but…



I was sent an interesting paper that is a quintessential exemplar of analytic epistemology. It’s called “What’s the Swamping Problem?” (by Duncan Pritchard), and was tweeted to me by a philosophy graduate student, George Shiber. I’m too tired and swamped to read the fascinating ins and outs of the story. Still, here are some thoughts off the top of my head that couldn’t be squeezed into a tweet. I realize I’m not explaining the problem; that’s why this is in “rejected posts”–I didn’t accept it for the main blog. (Feel free to comment. Don’t worry, absolutely no one comes here unless I direct them through the swamps.)

1. Firstly, it deals with a case where the truth of some claim is given, whereas we’d rarely know this. The issue should be relevant to the more typical case. Even then, it’s important to be able to demonstrate and check why a claim is true, and be able to communicate the reasons to others. In this connection, one wants information for finding out more things, and without the method you don’t get this.
2. Second, the goal isn’t merely knowing isolated factoids but methods. But that reminds me that nothing is said about learning the method in the paper. There’s a huge gap here. If knowing is understood as true belief PLUS something, then we’ve got to hear what that something is. If it’s merely reliability without explanation of the method (as is typical in reliabilist discussions), no wonder it doesn’t add much, at least wrt that one fact. It’s hard even to see the difference, unless the reliable method is spelled out. In particular, in my account, one always wants to know how to recognize and avoid errors in ranges we don’t yet know how to probe reliably. Knowing the method should help extend knowledge into unknown territory.
3. We don’t want trivial truths. This is what’s wrong with standard confirmation theories, and where Popper was right. We want bold, fruitful theories that interconnect areas in order to learn more things. I’d rather know how to spin off fabulous coffee makers using my 3-D printer, say, than have a single good coffee now. The person who doesn’t care how a truth was arrived at is not a wise person. The issue of “understanding” comes up (one of my favorite notions), but little is said as to what it amounts to.
4. Also overlooked on philosophical accounts is the crucial importance of moving from unreliable claims to reliable claims (e.g., by averaging, in statistics). I don’t happen to think knowing merely that the method is reliable is of much use, w/o knowing why, w/o learning how specific mistakes were checked, errors are made to ramify to permit triangulation, etc.
5. Finally, one wants an epistemic account that is relevant for the most interesting and actual cases, namely when one doesn’t know X or is not told X is a true belief. Since we are not given that here (unless I missed it) it doesn’t go very far.
6. Extraneous: On my account, x is evidence for H only to the extent that H is well tested by x. That is, if x accords with H, it is only evidence for H to the extent that it’s improbable the method would have resulted in so good an accordance if H is false. This goes over into entirely informal cases. One still wants to know how capable and incapable the method was to discern flaws.
7. A related issue, though it might not be obvious at first, concerns the greater weight given to a data set that results from randomization, as opposed to the same data x arrived at through deliberate selection.

Or consider my favorite example: the relevance of stopping rules. People often say that if data x on 1009 trials achieves statistical significance at the .05 level, then it shouldn’t matter if x arose from a method that planned on doing 1009 trials all along, or one that first sought significance after the first 10, and still not getting it went on to 20, then 10 more and 10 more until finally at trial 1009 significance was found. The latter case involves what’s called optional stopping. In the case of, say, testing or estimating the mean of a Normal distribution, the optional stopping method is unreliable; at any rate, the probability it erroneously infers significance is much higher than .05. It can be shown that this stopping rule is guaranteed to stop in finitely many trials and reject the null hypothesis, even though it is true. (Search optional stopping on errorstatistics.com)
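To make the optional stopping point concrete, here is a minimal simulation sketch in Python (standard library only). It assumes a Normal mean with known standard deviation 1, a true null hypothesis of mean 0, a two-sided z-test at the .05 level, and a look at the data after every 10 trials up to 1000. The function name `simulate` and all parameter values are illustrative assumptions; nothing here comes from the paper or the blog.

```python
import math
import random


def simulate(max_n=1000, step=10, reps=2000, z_crit=1.96, seed=1):
    """Return (fixed_rate, optional_rate): the share of runs that reject a TRUE null.

    fixed_rate    -- test once, after all max_n trials (sample size fixed in advance)
    optional_rate -- test after every `step` trials and count the run as a
                     rejection if ANY look reaches |z| >= z_crit
    """
    rng = random.Random(seed)
    fixed_hits = 0
    optional_hits = 0
    for _ in range(reps):
        total = 0.0               # running sum of N(0, 1) observations
        rejected_at_some_look = False
        for n in range(1, max_n + 1):
            total += rng.gauss(0.0, 1.0)
            # With known sigma = 1 the z statistic is sum / sqrt(n).
            if n % step == 0 and abs(total) / math.sqrt(n) >= z_crit:
                rejected_at_some_look = True
        if rejected_at_some_look:
            optional_hits += 1
        # Fixed-sample test uses only the final look.
        if abs(total) / math.sqrt(max_n) >= z_crit:
            fixed_hits += 1
    return fixed_hits / reps, optional_hits / reps


if __name__ == "__main__":
    fixed_rate, optional_rate = simulate()
    print("fixed-sample false rejection rate:    ", fixed_rate)
    print("optional-stopping false rejection rate:", optional_rate)
```

Under these assumptions the fixed-sample test should reject a true null about 5% of the time, while the try-and-try-again procedure rejects far more often, and the gap widens as more looks are allowed.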

I may add to this later…You can read it: What Is The Swamping Problem

Msc. kvetch: Are you still fully dressed under your clothes?

Men have a constitutional right to take pictures under women’s skirts. Yup. That’s what the Massachusetts courts have determined after one Michael Robertson was caught routinely taking pictures and videos up the skirts of women. It even has a name: upskirting.

The Supreme Judicial Court overruled a lower court decision that had upheld charges against Michael Robertson, who was arrested in August 2010 by transit police who set up a sting after getting reports that he was using his cellphone to take photos and video up female riders’ skirts and dresses.

Robertson had argued that it was his constitutional right to do so…..

“A female passenger on a MBTA trolley who is wearing a skirt, dress or the like covering these parts of her body is not a person who is ‘partially nude,’ no matter what is or is not underneath the skirt..”

Link is here.

But this is absurd: she IS partially nude under her clothing, even if she isn’t when you don’t look up her skirt! The picture Robertson took is not of her fully clothed.

People are fully clothed when the TSA conducts whole body scans in airports (a practice that’s largely ended), and yet the pictures would be of the person naked. If you can be partially naked when an instrument sees through your clothes, then you can be partially naked when a cell phone is held under your skirt. Do we really have to get philosophical about these terms…?

Meanwhile, they’re busy trying to pass a law against upskirting in MA. So are guys in Boston busy getting all the constitutional shots they can in the meantime?

Chris Dearborn, a law professor at Suffolk University in Boston, said the court’s ruling served as a signal to the legislature to act fast, but also likely had Peeping Toms briefly “jumping for joy”. Link is here.

Jumping for joy at violating a woman’s privacy? What kind of Neanderthals are in Boston these days?


Seems like upskirting is back in the news, this time in Georgia. Under your skirt is not really a private place, after all. http://time.com/4422772/upskirt-photos-harassment/

Msc Kvetch: Is “The Bayesian Kitchen” open for cookbook statistics?

I was sent a link to “The Bayesian Kitchen” http://www.bayesiancook.blogspot.fr/2014/02/blending-p-values-and-posterior.html and while I cannot tell for sure from the one post, I’m afraid the kitchen might be open for cookbook statistics. It is suggested (in this post) that real science is all about “science wise” error rates (as opposed to it capturing some early exploratory efforts to weed out associations possibly worth following up on, as in genomics). Here were my comments:

False discovery rates are frequentist but they have very little to do with how well warranted a given hypothesis or model is with data. Imagine the particle physicists trying to estimate the relative frequency with which discoveries in science are false, and using that to evaluate the evidence they had for a Standard Model Higgs on July 4, 2012. What number would they use? What reference class? And why would such a relative frequency be the slightest bit relevant to evaluating the evidential warrant for the Higgs particle, or to estimating its various properties, or to the further testing that is now ongoing? Instead physicists use sigma levels (and associated p-values)! They show that the probability is .9999999… that they would have discerned the fact that background alone was responsible for generating the pattern of bumps they repeatedly found (in two labs). This is an error probability. It was the basis for inferring that the SM Higgs hypothesis had passed with high severity, and they then moved on to determining what magnitudes had passed with severity. That’s what science is about! Not cookbooks, not mindless screening (which might be fine for early explorations of gene associations, but don’t use that as your model for science in general).
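For readers who want to see how a sigma level translates into a p-value, here is a small sketch. It assumes a one-sided tail area of the standard Normal distribution and uses 5 sigma, the conventional discovery threshold in particle physics, as the example input. The function name `one_sided_p` is an illustrative choice, not anything used by the physicists.

```python
import math


def one_sided_p(sigma_level):
    """Upper-tail area of the standard Normal beyond `sigma_level` standard deviations."""
    # P(Z >= z) = 0.5 * erfc(z / sqrt(2)) for a standard Normal Z.
    return 0.5 * math.erfc(sigma_level / math.sqrt(2.0))


if __name__ == "__main__":
    p = one_sided_p(5.0)
    print("one-sided p-value at 5 sigma:", p)
    print("probability of NOT seeing so large a bump from background alone:", 1.0 - p)
```

At 5 sigma the tail area is about 2.9 in ten million, so its complement is a probability that begins with a run of nines, which is the kind of error probability the comment refers to.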

The newly popular attempt to apply false discovery rates to “science wise error rates” is a hybrid fashion that (inadvertently) institutionalizes cookbook statistics: dichotomous “up-down” tests, the highly artificial point against point hypotheses (a null and some alternative of interest—never mind everything else), identifying statistical and substantive hypotheses, and the supposition that alpha and power can be used as a quasi-Bayesian likelihood ratio. And finally, to top it all off, by plucking from thin air the assignments of “priors” to the null and alternative—on the order of .9 and .1—this hybrid animal reports that more than 50% of results in science are false! I talk about this more on my blog errorstatistics.com

(for just one example:
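The arithmetic behind “science wise error rate” figures of this kind can be sketched as follows. This is only an illustration of the computation being criticized, under stated assumptions: the priors of .9 and .1 are the ones mentioned above, while the alpha of .05 and the two power values are assumed here for illustration, and the function name `false_share` is hypothetical.

```python
def false_share(prior_null, alpha, power):
    """Share of 'rejections' that come from true nulls, treating alpha and power
    as if they were likelihoods in Bayes' theorem (the hybrid move under discussion)."""
    prior_alt = 1.0 - prior_null
    false_rejections = prior_null * alpha   # true nulls that get rejected
    true_rejections = prior_alt * power     # real effects that get detected
    return false_rejections / (false_rejections + true_rejections)


if __name__ == "__main__":
    # Priors of .9 and .1 as in the text; alpha and power are assumed values.
    for power in (0.8, 0.4):
        share = false_share(prior_null=0.9, alpha=0.05, power=power)
        print("power", power, "-> share of rejections that are false:", round(share, 3))
```

With these assumed inputs the share is .36 at power .8 and about .53 at power .4, so the “more than 50%” headline appears only once power is taken to be below .45. The result is driven entirely by the plucked priors and the assumed power, which is the point of the complaint.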


Msc Kvetch: comment to Kristof at 5a.m.

My comment follows his article

Bridging the Moat Around Universities


My Sunday column is about the unfortunate way America has marginalized university professors–and, perhaps sadder still, the way they have marginalized themselves from public debate. When I was a kid, the Kennedy administration had its “brain trust” of Harvard faculty members, and university professors were often vital public intellectuals who served off and on in government. That’s still true to some degree of economists, but not of most other Ph.D programs. And we’re all the losers for that.

I’ve noticed this particularly with social media. Some professors are terrific on Twitter, but they’re the exceptions. Most have terrific insights that they then proceed to bury in obscure journals or turgid books. And when professors do lead the way in trying to engage the public, their colleagues sometimes regard them with suspicion. Academia has also become inflexible about credentials, disdaining real-world experience. So McGeorge Bundy became professor of government at Harvard and then dean of the faculty (at age 34!) despite having only a B.A.–something that would be impossible today. Indeed, some professors would oppose Bill Clinton getting a tenured professorship in government today because of his lack of a Ph.D, even though he arguably understands government today better than any other American.

In criticizing the drift toward unintelligible academic writing, my column notes that some professors have submitted meaningless articles to academic journals, as experiments, only to see them published. If I’d had more space, I would have gone through the example of Alan Sokal of NYU, who in 1996 published an article in “Social Text” that he described as: “a pastiche of left-wing cant, fawning references, grandiose quotations, and outright nonsense.” Not only was it published, but after the article was unveiled as gibberish, Social Text’s editors said it didn’t much matter: “Its status as parody does not alter, substantially, our interest in the piece, itself, as a symptomatic document.”

I hope people don’t think my column is a denunciation of academia. On the contrary, I think universities are an incredible national resource, with really smart thinking on vital national issues. I want the world to get the benefit of that thinking, not see it hidden in academic cloisters. Your thoughts on this issue?


Deborah Mayo Virginia 12 hours ago

In my own field of philosophy, the truth is that the serious work, the work that advances the ideas and research, takes place in “obscure journals or turgid books”. There are plenty of areas where this research can be directly relevant to public issues–it’s the public who should be a bit more prepared to engage with the real scholarship. Take my specialization of philosophy of statistical inference in science. Science writers appear to be only interested in repeating the popular, sexy, alarmist themes (e.g., most research is wrong, statistical significance is bogus, science fails to self-correct). Rather than research what some more careful thinkers have shown, or engage the arguments behind contrasting statistical philosophies–those semi-turgid books–these science writers call around to obtain superficial dramatic quips from the same cast of characters. They have a one-two recipe for producing apparently radical and popular articles this way. None of the issues ever get clarified this way. I suggest the public move closer to the professional work rather than the other way around. Popular is generally pablum, at least in the U.S.

“No shame” psychics keep their predictions vague

For some reason, science debunker Goldacre’s blogpost below makes me take him slightly less seriously. It’s as if he’s saying there’s no shame in giving psychic pronouncements to parents with missing children–people who obviously might be devastated or misled as a result–so long as you’re not found wrong. Does anyone else see it this way?

Shame on you, Sylvia Browne, for telling Amanda Berry’s mother her daughter was dead.

May 7th, 2013 by Ben Goldacre in just a blog 
The story of Amanda Berry’s rescue in Cleveland – after ten years in captivity – is extraordinary. In 2004, popular psychic Sylvia Brown told Amanda’s mother that her little girl was dead. Here is a contemporaneous account of that show.
Amanda Berry’s mother traveled to New York to tell her story to Psychic Sylvia Browne on the Montel Williams Show. The show was a shot at getting her daughter’s picture before the eyes of millions of Americans. “On April 21st 2003, 16-year-old Amanda Berry left her part-time job never to be seen again,” the show began. With that, TV viewers across America now know a girl from Cleveland is missing. But Amanda Berry’s mom wanted more than her daughter’s picture on national TV. She wants answers. “Can you tell me…Is she out there?” Berry’s mother Louwana Miller asked. “I hate when they’re in the water,” Browne said. “She’s not alive honey.” It was bad news from the world-renowned psychic. It’s what Miller didn’t want to hear. “So you don’t think I’ll ever see her again,” Miller said. “Yeah in Heaven on the other side,” Browne responded. “I’m sorry.” Montel took a commercial break and Amanda’s mom broke down.
It has been widely reported in the last 24 hours that Amanda Berry’s mother died in 2006 of a broken heart: certainly she must have endured appalling anguish over her last years. It would be nice if people like Sylvia Browne could deliver their stage entertainment with a bit more consideration. Until hell freezes over, we can at least draw attention to these horrible episodes.
Given that fortune-telling (on TV or live) lacks any scientific validity whatsoever (and the FBI said Browne had never been of any help), what can Goldacre’s chiding mean, except perhaps to suggest that an honorable psychic should be sure to keep her predictions ultra vague? What if she’d predicted her daughter was alive, or what if her daughter had turned out to be dead? Would Goldacre then declare there was no shame?

msc kvetch: air traffic control cuts?

Does it really make sense to cut air-traffic control? Weren’t there already too few wide-awake folks in the towers? Here I am delayed at LGA, NY, but I will say that Delta is impressively laying out free cold drinks and snacks. I just don’t see how airlines can function with so much unpredictable regulatory control. Have you noticed airline fares have gone through the roof in the last year? Strangely, I hear no one talking about it, but I’m pretty sure it has a lot to do with airlines being required (by a newly imposed law) to give people 24 hours to change a ticket without penalty–in case they make a mistake–and the huge new TSA taxes, which, incidentally, airlines are required to combine with the base price so you cannot even see how much it is. I’ve also noticed that prices are essentially identical across airlines and travel websites (so far as I can tell), whereas there used to be a lot of variability. Nor is it just that I trade in airlines—in fact airline stocks are at close to their near-term highs. Not that the companies themselves are profitable; they’re not. It’s only that they were so low last year (e.g., DAL from ~$7-$17, search philstock in this blog, if interested). My kvetch is that the U.S. depends on people being able to fly, and yet there’s much more intervention in the “private” airline business than other industries, so far as I’m aware. Well I guess we’ll be seeing those blades and machetes soon (with the new ruling, search my regular blog).

On board the captain apologizes for the lateness, makes it clear it was not their fault but actually required by the FAA, and that we should all write to our representatives in Congress!

Rejected post: Filly Fury

Whoa Nelly!  When I first heard stories being trotted out last month about the fury over horsemeat in “beef” products in the UK, I thought that given how much is riding on public trust, the complaints would spur food inspection agencies to have reined in the problem by now. But I hear that Britain’s Tesco and Burger King are being saddled with new findings, making a lot of people skittish even here in the U.S. This could prove a boon to McDonald’s long jockeying with Burger King in the fast food market. At first Tesco bridled at the accusations (declaring the rumors “horse%$#@”), but once the equine DNA was tracked, the horse was out of the barn and they had to take out a full page ad to apologize. Possibly from a crude p-value analysis it was concluded:

“The early results from Findus UK’s internal investigation strongly suggests that the horsemeat contamination in Beef Lasagne was not accidental.”

The horsemeat could well have been sold for quite some time, it has been revealed, given that tests for horse DNA have not been conducted in donkey’s years!

On Thursday, the scandal deepened further with the news that horsemeat had been found in Findus ready meals made in France, prompting the British government to call it “very distasteful” .

French Agriculture Minister Stephane Le Foll said there would be an investigation there: “We need to avoid this idea that there was some desire to hide things,” he told BFM television.

Clearly, they could not have been deliberately hiding things: one of the companies is even called “Findus”.  Nor would they ever try to stall the investigations now cropping up all over.

In an article in the Mirror, the problem is linked to people living hand to mouth:

Findus beef lasagne sold at £1.60 – for 360g of alleged beef, tomatoes, onions, herbs, white sauce and pasta.… Why did none of us work out sooner that if they were flogging it for £1.60 something was amiss?

Elsewhere I read that France’s agriculture minister issued a warning

that companies found to have knowingly misled consumers would be ‘severely punished’.

Possibly even horsewhipped! To ease the fury, some lawmakers in the UK are becoming galloping gourmets:

Two senior lawmakers advised on Friday against eating processed beef products, but Paterson said he would happily eat them and Cameron insisted there was no health risk.

“There is no reason to believe that any frozen food currently on sale is unsafe or a danger to health. It’s not so much about food safety, it’s about proper food labeling, it’s about confidence in retailers,” Cameron said.

Experts say horsemeat could contain traces of veterinary drug phenylbutazone, or “bute”, used as a painkiller, which can be harmful to humans but only in high concentrations.

However, the danger of eating such meat may be slight: “The idea that you might get a clinically significant amount in horsemeat, even after therapeutic administration to the horse is, frankly, daft,” said Colin Berry, a professor of pathology at Queen Mary, University of London.

Perhaps he’s being groomed for a policy post. The following timeline posted in the Guardian shows the race is on to reveal higher and higher percentages!

16 January

The Food Safety Authority of Ireland says beefburgers with traces of equine DNA, including one product classed as 29% horse, are being supplied to supermarkets by Silvercrest Foods in Ireland and Dalepak Hambleton in Yorkshire, subsidiaries of the ABP Food Group.

4 February

Production at a second meat supplier, Rangeland Foods in Co Monaghan, is suspended after 75% equine DNA is found in raw ingredients, the Irish department of agriculture confirms…..

7 February

The Food Standards Agency reveals a second case of “gross contamination” after some Findus UK beef lasagnes are found to contain up to 100% horsemeat. The products were made by Comigel.

The New York Times, also running with the story, reports that the chief executive of the Food Safety Authority of Ireland, Alan Reilly, said that meat was being deliberately mislabeled.

“We are no longer talking about trace amounts,” he told RTE, the national broadcaster. “We are talking about horse meat. Somebody, someplace, is drip-feeding horse meat into the burger manufacturing industry. We don’t know exactly where this is happening.”

But they may now have identified a horsemeat lasagna factory that looks pretty fishy:

Sprawling on a frozen plain in an isolated part of central Europe, the huge Comigel food factory appears a deeply sinister place….

The production plant, accused of being the source of horse meat-laden ready meals which have flooded the UK food market, looks like a cross between a prison and a crematorium.

The Tavola factory specialises in ready-made frozen meals, producing an astonishing 16,000 tonnes a year.

At the end of the article are some interesting charts on the statistics of horse meat production around the world.

No closing the barn door now, the inquiry has taken off!  In the mean time, enjoy your filly cheese steak! Is it horse or not? That is equestrian.

Send me related updates for this post from your neigh-borhood.

News Updates:

(1) Is this a good analogy?


Agriculture Minister Stephane Le Foll said regulators weren’t at fault.

“This is not a regulation failure,” he said. “We have to stop saying that just because there is a fraud. That’s like saying that just because there are police officers around and that an accident happens, there is a failure on the part of the police officers.”

(2) Carmolimp?  Mere labeling issue?


Meanwhile, one Romanian producer that processes horse meat, Carmolimp, called the French assertions against Romanian producers “shameful” and an “unprecedented attack” without merit. “If the horse meat left Romania, then it would have been only labeled as horse meat,” Olimpiu Soneriu, the director of Carmolimp, said in a statement. He added that horse meat and beef were easily differentiated by their texture.

…. “It is just a labeling issue,” Frederic Vincent, a spokesman for health and consumer policy at the European Commission, told reporters at a regular briefing in Brussels. “As far as I know, the meat in question has not been contaminated in any way.”



