junk science

Can today’s nasal spray influence yesterday’s sex practices? Non-replication isn’t nearly critical enough: Rejected post


Blame it on the nasal spray

Sit down with students or friends and ask them what’s wrong with this study–or just ask yourself–and it will likely leap out. Now I’ve only read the paper quickly, and know nothing of oxytocin (OT) research. That’s why this is in the “Rejected Posts” blog. Plus, I’m writing this really quickly.

You see, I noticed a tweet about how non-statistically significant results are often stored in filedrawers, rarely published, and right away was prepared to applaud the authors for writing on their negative result. Now that I see the nature of the study, and the absence of any critique of the experiment itself (let alone the statistics), I am less impressed. What amazes me about so many studies is not that they fail to replicate but that the glaring flaws in the study aren’t considered! But I’m prepared to be corrected by those who do serious oxytocin research.

In a nutshell: Treateds get OT nasal spray, controls get a placebo spray; you’re told they’re looking for effects on sexual practices, when actually they’re looking for effects on the trust you bestow upon experimenters with your answers.

The instructions were the follows: “You will now perform a task on the computer. The instruction concerning this task will appear on screen but if you have any question, do not hesitate. At the end of the computer test, you will have to fill a questionnaire that is in the envelope on your desk. As we want to examine if oxytocin has an influence on sexual practices and fantasies, do not be surprised by the intimate or awkward nature of the questions. Please answer as honestly as possible. You will not be judged. Also do not be afraid of being sincere in your answers, I will not look at your questionnaire, I swear it. It will be handled by one of the guy in charge of the optical reading device who will not be able to identify you (thanks to the coding system). At the end of the experiment, I will bring him all the questionnaires. I will just ask you to put the questionnaire back in the envelope once it is completed. You may close the envelope at the end and, if you want, you may even add tape. There is a tape dispenser on your desk”. There is some examples of questions they were asked to answer: “What was your wildest sex experiment ?”, “Are you satisfied with your sex life? Could you describe it? (frequency, quality,…)” Please report on a 7-point Likert scale (1 = not at all, it disgusts me to 7 = very much, I really like) your willingness to be involved in the following sexual practices: using sex toys, doing a threesome, having sex in public, watch other people having sex, watch porn before or during a sexual intercourse,…”

Imagine you’re a subject in the study. Is there a reason to care if the researcher knows details of your sex life? The presumption is that you do care. But anyone who really cared wouldn’t reveal whatever they deemed so embarrassing. But wait, there’s another crucial element to this experiment.

We’re told: “we want to examine if oxytocin has an influence on sexual practices and fantasies”. You’ve been sprayed with either OT or placebo, and I assume you don’t know which. Suppose OT does influence willingness to engage in wild sex experiments. Being sprayed today couldn’t very well change your previous behavior. So unless they had asked you last week (without spray) and now once again with spray, they can’t be looking for changes in actual practice. But OT spray could make you more willing to say you’re more willing to engage in “the following sexual practices: using sex toys, doing a threesome, having sex in public,…etc. etc.” It could also influence feelings right now, i.e., how satisfied you feel now that you’ve been “treated”. So since the subject reasons this must be the effect they have in mind, only scores on the “willingness” and “current feelings” questions could be picking up on the OT effect. But high numbers on willingness and feelings questions don’t reflect actual behaviors–unless the OT effect extends to exaggerating about past behaviors, that is, lying about them, in which case, once again, your own actual choices and behaviors in life are not revealed by the questionnaire. Given the tendency of subjects to answer as they suppose the researcher wants, I can imagine higher numbers on such questions (than if they weren’t told they’re examining if OT has an influence on sexual practices). But since the numbers don’t, indeed, can’t reflect true effects on sexual behavior, there’s scarce reason to regard them as private information revealed only to experimenters you trust. I’ll bet almost no one uses the tape*.

There are many alternative criticisms of this study. For example, realizing they can’t be studying the influence on sex practices, you already mistrust the experimenter. Share yours.

Let me be clear: I don’t say OT isn’t related to love and trust––it’s active in childbirth, nursing, and, well…whatever. It is, after all, called the ‘love hormone’. My kvetch is with the capability of this study to discern the intended effect.

I say we need to publish analyses showing what’s wrong with the assumption that a given experiment is capable of distinguishing the “effects” of the “treatment” of interest. And what about those Likert scales! These aren’t exactly genuine measurements merely because they’re quantitative.

*It would be funny to look for a correlation between racy answers and tape.

Categories: junk science, rejected posts | 5 Comments

Fraudulent until proved innocent: Is this really the new “Bayesian Forensics”? (ii) (rejected post)

Objectivity 1: Will the Real Junk Science Please Stand Up?


I saw some tweets last night alluding to a technique for Bayesian forensics, on the basis of which published papers are to be retracted: So far as I can tell, your paper is guilty of being fraudulent so long as the/a prior Bayesian belief in its fraudulence is higher than in its innocence. Klaassen (2015):

“An important principle in criminal court cases is ‘in dubio pro reo’, which means that in case of doubt the accused is favored. In science one might argue that the leading principle should be ‘in dubio pro scientia’, which should mean that in case of doubt a publication should be withdrawn. Within the framework of this paper this would imply that if the posterior odds in favor of hypothesis HF of fabrication equal at least 1, then the conclusion should be that HF is true.”

Now the definition of “evidential value” (supposedly, the likelihood ratio of fraud to innocent), called V, must be at least 1. So it follows that any paper for which the prior for fraudulence exceeds that of innocence, “should be rejected and disqualified scientifically. Keeping this in mind one wonders what a reasonable choice of the prior odds would be.” (Klaassen 2015)

Yes, one really does wonder!

“V ≥ 1. Consequently, within this framework there does not exist exculpatory evidence. This is reasonable since bad science cannot be compensated by very good science. It should be very good anyway.”

What? I thought the point of the computation was to determine if there is evidence for bad science. So unless it is a good measure of evidence for bad science, this remark makes no sense. Yet even the best case can be regarded as bad science simply because the prior odds in favor of fraud exceed 1. And there’s no guarantee this prior odds ratio is a reflection of the evidence, especially since if it had to be evidence-based, there would be no reason for it at all. (They admit the computation cannot distinguish between QRPs and fraud, by the way.) Since this post is not yet in shape for my regular blog, but I wanted to write down something, it’s here in my “rejected posts” site for now.
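To make the arithmetic behind this complaint concrete, here is a minimal sketch. It assumes V plays the role of the likelihood ratio in Bayes’ rule in odds form (posterior odds = V × prior odds), which is my reading of the quoted passages rather than anything checked against the paper’s own computations. The function names are hypothetical and mine alone.

```python
def posterior_odds(prior_odds: float, evidential_value: float) -> float:
    """Bayes' rule in odds form: posterior odds = V * prior odds."""
    if prior_odds < 0:
        raise ValueError("prior odds must be non-negative")
    if evidential_value < 1:
        # Assumption taken from the quoted passage: V is never below 1.
        raise ValueError("in the quoted framework V is at least 1")
    return evidential_value * prior_odds


def concludes_fabrication(prior_odds: float, evidential_value: float) -> bool:
    """The quoted decision rule: conclude HF when posterior odds are at least 1."""
    return posterior_odds(prior_odds, evidential_value) >= 1


# Even the most favorable evidence the framework allows (V = 1) cannot
# rescue a paper whose prior odds of fabrication are already at least 1.
for prior in (0.5, 1.0, 2.0):
    print(prior, concludes_fabrication(prior, evidential_value=1.0))
```

Under these assumptions the data can only push the odds toward fabrication, so once the prior odds reach 1 the verdict is settled before any evidence is examined.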

Added June 9: I realize this is being applied to the problematic case of Jens Forster, but the method should stand or fall on its own. I thought rather strong grounds for concluding manipulation were already given in the Forster case. (See Forster on my regular blog). Since that analysis could (presumably) distinguish fraud from QRPs, it was more informative than the best this method can do. Thus, the question arises as to why this additional and much shakier method is introduced. (By the way, Forster admitted to QRPs, as normally defined.) Perhaps it’s in order to call for a retraction of other papers that did not admit of the earlier, Fisherian criticisms. It may be little more than formally dressing up the suspicion we’d have in any papers by an author who has retracted one(?) in a similar area. The danger is that it will live a life of its own as a tool to be used more generally. Further, just because someone can treat a statistic “frequentistly” doesn’t place the analysis within any sanctioned frequentist or error statistical home. Including the priors, and even the non-exhaustive, (apparently) data-dependent hypotheses, takes it out of frequentist hypothesis testing. Additionally, this is being used as a decision making tool to “announce untrustworthiness” or “call for retractions”, not merely analyze warranted evidence.

Klaassen, C. A. J. (2015). Evidential value in ANOVA-regression results in scientific integrity studies. arXiv:1405.4540v2 [stat.ME]. Discussion of the Klaassen method on pubpeer review: https://pubpeer.com/publications/5439C6BFF5744F6F47A2E0E9456703

Categories: danger, junk science, rejected posts | Tags: | 40 Comments

Potti Update: “I suspect that we likely disagree with what constitutes validation” (Nevins and Potti)

So there was an internal whistleblower after all (despite denials by the Duke people involved): a med student Brad Perez. It’s in the Jan. 9, 2015 Cancer Letter. I haven’t studied this update yet, but thought I’d post the letter here on Rejected Posts. (Since my first post on Potti last May, I’ve received various e-mails and phone calls from people wanting to share the inside scoop, but I felt I should wait for some published item.)
          Here we have a great example of something I am increasingly seeing: Challenges to the scientific credentials of data analysis are dismissed as mere differences in “statistical philosophies” or as understandable disagreements about stringency of data validation.
         If so, then statistical philosophy is of crucial practical importance. While Potti and Nevins concur (with Perez) that data points in disagreement with their model are conveniently removed, they claim the cherry-picked data that do support their model give grounds for ignoring the anomalies. Since the model checks out in the cases it checks out, it is reasonable to ignore those annoying anomalous cases that refuse to get in line with their model. After all it’s only going to be the basis of your very own “personalized” cancer treatment!
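A toy sketch of why “the model checks out in the cases it checks out” carries no evidential weight. The data below are hypothetical and deliberately constructed so that the “model” agrees with the outcome in exactly half the cases; nothing here is taken from the Duke analyses, and the function names are mine.

```python
def agreement_rate(labels, predictions):
    """Fraction of cases where the prediction matches the observed label."""
    matches = sum(1 for y, p in zip(labels, predictions) if y == p)
    return matches / len(labels)


def drop_disagreeing_cases(labels, predictions):
    """Keep only the cases the model already gets right."""
    kept = [(y, p) for y, p in zip(labels, predictions) if y == p]
    kept_labels = [y for y, _ in kept]
    kept_predictions = [p for _, p in kept]
    return kept_labels, kept_predictions


# Hypothetical toy data: predictions match the labels in exactly half the cases,
# i.e., the "model" does no better than a coin flip.
n_cases = 40
labels = [i % 2 for i in range(n_cases)]
predictions = [(i // 2) % 2 for i in range(n_cases)]

before = agreement_rate(labels, predictions)
kept_labels, kept_predictions = drop_disagreeing_cases(labels, predictions)
after = agreement_rate(kept_labels, kept_predictions)
print(f"all cases: {before:.2f}, after dropping anomalies: {after:.2f}")
```

Removing the disagreeing cases yields perfect agreement no matter how poor the model is, which is why agreement on the retained cases cannot count as validation.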
Jan 9, 2015
 Extracts from their letter:
Nevins and Potti Respond To Perez’s Questions and Worries

Dear Brad,

We regret the fact that you have decided to terminate your fellowship in the group here and that your research experience did not turn out in a way that you found to be positive. We also appreciate your concerns about the nature of the work and the approaches taken to the problems. While we disagree with some of the measures you suggest should be taken to address the issues raised, we do recognize that there are some areas of the work that were less than perfect and need to be rectified.


 I suspect that we likely disagree with what constitutes validation.


We recognize that you are concerned about some of the methods used to develop predictors. As we have discussed, the reality is that there are often challenges in generating a predictor that necessitates trying various methods to explore the potential. Clearly, some instances are very straightforward such as the pathway predictors since we have complete control of the characteristics of the training samples. But, other instances are not so clear and require various approaches to explore the potential of creating a useful signature including in some cases using information from initial cross validations to select samples. If that was all that was done in each instance, there is certainly a danger of overfitting and getting overly optimistic prediction results. We have tried in all instances to make use of independent samples for validation of which then puts the predictor to a real test. This has been done in most such cases but we do recognize that there are a few instances where there was no such opportunity. It was our judgment that since the methods used were essentially the same as in other cases that were validated, that it was then reasonable move forward. You clearly disagree and we respect that view but we do believe that our approach is reasonable as a method of investigation.

……We don’t ask you to condone an approach that you disagree with but do hope that you can understand that others might have a different point of view that is not necessarily wrong.

Finally, we would like to once again say that we regret this circumstance. We wish that this would have worked out differently but at this point, it is important to move forward.

Sincerely yours,

Joseph Nevins

Anil Potti

The Med Student’s Memo

Bradford Perez Submits His Research Concerns


Nevins and Potti Respond To Perez’s Questions and Worries


A Timeline of The Duke Scandal


The Cancer Letter’s Previous Coverage



I’ll put this up in my regular blog shortly.

Categories: junk science, Potti and Duke controversy | 1 Comment

“No shame” psychics keep their predictions vague

For some reason, science debunker Goldacre’s blogpost below makes me take him slightly less seriously. It’s as if he’s saying there’s no shame in giving psychic pronouncements to parents with missing children–people who obviously might be devastated or misled as a result–so long as you’re not found wrong. Does anyone else see it this way?

Shame on you, Sylvia Browne, for telling Amanda Berry’s mother her daughter was dead.

May 7th, 2013 by Ben Goldacre in just a blog 
The story of Amanda Berry’s rescue in Cleveland – after ten years in captivity – is extraordinary. In 2004, popular psychic Sylvia Browne told Amanda’s mother that her little girl was dead. Here is a contemporaneous account of that show.
Amanda Berry’s mother traveled to New York to tell her story to Psychic Sylvia Browne on the Montel Williams Show. The show was a shot at getting her daughter’s picture before the eyes of millions of Americans. “On April 21st 2003, 16-year-old Amanda Berry left her part-time job never to be seen again,” the show began. With that, TV viewers across America now know a girl from Cleveland is missing. But Amanda Berry’s mom wanted more than her daughter’s picture on national TV. She wants answers. “Can you tell me…Is she out there?” Berry’s mother Louwana Miller asked. “I hate when they’re in the water,” Browne said. “She’s not alive honey.” It was bad news from the world-renowned psychic. It’s what Miller didn’t want to hear. “So you don’t think I’ll ever see her again,” Miller said. “Yeah in Heaven on the other side,” Browne responded. “I’m sorry.” Montel took a commercial break and Amanda’s mom broke down.
It has been widely reported in the last 24 hours that Amanda Berry’s mother died in 2006 of a broken heart: certainly she must have endured appalling anguish over her last years. It would be nice if people like Sylvia Browne could deliver their stage entertainment with a bit more consideration. Until hell freezes over, we can at least draw attention to these horrible episodes.
Given that fortune-telling (on TV or live) lacks any scientific validity whatsoever (and the FBI said Browne had never been of any help), what can Goldacre’s chiding mean, except perhaps to suggest that an honorable psychic should be sure to keep her predictions ultra vague? What if Browne had predicted that the daughter was alive, or what if the daughter had turned out to be dead: would Goldacre declare there was no shame?
Categories: junk science, Misc Kvetching | Leave a comment
