On Thursday, I wrote an article about a firestorm in the field of psychology.
Susan Fiske, a Princeton University social psychologist and former president of the Association for Psychological Science (APS), had written a column for the newsletter APS Observer arguing that there was a serious problem of anonymous, ad-hominem attacks among researchers in the field. She accused psychologists, whom she did not name, of what she termed “methodological terrorism.”
The column leaked online ahead of its scheduled publication, and sparked a firestorm of ridicule and critique.
Many researchers say that there’s a “replication crisis” in the field of psychology, with many prominent results based on faulty statistics. An algorithm, known as “statcheck,” is trawling through old psychology papers looking for errors and posting them publicly. A significant portion of interested people on social media interpreted Fiske’s letter and her accusations as a response to the discussion around the replication crisis.
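To give a sense of what a check like this involves, here is a minimal sketch in Python. It is not statcheck's actual code. It assumes a simple z-test, a conventional 0.05 significance threshold, and p-values reported to two decimal places, and the function names are hypothetical. The idea is to recompute the p-value from the reported test statistic, then ask whether a mismatch is only a rounding-level discrepancy or one that would flip the significance of the result.

```python
from statistics import NormalDist

ALPHA = 0.05  # conventional significance threshold (an assumption of this sketch)


def two_sided_p_from_z(z: float) -> float:
    """Two-sided p-value for a z statistic under the standard normal distribution."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def check_reported_p(z: float, reported_p: float, decimals: int = 2) -> str:
    """Compare a reported p-value with the one recomputed from the z statistic.

    Returns one of:
      "consistent"     - the values agree after rounding to `decimals` places
      "decision error" - they disagree AND fall on opposite sides of ALPHA
      "inconsistent"   - they disagree, but significance is unchanged
    """
    recomputed = two_sided_p_from_z(z)
    if round(recomputed, decimals) == round(reported_p, decimals):
        return "consistent"
    if (recomputed < ALPHA) != (reported_p < ALPHA):
        return "decision error"
    return "inconsistent"


if __name__ == "__main__":
    # Hypothetical reported results: (z statistic, reported p-value)
    for z, p in [(1.96, 0.05), (2.5, 0.03), (1.5, 0.04)]:
        print(z, p, check_reported_p(z, p))
```

Separating "inconsistent" from "decision error" matters for the dispute described here: a mismatch that leaves a result's significance unchanged is a different kind of finding from one that reverses it.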
Andrew Gelman, a Columbia University statistician and political scientist who has advanced some of the statistical criticisms driving the debate, wrote a long post on his blog in which he detailed the history of the crisis. He maintained that Fiske exists within a “dead paradigm” of statistically problematic psychology, and that her incentive is to deny systemic problems in order to protect her position in the research establishment.
I included Gelman’s post in my coverage. But Fiske was preparing to fly back to the United States from a conference in Germany at the time I was writing, and was not available for comment. However, she made time Friday morning to speak with me by telephone.
A note: Due to the direct nature of some of the specific claims Fiske made in the course of our conversation, we offered Gelman a chance to respond to excerpts from this transcript before publication. You can find his response, emailed to Business Insider Sunday evening, at the bottom of this post.
What follows is a transcript of my conversation with Fiske, edited minimally for length and clarity.
Rafi Letzter: Thank you so much for taking the time to speak with me. I wonder if you could start by talking me through the context for your column in APS Observer?
Susan Fiske: I was invited by the current president of the APS. She had heard, with some shock, from some of the people who have been particularly harassed. And so she asked me if I would write about it. I was reluctant, because I knew I would be putting myself on the line. But I decided that somebody needs to speak for the people who are too afraid to speak.
I have not been personally targeted that much, but I have had conversations with, I would say, 20 or 30 people saying that they feel that they’ve been singled out and harassed. This is not simply peer review, or post-publication peer review, because we all agree to participate in that process as scientists. These are ad-hominem attacks.
RL: Can you clarify exactly what you’re referring to?
SF: So as an example, Andrew Gelman’s timeline of changes in open science is very useful. And that’s his take on what’s happening, and I think it’s a pretty good summary of what’s going on. The problem is where he accuses me of having published statistically faulty research.
I’ve published 350 articles and chapters. Let’s say half at least are statistical. He identified one correction to, basically, a sub-analysis. But the overall analysis was still intact. When this was pointed out to us, we issued a correction. But it didn’t change the conclusions of the paper, and the overall analysis was still significant.
What I knew would happen when I published this column was that I would become a target. I don’t think that’s scientifically motivated.
RL: What do you think does motivate it?
SF: Well I’m not going to speculate, because that’s exactly the kind of problem that we’re having. In these unfiltered, un-moderated social media posts people are speculating about others’ motivations. And you could no more put that in a peer review for a journal than you could fly.
The editor would jump down your throat for speculating about why somebody is coming to the conclusions they’re coming to. It’s not respectful.
And then he brings in my editing choices from my time as an editor at the journal PNAS. That’s not relevant. Editors make choices. Editors are confined by peer review. I’m confined by peer review feedback. And I publish things that in my judgment are good science.
So to connect my column to my work at PNAS is not relevant. And it’s an attempt to smear me.
RL: The argument he makes is that there’s an incentive structure for people such as yourself, with distinguished careers in psychology before the replication crisis, to protect a degree of institutional prestige.
SF: I think that’s outside the bounds of professional behavior. You can say that about anybody. And there’s no proof.
It’s not really relevant to the quality of people’s arguments and the quality of their science. So I think speculating about people’s motivations and their place in the power structure – you know, he doesn’t know me. He doesn’t know what my career has been like. And he has no right to make these speculations, and it’s not even scientifically germane.
RL: What do you think a fair and respectful way to check the statistics of old papers would be?
SF: “Fair” and “respectful” are the key terms. You know, if people are going to do post-publication peer review, they need to abide by the same rules as they abide by for pre-publication peer review: not being ad hominem, being respectful, giving the author a chance to respond in a reasonable way.
Some people have set up sort of “gotcha” algorithms that apparently crawl through psychology articles and look for fraudulent p-values [a measure of how likely results at least as extreme as those observed would be if there were no real effect]. But they’re including rounding errors that don’t change the significance levels of the results, and they’re doing it anonymously.
RL: Do you think that the claims of a “replication crisis” in psychology are authentic?
SF: I was not writing about the replication crisis. That’s a different column. I was writing about the behavior of people who post comments about their colleagues that would not be tolerated if there were an editor or some other moderator paying attention.
So, the replication issue is complicated. But I wasn’t writing about that.
RL: I’ve seen conversations online among psychologists who feel that they have been personally attacked by the language that you used in the letter. Phrases like ‘methodological terrorism,’ ‘antagonism,’ ‘self-appointed data police,’ and ‘vigilantes.’ The argument’s been made that that was itself beyond the pale of respectful discourse in a psychological journal.
SF: Well, I think people have focused in on that one phrase, “methodological terrorism,” and not attended to the argument that I was making. That’s unfortunate. It’s become a lightning rod.
I had three audiences in mind when I wrote this column. One was the people who feel bullied, cyber-bullied. I wanted them to know that somebody knows they were being cyber-bullied, somebody who was willing to go public and describe the phenomenon. The second audience was people who are not on social media who don’t know that this is going on. And then the third group of people are the people who are doing this, and in my view might want to think about changing the norms of scientific discourse to be more respectful.
So I used provocative language on purpose to get people’s attention. But I would defend the conceptual basis for that wording.
RL: Do you believe the response to your column itself is bullying?
SF: I think the hostility toward me is an example of the phenomenon I’m talking about.
It’s one thing to disagree with somebody and to argue with their arguments. I’m not on social media that much, but I have seen less counter-arguing the points I was making and more objecting to the language that I used, and hostility toward me as a person and speculation about my motives for doing this.
RL: I want to make sure that I clearly understand your answer on this. There are some people who say the language that you used was itself representative of the sort of behavior you were critiquing.
SF: The difference is I didn’t name names. I was talking about norms. And I think there’s a huge difference between describing norms in a vivid way and singling out individual people.
I’ve had people drag my family into their comments about my motives. I’ve had people drag my advisees in, when they’re not relevant. There really seems to be no boundary.
RL: What would you hope would come out of this conversation?
SF: I would like to see more moderated forums. There are some that are moderated, and if people start to flame the moderator intervenes. And I think that’s important.
I think there needs to be somebody paying attention to the tone of the discussion. For individual people’s blog posts, I think they have to be monitoring themselves. And, you know, I hope they’ll think twice before they single people out for scientifically-irrelevant attacks and speculations about motives.
If you are doing a peer review of somebody’s paper before publication, the editor would not allow you to speculate about the person’s motives, about their place in the hierarchy. It’s not scientifically relevant.
RL: What do you think qualifies someone to assume the role of moderator?
SF: It’s the same question that comes up about who’s an adequate journal editor. There ought to be some democratic consent in this process. But that’s getting way beyond anything I could really be specific about.
RL: If there is a systemic problem in the methodology of psychological research, do you think the journals are adequate to the task of addressing it?
SF: Well, first of all, you’re proposing whether there’s a systemic problem or not.
RL: Do you think there is a systemic problem?
SF: I think that there have been some helpful reminders in the current discussion, quote, “crisis.” Speaking for myself, I was brought up in graduate school to do power analyses, and report effect sizes. And to look at meta-analysis to see the overall pattern of effects, and to replicate my own work before publishing it.
So, you know, many of these messages are not entirely new. But I think it’s a helpful reminder that these are important practices.
I think the discussion of the statistical principles is helpful. I don’t think going after people publicly is helpful. One could write to somebody and say, “I think there’s an error in your paper.”
RL: Do you think, broadly, researchers are receptive to that kind of communication?
SF: Yes, I do. I think most people want to get it right.
If you have an effect that nobody can replicate, then your phenomenon fades away. So if you want to have a legacy, then you jolly well better have an effect that replicates.
RL: And, is there anything I haven’t asked about that you’d like to say or you think readers should know?
SF: No, I don’t think so. I think you should read the petition titled, “Promoting open, critical, civil, and inclusive scientific discourse in Psychology.” It’s very thoughtful. I don’t agree with every single sentence in it. But it’s pretty good.
I was not involved in writing it. They showed it to me in the beginning and they informed me at the end. And I think a variety of people who don’t totally agree with one another have signed it. Promoting open and critical and respectful scientific discourse seems like a pretty good goal to me.
RL: Thank you so much for taking the time to speak with me after what I’m sure has been a difficult week. I hope you get some chance to relax today.
SF: I don’t have a chance to relax. But I have gotten dozens of emails of support over the last few days, so. That’s helpful to survive this. I knew it was going to happen, so it sort of proves my point.
Again, in the interest of fairness, we sent Gelman excerpts from this conversation that directly criticized him. His response, sent to Business Insider in an email, is below. We cut a few examples he cited of other psychologists whose work has come under fire, inserted two links (for context), and added additional paragraph breaks; otherwise, this is Gelman’s response, verbatim:
I find it challenging to respond to Fiske’s writing on this topic because we are coming from such different places. She talks about “methodological terrorists,” “ad hominem attacks,” and “smears,” but from my perspective, I just want to help people do better research.
Now, it turns out – and it’s only in the past few years that I and many others have realized this – that a lot of published research papers are just hopeless. Not just an omitted variable here or a miscalculated t-statistic there, but, more fundamentally, studies that really have no chance at getting at what they’re aiming for. This is a matter of scientific judgment, but I’m not the only one who has this view, and some support for this perspective is lent by a series of failed replications of high-profile publications.
Anyway, the challenge is that if someone does a study which, for statistical reasons, I think is hopelessly underpowered or nonidentified, my best and most useful advice will not be tips on how to calculate p-values better, or how to construct an explanation for some particular data pattern. Rather, my advice will be to start over, to reconsider what you think you already know, maybe to question some prominent work in your subfield, and quite possibly to think a lot harder about measurement, and about the relation of your data to your underlying constructs of interest. My criticism will be firm, it will go to the fundamentals, and I won’t be shy about saying that I don’t think your p-values say anything useful at all.
And the thing is, this sort of firm criticism can be hard to take. I’m sad that Fiske seems to consider this criticism to be terrorism, and I’m not trying to smear anyone nor do I consider it an ad hominem attack to point out mistakes in published work. But I do understand from her reaction that this has been a difficult time for her and some of her friends and colleagues, and I have no desire to cause her discomfort, beyond the necessary discomfort of having to reassess one’s work. These problems are not unique to Fiske, not at all. As I wrote in my above post, as recently as 5 or 10 years ago, almost all of us were routinely trusting the results of published studies. That was the whole point of my post, that things have changed and it can be hard to adjust.
But the more relevant point is that I am very happy with the trends in research communication in psychology and in science more generally. As recently as a few years ago, researchers and journalists would just assume that articles in top journals were correct. But a series of papers on ESP, himmicanes, air rage, the contagion of obesity, beauty and sex ratio, etc etc etc, have made us appropriately wary when we come across flashy claims, even if such claims are attached to statistically significant p-values and published in prestigious outlets such as the Proceedings of the National Academy of Sciences or the Lancet.
Meanwhile … well-meaning researchers can routinely find statistical significance even from pure noise, and the careful replications performed by Brian Nosek, Eva Ranehill, and many others have made us aware that these concerns are not merely theoretical. Authors of published papers and editors of scientific journals can, unfortunately, be slow to come to terms with criticism, and it’s good that we can use blogs to express specific criticisms of published articles and to use social media to disseminate these criticisms.
I have no doubt that Susan Fiske and her colleagues are deeply committed to research progress in social psychology – I say this in complete sincerity – and it’s my impression that the field of psychology is in better shape than ever to allow this to happen.