It has been a bad week for cycling, with Lance Armstrong’s tacit acknowledgement that he cheated his way to 7 Tour de France titles. As an avid recreational cyclist who did his own turn at Lance Armstrong hero-worshiping, I am admittedly saddened by the final turn in the long sordid tale of cheating in professional cycling.
Similarly, it has been a bad summer for social psychology, with two more cheating scandals of our own that followed closely on the heels of the Stapel affair. In pondering the two fields, it suddenly struck me that we are living weird parallel lives. As in cycling, many of the best and brightest researchers in psychology have been cheating their way to fame and fortune.
EPO vs SEA
Social Psychology is actually worse than cycling in two ways. There was no fraud in cycling. Lance Armstrong never faked a ride up L’Alpe d’Huez or created a hologram version of himself that magically rode at impossible speeds in time trials. No, Lance, like many cyclists in his cohort, apparently cheated by doing things to enhance his performance. But he still had to put out physical effort that few of us could fathom in order to make it to the top of the hill faster than his competitors, who, like Lance, were most likely cheating too. The drugs of choice in cycling appeared to be EPO, blood doping, or some variant of testosterone, all of which made riders stronger, faster, and better able to recuperate from their labors.
In contrast, in our field, we’ve had outright fraud in addition to cheating. Stapel simply made up data. It is easy to condemn frauds, and there is reason to believe outright fraud is rare. A more nefarious problem is that many of us cheat, just like the cyclists.
Our drug of choice is sampling error abuse (SEA). As outlined so well in papers like Simmons et al. (2011), John et al. (2012), and Francis (2011), we conduct research in such a way that it is easy to manufacture statistically significant findings. With some diligence and effort, but nothing herculean, we can produce what appears to be a stage win in our Tour de Psychologie—a series of statistically significant findings across many different types of studies—simply by running a modest number of underpowered studies and throwing out or ignoring the null findings. That is sampling error abuse. It is remarkably easy to do, and it is cheating.
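To make the mechanics concrete, here is a minimal simulation of my own (an illustration, not part of the original argument): it assumes a true null effect and arbitrary choices for the number of attempted studies and the per-group sample size, and shows how reporting only the "hits" manufactures publishable results out of nothing.

```python
# Illustrative sketch of "sampling error abuse" under an assumed true null effect.
# The number of studies, sample sizes, and seed are assumptions chosen for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

n_attempted = 20   # underpowered studies run and quietly filed away if null
n_per_group = 15   # small samples -> low power, noisy estimates
alpha = 0.05

published = []
for study in range(n_attempted):
    # True effect is zero: both groups are drawn from the same population.
    control = rng.normal(0, 1, n_per_group)
    treatment = rng.normal(0, 1, n_per_group)
    t_stat, p_value = stats.ttest_ind(treatment, control)
    if p_value < alpha:            # report only the "significant" studies
        published.append((study, round(p_value, 3)))

print(f"Ran {n_attempted} studies of a null effect; "
      f"'publishable' significant results: {len(published)}")
print(published)
```

With 20 independent tests at an alpha of .05, the chance of at least one false positive is 1 − .95^20, roughly 64 percent, and that is before adding the extra degrees of freedom (multiple outcomes, optional stopping, covariates) that Simmons et al. (2011) show inflate the rate much further.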
The scary thing, and the second way we are different from cycling, is how we convince ourselves that it is okay to abuse sampling error. The modal reaction to the criticisms of our field has been a great big yawn. “This will pass” has been a common refrain. Or, others make remarkable rationalizations of their actions that they actually believe. A composite of my favorite rationalizations goes like this: the finding is “very delicate” and requires “years of specialized training” and “fine tuning of the experimental situation” in order to emerge. These are code words for sampling error abuse: run enough low-powered studies with multiple sets of similar outcomes until you find enough hits to package them into the commodity that we like—a package of statistically significant findings. Again, what is most important to note is how many of us believe our own rationalizations. So, not only do we cheat, but unlike the cyclists, we don’t even think we are cheating.
The excuses and rationalizations are the same
When caught, we come up with excuses that are no better than the cyclists’. Many of the top cyclists maintained that everyone cheated, so they were just trying to keep up with the peloton. Likewise, Smeesters claimed he was doing nothing different from what his peers did. Everyone does it? Of course, in cycling they may not award wins to anyone for the Tour de France editions that Lance Armstrong won because so many of his competitors have already been caught cheating themselves. No one, it seems, was clean.
The same argument can be made about sampling error abuse. Not only is it the modal way we do research, but some of our most treasured documents even promote it (e.g., Bem, 2000).
A similar line of reasoning lies behind the second variant of the “everyone is doing it” refrain. All too often I’ve heard colleagues say that social psychology is not alone, and that many other fields have just as many integrity problems as we do, if not more. Really? Should we take pride in the fact that other fields cheat too? It is like taking solace in having less leprosy than others. And, whether we want to admit it or not, the critical mass of problems that have emerged in the last year makes us the poster children for bad methods.
One may argue that winning a cycling event hurts no one, and that therefore cheating at cycling hurts no one. Is that how science works too? If my lab mate or my colleague cheats, is it okay for me to cheat also? I thought we handled this issue well in third grade—just because Johnny hits someone does not make it okay to punch someone else.
In science, we do hurt people if we produce false findings. To start, we miseducate thousands of students across the globe as our results find their way into textbooks as facts. We also advise government agencies to do things that are not sound. Just recently, a colleague proudly posted the fact that the President of the United States was given a briefing based largely on a program of research from our guild that had been called into question even before any data integrity issues were raised. How proud should we be when unreliable findings are spoken as truth to people in power?
We do it for the same reasons: Status, fame, and fortune
The reasons cyclists like Lance Armstrong might cheat are transparent. They net fame and fortune for themselves and their team. Success brings endorsements, contracts, and the fame that continues even after one’s career is over.
Psychologists cheat for the same reasons, albeit for much less fame and fortune. A series of JPSP articles places you on the map in social psychology and gets you a job. A few more articles in top journals, like Psychological Science or, even better, Science, and you might become genuinely famous, at least in intellectual circles, and earn tenure in the bargain. Eventually, all of this success results in a job at a top research university, with the perquisites of highly capable graduate students, postdocs, and colleagues. All in all, being a successful academic is not a bad life.
It is a team effort
Professional cycling, like much of modern psychological science, is a team sport. There is, of course, the lead rider like Lance Armstrong (e.g., faculty), who is the dedicated team leader and for whom everyone else rides. There are specialists, like sprinters or climbers (e.g., colleagues and postdocs), who have specific skills that often help the team, but in the end are subordinated to the goals of their team leader. Then there are the domestiques (e.g., graduate students), who protect the lead rider in the peloton, carry extra water bottles for him, ferry him back to the peloton when he crashes, and even give up their bike to him if necessary.
The team component of cycling was apparently Lance Armstrong’s downfall, as many past members of his teams finally confessed to helping him cheat. Similarly, scholars like Stapel were caught in part because graduate students reported research anomalies. And, like many of Stapel’s students and colleagues, some on Armstrong’s team were presumably not complicit in his doping, yet they suffer by sheer association with him because of his cheating. Unfortunately, in both cycling and psychology the associates of the cheater suffer serious career consequences that are presently difficult to quantify.
Like the cyclists, psychologists could become better team members in order to curtail cheating. We could all take a turn looking over the data contained in our collaborative efforts. Faculty and students could run analyses together and, eventually, we could simply publish our data along with our papers so others can double-check our work. Or, we could be entirely radical and directly replicate our research.
That said, the team structure makes cheating infinitely more complex. What a team leader like Armstrong or a faculty member has to give to specialists and domestiques is status, fame, job security, and all of the things one might want in a career. Domestiques are future team leaders, just as graduate students are future faculty. Turning in your leader means ending your career as you know it. Why would someone knowingly do that? For that matter, why would some other team leader hire a whistle-blower?
Couple the power exchange between faculty and graduate students with the peer group structure of the research team and things get quite sticky. A loyal domestique, like a loyal graduate student, gets the rewards. I cannot fathom the life of a graduate student in a research lab that knowingly practices systematic sampling error abuse. Should graduate students who realize that these practices are cheating pull the plug on their respective operations? Maybe in a fit of moral integrity they choose to do only replicable research. How then do they compete with other students who continue to abuse sampling error? How do they avoid being considered a “disappointment” or a “failure” in the eyes of their advisors when their peers are producing so many more “positive” (i.e., statistically significant) and therefore publishable results? I cannot fathom the pressure they are under.
Who is our USADA?
Cycling has several governing bodies that monitor the sport, like the United States Anti-Doping Agency (USADA). Few people sing the praises of these policing organizations, and the people pursued by them characterize their actions as “witch hunts”—a term that has also been used to describe some of the current efforts to catch cheaters in psychological science.
We don’t have an official group like the USADA to keep tabs on our activities. Instead, we have an informal group of self-motivated scholars who have taken on the mantle of responsibility for improving our methods. Like the USADA, the people leading the charge to clean up our science are vilified. In particular, people like Hal Pashler, Brian Nosek, and Greg Francis have been described as “crazy” or as trying to “destroy social psychology.” Are they trying to destroy social psychology? No, they are not. First, taking down specific research projects does not equate to taking down an entire field. Second, like the USADA in cycling, they are actually trying to save the field. Pashler and others like Brian Nosek, Uri Simonsohn, Uli Schimmack, and Greg Francis would like our science to have integrity and for everyone to practice good research hygiene so that our field can produce knowledge that is usable. Accumulating unreplicable findings, like cheating in cycling, may promote a career, but it ultimately undermines the field.
But Lance Armstrong has done so much good, and so has Social Psychology
Lance Armstrong’s legacy will be decidedly mixed. He may have cheated his way to wins, but he has also started a highly successful charity, Livestrong, that has grown much bigger than the man himself.
Similarly, social psychology focuses on many laudable issues. We study how people are marginalized, oppressed, and persuaded inappropriately. Our research is often explicitly prosocial. One of the conspicuous features of Stapel’s fraudulent research was how benignly consistent it was with the accepted prosocial value system that social psychology represents. He, essentially, took advantage of our desire to do good.
One problem with the way sampling error abuse works is that we can eventually find support for any position we want, whether true or not. The danger then becomes that we double down on our research, even when we have cheated, because the ends justify the means. At this juncture we need to ask ourselves how much of our research shows what we want to believe rather than what we should believe.
Are we going to be Lance Armstrong or Tyler Hamilton?
Lance Armstrong has, to date, admitted to nothing. He has simply stopped fighting the allegations, and in so doing has tacitly admitted to their truth. Armstrong was not the first to fall in professional cycling; countless others preceded him. Many, like Tyler Hamilton, openly confessed to their past doping, apologized, and returned their rewards, such as Olympic gold medals.
Several of the psychologists caught committing fraud so far have also admitted to their deeds and resigned, much like Tyler Hamilton. But what of the more common form of cheating, sampling error abuse? What should researchers do when confronted with evidence that they p-hacked their way to a publication, or that there were anomalies in the data that made the findings “incredible”? Will we be like Lance and fight it to the last, or be like Tyler and admit it and move on? One would hope that we would choose Tyler Hamilton’s approach, for both its nobility and its opportunity for redemption.
As scientists, we have good reasons to admit to past indiscretions. Unlike professional cycling, science is intrinsically a sport in which we are supposed to fail. We are never completely correct in our efforts, even when our findings are reliable. The truth we arrive at is provisional at best. As scientists, we are supposed to be open to being wrong and to correcting our errors. Past studies that relied on sampling error abuse could simply be construed as errors, errors that now need to be corrected with more science—ideally, science done well.
Brent W. Roberts