I recently saw an article by an astute reporter that described one of our colleagues as a researcher who “…has made a career out of finding data….”
Finding data.
What a lush expression. In this case, as seems always to be the case, the researcher had a knack for finding data that supported his or her theory. On the positive side of the ledger, “finding data” evokes the intrepid explorer who discovers a hidden oasis, or the wonder that comes with a NASA probe that unlocks long-lost secrets on Mars.
On the negative side of the ledger, “finding data” alludes to researchers who will hunt down findings that confirm their theories and ignore data that do not. I remember coming across this phenomenon for the first time as a graduate student, when a faculty member asked whether any of us could “find some data to support X”. I thought it was an odd request. I thought in science one tested ideas rather than hunted down confirming data and ignored disconfirming data.
Of course, “finding data” is an all too common practice in psychology. Given the fact that 92% of our published findings are statistically significant and that it is common practice to suppress null findings, it strikes me that the enterprise of psychological science has defaulted to the task of finding data. One needs only to have an idea, say that ESP is real, and, given enough time and effort, the supporting data can be found. Given our typical power (50%) and the Type 1 error rate (at least 50% according to some), the job is not too tough. One only has to run a few underpowered studies, with a few questionable research practices thrown in and the data will be found. Of course, you will have to ignore the null findings. But, that apparently is easy to do because as one of our esteemed colleagues wrote recently “everyone does it”—“it” meaning throw null effects away.
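The arithmetic behind "the job is not too tough" can be sketched in a few lines. This is a rough illustration, not an analysis of any real literature: it assumes the studies are independent, uses the 50% power figure quoted above, and, as a contrast, a nominal 5% false-positive rate that is my assumption rather than a number from the text. The function name is hypothetical.

```python
def prob_at_least_one_hit(hit_rate, n_studies):
    """Chance that at least one of n independent studies comes out significant."""
    return 1 - (1 - hit_rate) ** n_studies


# hit_rate = 0.50: a real effect studied with 50% power (figure from the text).
# hit_rate = 0.05: no effect at all, nominal alpha of .05 (assumed for contrast).
for k in (1, 2, 3, 5, 10):
    with_power = prob_at_least_one_hit(0.50, k)
    null_only = prob_at_least_one_hit(0.05, k)
    print(f"{k:>2} studies: {with_power:.3f} (50% power), {null_only:.3f} (true null)")
```

Under these assumptions, three underpowered studies already give better than a 5-in-6 chance of at least one publishable result, provided the misses stay in the file drawer. Questionable research practices would only push the true-null column higher.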
There are other careers and jobs that call for a similar approach—pundits and lawyers. The job of Fox or MSNBC pundits is not to report the data as it is, but to find the data that supports their preconceived notion of how the world works. Similarly, good lawyers don’t necessarily seek the truth, but rather the data that benefits their client the most. It appears that we have become a field of lawyers who diligently defend our clients, which happen to be our ideas.
To the extent that this portrait is true, it leads to some painful implications. Are psychological researchers just poorly paid lawyers? I mean, most of us didn’t get into this career for the money, but if we are going to do soulless lawyer-like work, why not make the same dough as the corporate lawyers do? Of course, given our value system, psychologists would most likely be public defenders, so maybe asking for more money would be wrong. But consider the fact that law school only lasts three years. The current timeline for a psychology Ph.D. seems to be five years minimum, sometimes six, with postdoc years to boot. Do you mean to tell me that I could have simply gone to law school instead of a Ph.D. program and been done in half the time and compensated far better? Maybe it is not too late to switch.
What’s so bad about being a lawyer?
Nothing. Really. I have no prejudice against lawyers. Practicing law can be noble and rewarding. And, like many careers it can be a complete drag. It is work after all.
And, there are similarities between science and law. Both professions and the professionals therein pursue certain ideas, often relentlessly. Many defendants are grateful for the relentless pursuit of justice practiced by their lawyers. Similarly, many ideas in science would not have been discovered without herculean, single-minded focus, combined with dogmatic persistence.
Then again, there are the lawyers who defend mob bosses, tobacco firms, or Big Oil. None of us would want to be like them, right?
In an ideal world, there is one very large difference between practicing law and practicing science. At some point, scientists are supposed to use data as the arbiter of truth. That is to say, at some point we must not only entertain the possibility that our all-consuming idea is wrong, but also, when the data demand it, firmly conclude that it is incorrect. I had an economist friend who pursued the idea that affirmative action programs were economically detrimental to beneficiaries of those programs. He eventually determined that his idea was wrong. Admittedly, it took him ten years to come to that position, but he at least admitted it. Changing one’s mind like this would be akin to the tobacco lawyer suddenly admitting in the middle of a trial that smoking cigarettes really is bad for you. This doesn’t happen precisely because these lawyers are paid big money to ignore the truth and defend their clients despite these truths.
This means that the difference between being an advocate and a scientist rests almost solely on the integrity of our data and our response to those data. If our data are flawed, then we can act like scientists and still be no better than a pundit or propagandist. If we hide our “bad” data (e.g., non-significant findings), we are likewise practicing a less than noble form of law: we are ambulance chasers or tobacco lawyers. If we don’t change our minds as a result of data that disconfirm our most closely held ideas, we are, again, advocates, not scientists.
The bottom line is that many of us are being lawyers/pundits with our research. We drop studies, ignore problematic data, squeeze numbers out of analyses, and use a variety of techniques in order to present the best possible case for our idea. This is the fundamental problem with the p-hacking craze going on in many sciences, including psychology. We are not truly testing ideas but advocating for them, and often we are really advocating only for our careers when we do this. Just because we defend seemingly noble ideas, such as social justice, doesn’t make the work any different. If we only pay attention to the data that supports our client, then we aren’t doing science.
What should we do?
Many, many earnest recommendations have been made to date and I will not reiterate or contradict any of the missives describing optimal publishing practices and the like. What I think has been missing from the dialogue is a clear case made for us to change our attitudes, not only our publishing practices and research behavior. So, the recommendations below go to that effect.
First, and most ironically, I believe we need to be legalistic in our approach to our research. That is, we need to be judge, jury, prosecutor, and defense counsel of our own ideas. As noted elsewhere, psychology is a field that only confirms ideas (and only in data that reveal a statistically significant finding). Instead, we need to do more to prosecute our own research before we foist it on the world. The economists call this testing the robustness of a finding. Instead of just portraying the optimal finding (e.g., the statistically significant one), we need to present what happens when we put our own finding to the test. What happens to our finding when statistical assumptions are relaxed or restricted, when different control variables are included, or different dependent variables are predicted? If your finding falls apart when you take a slightly different statistical approach, use a new DV that correlates .75 with your preferred DV, or run the study in a sample of Maine versus Massachusetts undergraduates, do we really want to endorse that finding as true? No.
Second, we need to value direct replication. I get a lot of pushback on this argument, and that pushback deserves its own essay (later Chris). But, given how prevalent p-hacking is in our field, we need an outbreak of direct replications and healthy skepticism of “conceptual” replications. For example, those who argue that they value and would prefer a 4-study paper with 3 conceptual replications have to assume that p-hacking is not prevalent. Unfortunately, p-hacking is widespread (see quote about “everyone does it”). At this juncture, a 4-study paper with 3 conceptual replications using some perversely nonsensical range of sample sizes for each study (from 30 to 300) screams out “I P-HACKED!” Combining conceptual replications with simple, direct replications is not difficult and is really hard to argue against in light of how difficult it is to replicate our findings.
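One way to picture "prosecuting" a finding is to run the same test against every reasonable dependent variable and report all of the results, not only the flattering one. The sketch below is a toy illustration under loud assumptions: the function names are hypothetical, the numbers are invented, and the p-value uses the Fisher z approximation, which is crude at a sample size this small. It is not how economists formally run robustness checks.

```python
import math
from statistics import NormalDist


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p-value for r via the Fisher z approximation (needs n > 3, |r| < 1)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def robustness_report(predictor, outcomes, alpha=0.05):
    """Test the same predictor against every outcome and keep every result."""
    n = len(predictor)
    report = {}
    for name, values in outcomes.items():
        r = pearson_r(predictor, values)
        p = approx_p_value(r, n)
        report[name] = {"r": r, "p": p, "significant": p < alpha}
    return report


def survives(report):
    """The finding survives only if every test is significant with the same sign."""
    signs = {entry["r"] > 0 for entry in report.values()}
    return len(signs) == 1 and all(e["significant"] for e in report.values())


# Invented toy numbers, purely illustrative.
predictor = [1, 2, 3, 4, 5, 6, 7, 8]
outcomes = {
    "preferred_dv": [1, 3, 2, 5, 4, 6, 8, 7],
    "alternate_dv": [4, 1, 3, 2, 4, 1, 3, 2],
}
report = robustness_report(predictor, outcomes)
for name, entry in report.items():
    print(name, round(entry["r"], 2), entry["significant"])
print("survives:", survives(report))
```

In this toy case the preferred DV produces a significant result and the alternate one does not, so the finding fails its own cross-examination. The design choice that matters is that the report keeps every specification, so the unflattering rows cannot quietly disappear.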
Third, we need to walk back our endorsement and valuing of brief journal formats found in journals like Science, Psychological Science, and Social Psychological and Personality Science. This is not because short reports are evil per se, but because they promote a lax attitude toward research that exacerbates our problematic methodological peccadillos. I must admit that I used to believe that we needed more outlets for our science and I loved the short report format. I was wrong. We made a huge mistake—and I was part of that mistake—in promoting quick reports with formats so brief and review processes so quick that we end up promoting bad research practices. At JPSP after all, you have to “find” 3 or 4 statistically significant effects to have a chance at publication. At short report journal outlets, you only have to “find” one such study to get published, especially if the finding is “breathtaking.” Thus, we promote even less reliable research in top journals in an effort to garner better press. In some ideal world, these formats would not be a problem. In the context of pervasive p-hacking, short, often single-shot studies are a problem. We have inadvertently promoted a “quick-and-dirty” attitude toward our research efforts, making it even easier to infuse our field with unreliable findings. Until we have our methodological house in order, we should reconsider our love of the short report and the short report outlet.
Fourth, we need to be less enamored with numbers and more impressed with quality. Building a lengthy CV is not that difficult. All one needs to do is put together a highly motivated team of graduate and undergraduate assistants to churn through dozens of studies per year. Then, combine that type of cartel with a willingness to ignore the null effects or practice some basic QRPs and you will have at least 4 JPSP/Psych Science-like multiple study papers completed per year. If you are willing to work the “messier” studies in lower-tier journals you are well on your way to an award. Even better, publish unreplicable, provocative findings and get into a nasty argument with colleagues about your findings. Then, your CV explodes with the profusion of tit-for-tat publications that come with the controversy. In contrast, if we evaluate researchers based on the ideas they have and how they go about testing them, rather than their ability to churn the system to discover statistical significance, we might actually do more to clean up our methodological mess than any pre-registration registry could ever achieve.
The obvious ramification of adopting a more skeptical attitude toward our own research would be to slow things down. As Michael Kraus has argued, why not adopt a “slow research” movement akin to the slow food movement? If rumors are true, over half of our research findings cannot be directly replicated. That means we are wasting a lot of time and energy writing, reviewing, reading, and believing arguments that are, well, just that, arguments—arguments that look like they have supporting data, but are really fiction. While I appreciate a good argument and impassioned punditry, science is not supposed to be an opinion echo chamber. It is supposed to be a field dedicated to creating knowledge. Unlike a baseless argument, knowledge stands up to cross-examination and direct replication.
Brent W. Roberts