Episode overview
In a post-pandemic era and with AI on the rise, restoring the public’s trust in science is critical—but what will it take to do so?
In this episode of Ideas Matter, Sandro Galea, dean of WashU’s Bursky School of Public Health, and Stanford Medicine’s John Ioannidis discuss Ioannidis’ famous paper, “Why Most Published Research Findings Are False,” and explore how science can do better.
Transcript
[Sandro Galea]
Welcome to Ideas Matter, a podcast hosted by WashU. I am Sandro Galea, vice provost of interdisciplinary initiatives and dean of the School of Public Health. In 2005, Professor John Ioannidis, a professor at the Stanford School of Medicine, published a paper in PLOS Medicine titled, “Why Most Published Research Findings Are False.” That paper, one of the most downloaded papers ever in the history of science, argued that the results of many, if not most, published medical research papers cannot be replicated.
Reproducibility is core to the scientific method. If results cannot be reliably replicated, it calls into question the process by which they are generated, the validity of science itself. The replication crisis, which 20 years after Professor Ioannidis’ paper remains unsolved, is not the only headwind facing science today. Public trust in science has fallen since the pandemic. Scientific institutions face disinvestment and political pushback. And all of this has created something of an identity crisis for science.
How can we ensure the work of science rests on sound methods and that its results can be trusted? How can we rebuild the public’s confidence in the work of science? Here to help answer these questions is the man who played a key role in surfacing them. Professor John Ioannidis is a professor of medicine at the Stanford Prevention Research Center, a professor of epidemiology and population health at Stanford, and a professor in Stanford’s Department of Biomedical Data Science. I am pleased to be speaking with him today. John, welcome.
[John Ioannidis]
Thank you so much for the kind invitation, Sandro.
[Sandro Galea]
So let’s start with that paper. I mean you’ve written a lot of papers, but that paper. I’m sure you’re being asked about it all the time, “Why Most Published Research Findings Are False.” So what led you to the research that resulted in the paper? Walk us through why you wrote the paper and its findings and its implications.
[John Ioannidis]
That paper arose probably organically after having a track record of about a dozen years publishing research papers myself, trying to do research in different areas, being involved in the growth of the evidence-based medicine movement and communicating with many colleagues and seeing the work of many colleagues in the area of trying to understand evidence and its biases.
The constant realization was that even though there was excitement that science is very powerful and evidence can be very useful, especially in medicine and public health, biases, errors, and problems with reliability were very, very common. And in my own experience, in my own work, I just could not count the opportunities for errors and mistakes that arose in what I was trying to do, some of which perhaps I could realize and some others probably I could not because I could just not be cognizant of them. So the paper tried to organize a concept, through a mathematical approach, of what is the chance for a discovery, a new discovery of some effect or some relationship, to be true, to be non-null, when it comes out, based on different features. So these features included the presence of bias, how many teams are trying to get to that eureka moment, and also of course, how big the studies are and in a way what the power of the studies would be that are chasing that discovery.
There’s many forces that shape both the design and the execution and then the reporting of the results. So factoring in hopefully reasonable assumptions about the range of these values in different fields, my conclusion was that in most settings, most of these eureka moments probably would be red herrings. They would represent situations where we think that we have found something, but actually this is not there.
And this was congruent with the empirical evidence that both myself and many other people working in evidence-based medicine or in other fields had accumulated based on our frustration to replicate results, to see something be trustworthy across multiple studies that would not be just a reflection of bias. Of course, there’s also tremendous heterogeneity based on how different features of designing, executing, and reporting research would be in different fields. So some types of studies and some types of fields, my paper suggested, would have much higher credibility compared to others that would be much lower.
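[Editor’s note]
The mathematical approach described above can be made concrete. What follows is a minimal sketch, not part of the conversation, of the positive predictive value (PPV) calculation associated with the 2005 paper: the probability that a claimed finding is true, given the pre-study odds R that a tested relationship is real, the type I error rate alpha, the type II error rate beta (so power is 1 - beta), and a bias term u. The function name, parameter names, and example scenarios are illustrative assumptions, and the formula should be checked against the paper before being reused. The effect of multiple competing teams, which the paper also models, is left out here.

```python
def ppv(pre_study_odds, alpha=0.05, beta=0.2, bias=0.0):
    """Probability that a claimed positive finding is true.

    pre_study_odds: R, ratio of true to null relationships among those tested
    alpha:          type I error rate (false-positive rate per test)
    beta:           type II error rate (power = 1 - beta)
    bias:           u, share of would-be negative results reported as positive
    """
    r, u = pre_study_odds, bias
    # True relationships reported as positive: detected by the study itself,
    # plus missed ones that bias turns into positive claims.
    true_positives = (1 - beta) * r + u * beta * r
    # All positive claims: the true positives above, plus null relationships
    # reported as positive through chance (alpha) or through bias.
    all_positives = r + alpha - beta * r + u - u * alpha + u * beta * r
    return true_positives / all_positives


# Hypothetical scenarios, chosen only to show how the inputs move the result.
scenarios = {
    "well-powered, even odds, no bias": dict(pre_study_odds=1.0, beta=0.2, bias=0.0),
    "well-powered, even odds, some bias": dict(pre_study_odds=1.0, beta=0.2, bias=0.1),
    "underpowered, long odds, heavy bias": dict(pre_study_odds=0.001, beta=0.8, bias=0.8),
}

for label, kwargs in scenarios.items():
    print(f"{label}: PPV = {ppv(**kwargs):.4f}")
```

Under these assumptions, a finding is more likely true than false only when PPV exceeds 0.5, which is why low pre-study odds, low power, and high bias together push most claimed discoveries toward being red herrings.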
[Sandro Galea]
So let’s talk a little bit about your findings. To what extent do you think this challenge with replication is a statistical misunderstanding versus deeper structural forces shaping what we do in academia?
[John Ioannidis]
I think it is both. Clearly there’s a lot of misunderstanding about statistical methods, a lot of misuse, a lot of misinterpretation, and there’s many reasons for that. I think that statistics is a complex science and it is applied very widely in situations where the authors, the readers, the editors, the reviewers probably don’t understand the exact implications and what the results might mean in terms of how they would translate them to some rating of credibility of the result. But there’s also very major structural problems, some of them leading to that misuse and misunderstanding of statistics. But the problems go way beyond statistics. They go into incentives, they go into the reward structure, they go into how we evaluate science, they go into how we structure the scientific workforce, how we get recruited, promoted, tenured, how we’re given funding, who is doing science, is it conflicted or unconflicted individuals and organizations and companies, private sector versus public. So the scientific ecosystem is very complex. It is rapidly evolving. It is trying to accommodate a lot of shocks from the evolution of our society and technology, and many of these shocks leave some footprint on the credibility of science that is being done and the credibility of science that is being reported.
[Sandro Galea]
Trying to learn from, let’s say, success, which fields in your view have made the most genuine progress towards improving reproducibility and can we, other disciplines, learn from them?
[John Ioannidis]
I think that many fields have recognized many of the problems and have responded very positively to them. One example that I often quote is how genetic epidemiology evolved. It used to be a discipline, when at least I started getting involved in the late 20th century, that was highly irreproducible. Almost every single genetic association that we would report turned out to be a false positive or not what we thought. The field did realize that and it did take steps, including conducting very large studies with replication as a sine qua non, with very stringent statistical rules, with a lot of transparency, with data sharing, large scale collaborations of multiple teams, and with a more broad approach to testing the genome instead of just testing one hypothesis or a few hypotheses at a time and then cherry-picking results to be reported. The field went from probably credibility of 1% or so to credibility of 99% when the proper methods and the proper process were applied. You know, one might question whether these 99% credible results are eventually also useful. That’s probably a different chapter and it’s not really up to the field, you know, a field has to discover what is to be discovered and we can decide what kind of use we can make of that knowledge and whether we can save lives or communities when it comes to public health. Sometimes we may, sometimes we may not, even if we have solid knowledge. There are several fields and there are several scientific disciplines that made progress. And somehow there has been more cross communication to adopt research practices from fields that have successfully used some of these, let’s say, helpful practices. Even within the same field, though, you will see heterogeneity in terms of adoption of helpful practices.
For example, economics has a long tradition of asking for sharing of our data, at least in the top five journals, which means that people can see what is the data set that supports the analysis that is being presented. But there’s many other fields or subfields within economics and journals that don’t do that. There’s a lot of variability in terms of whether people register their protocols when they have specific ideas that they want to pursue, not talking about entirely exploratory research.
There’s very wide variability on whether they’re transparent about the statistical methods that they use and one can document the sharing of algorithms and codes that have been employed. And even that is evolving over time because with AI and with new tools, new technologies, the standards of what it means to improve are continuously being revisited because it’s new technologies and new tools and new applications that are at stake and improving them may be sometimes unknown what it means until we recognize what these technologies do both in individual studies and at scale.
[Sandro Galea]
Let me broaden the lens a little bit. Let’s move to science a little bit more broadly now. I want to move to the pandemic. You’ve published some papers in the pandemic, some of which were controversial. Talk us through, from your perspective, what did the pandemic reveal about the strengths and weaknesses of contemporary science?
[John Ioannidis]
I think the pandemic was a crash test of science. There was an acute crisis, a major crisis. Science had to respond and it did respond.
Scientists really tried to do their best in terms of producing information, publishing, disseminating. We’ve seen something in the range of one million papers that were published related to COVID-19. And the number of scientists involved in these papers exceeds two million. So people from disciplines that were, let’s say, more relevant or considered to be relevant obviously led that effort, but we had scientists from every single field of science, even the most remotely connected to a pandemic that one can think of. Scientists did respond. Many scientists moved out of their comfort zone. I think when it comes to epidemiology, I saw epidemiology done by every different discipline that one could imagine. And of course, people are welcome. Sometimes you see some very fancy new approaches that are out of the box and no one has thought of before within the discipline, but we also saw a lot of transgression of boundaries in a way that probably the research was very poor and the quality suffered. So the vast majority of these papers showed the weakness of science as it showed the strength of the ability to respond. It showed the weakness of, really, a lot of low quality, very erroneous, very poorly done and poorly reported research that was circulating. And some of that research also had consequences because it was used either for policy or it was uptaken by different sectors and groups of the population and sometimes it was completely wrong.
We saw some of the rapid response being excellent. We saw randomized trials within three months with definitive results, as definitive as it can be on interventions at a large scale with platforms that were very well designed, tell us that these are effective treatments and these are not effective treatments.
We also saw a lot of noise, saw that research that can be done quickly dominated the fray of policymaking. Modeling, which I do love, became very powerful and models can help but they’re pretty weak sometimes in terms of certainty and in terms of how the assumptions that they make can really translate into reality. So I think that we saw both the good face and the bad face of science. We saw all the problems that had been documented before in a magnifying glass in a sense.
And both good and bad signs coexisted at the same time. I don’t think that this is surprising. I think on a good day one might say, well, we did our best. On a bad day one could say, well, that was a disaster. And probably both perspectives would have some rationale behind them.
[Sandro Galea]
Well, let’s tackle that for a second. You know, your critiques can be taken as critiques of science, which I think they are. I take them as critiques aimed to make science better. But of course, one can take them another way, which is to say, well, look, if Professor Ioannidis’ paper says that most findings we cannot replicate and then during a time like COVID, we have a lot of papers published that are perhaps not so good, maybe we really shouldn’t be funding the breadth of science, that we should be limiting our science to things we can control, like quantum computing. So how do you respond to that? How do you, as someone who has often critiqued science, but who has spent a career in science, make sure that your critiques help strengthen science, while at the same time being cognizant of where we can do better?
[John Ioannidis]
I think that the opportunity to be wrong may be one of the major strengths, if not the major strength of science. Other approaches like religion or politics or ideology, they cannot be wrong. They cannot be at fault. They cannot be inferior by definition. So science by definition tries to correct itself. It has a self-correcting mechanism. So personally, I celebrate when I realize that I have made an error that I can correct. And I see critics of my work as my benefactors. They’re the scientists who I really admire and I want to thank them for raising something that can be improved and that can be corrected. So this is not incommensurate with what science is and what the scientific method is and how it should operate. I think it’s not a proper way to try to convince people that science is omnipotent and infallible and one has to believe everything that is in the scientific literature because that narrative immediately collapses by the fact that the scientific literature is often very inconsistent with its own self, with evolving debates, with the results that get refuted, with results that get completely devastated by valid criticism or by subsequent studies that are better done and more properly interpreted. So I don’t see that as a weakness of science. I see it as a need to improve the communication of what science really is. And of course there may be technologies that may be more fancy. You mentioned quantum computing. This is still within science. I don’t see it as something that is outside science, all the new tools of artificial intelligence. They’re new tools that operate or should operate within science. The danger that I see is to perceive that these tools somehow have some immunity from the limitations or from the boundaries or the rules of the scientific method, that somehow they’re so powerful on a computational base that you cannot challenge them. 
I think that that would be a bad idea because organized skepticism is at the heart of the scientific method. So any new tool, no matter how powerful, no matter how computationally fancy, no matter how much it is touted, it should be subject to criticism and it can often be wrong. I don’t think that we have any evidence to suggest that more complex and more sophisticated approaches and more difficult approaches are necessarily better. And sometimes they may be, but other times not.
[Sandro Galea]
So I really like your term, organized skepticism. I feel like if we kept that in mind, we would do much better in our capacity to handle critiques from within science to strengthen science. In terms of emergencies, things that are happening quickly like COVID, what are your thoughts now, perhaps looking back on balancing the need for rapid policy-relevant science against the slower, more rigorous practices that we actually want to have in science?
[John Ioannidis]
I think that science should inform policy, but policy is a very complex decision-making process that involves many other things beyond science. It involves values, involves priorities, it involves considerations of multiple dimensions, it involves consultation with who is affected and who is running the policy and for what reason. Decisions need to be made and not making a decision is also a decision in a way, and both action and inaction are policies.
Policy will always be there, good policy or bad policy. I think the best that we can hope is that there is an open channel of communication with science, that people try to use science with all its certainty and uncertainty to accommodate the certainty and uncertainty of scientific findings into their policymaking. Try to be transparent about what we know and what we don’t know, why policies are shaped could be science but it could be other factors beyond science and importantly not try to subject science to some sort of normality test or of course censorship or fitting to what the policy wants to hear because otherwise there may be a sense that it is being challenged. Science should be free to do its job and policy should be ready to absorb new science, different science, evolving science, and change decisions if new science, along with the other factors that shape policy, suggest that the policy needs to be changed in order to achieve a better outcome. I don’t think we’re good at that. I think the pandemic showed that somehow policy wanted to subvert science in many situations.
Not only will we make a decision, but we will not allow any science that may challenge that decision or may lead us to reconsider and perhaps edit or revise or completely revert our decision. And I think that that was a mistake. Somehow it was felt that science had to fall in line, that it was like a military environment, like war, and in war science has to follow orders and just do what the policymakers want to do and want to hear and want to see.
I don’t think that’s a good recipe. Sometimes it may work if we know very well and the policy has been based on what we know very well and there’s little uncertainty. But many things about the pandemic had tremendous uncertainty. It was a new virus, a new set of circumstances. Many things were very uncertain at the beginning. So I think that leaving science alone to do its job and policy to do its job, but having respect for each other, would be the best way to go.
[Sandro Galea]
Let me ask a very big prognostic question, try to be efficient about it though. Let’s talk about just quickly AI and large language models. What are your thoughts today? I realize your thoughts may evolve a year from now, two years from now, but today in 2026, about what AI, generative AI, large language models are going to do to some of the challenges of science. We started by talking about reproducibility, but any number of challenges.
[John Ioannidis]
I think AI and LLMs can both improve science and deteriorate science.
It depends on how they’re being used. They’re just another set of tools, perhaps more powerful, more computationally advanced compared to what we had in the past. They can expedite and accelerate many of the processes that we have to use to produce scientific work. They can replace some of the scientific processes, perhaps even some of the scientific personnel. I mean we have the ability currently to have AI agents act as researchers. And of course they can act as reviewers, they can act as authors unfortunately, although they don’t have accountability, and they can act in many different ways. It’s up to us how we use these tools. If we use them constructively to do things faster, with better validation, with better accuracy, with the ability to test more things in a more efficient way and choose the best, the faster, compared to what we had as options in the past, I think that that’s very good news. At the same time, if we use these tools just to generate fake papers at massive scale, which is happening at the moment, that would be bad news. We have estimates that the number of fake papers produced and submitted, and submitted means that probably they also get published somehow, somewhere, sooner or later, is escalating very rapidly. We see a lot of fake papers, we see a lot of salami-slicing of papers that are irrelevant, tiny publication units.
We see some diminishing returns in terms of disruptive innovation. Science is more massive but less disruptive and I think that there’s a risk that these tools massively employed without emphasis on innovation may make things even worse in that regard. It’s an open game. I don’t think that we know the outcome yet because it’s really up to us how we use these tools and also what is the environment that encourages the use or the misuse of these tools. I do worry science is really shifting out of the scientific literature with these tools.
I mean, we have complained that, goodness, you just publish papers but don’t really change things. Well, there’s something good about publishing papers, documenting work, being transparent, being thorough, reporting, sharing, communicating.
Most of the most powerful computational organizations at the moment that use such technologies are no longer universities, academic centers, research centers that are used to publishing papers and disseminating knowledge through these traditional ways. They’re organizations and companies that are probably not that much interested in publishing. They just want to find things that would improve their profit margins and therefore many advances may just remain in file drawers. Of course, they’re going to be very big file drawers to accommodate all the bytes of discoveries. And this is clearly a risk if we’re dealing with a new environment of science that is mostly stealth, secretive, privatized, not shared, in a way that science no longer can communicate to a wider public, to a scientific community that should be sharing, as we have been saying all along, and willing to understand what the current evidence is in order to go to the next step, to what the next discovery would be. Discoveries would be hidden from us.
[Sandro Galea]
I agree actually about this challenge about science not happening in scientific journals and the loss that comes with that. Let’s chalk that up as one more threat to science. But if you were to look ahead, let’s look at 20 to 30 years and thinking about some of the critiques you’ve put in the field, thinking about the replication crisis, what would success for science look like in the next 20 to 30 years? What would science look like 20, 30 years from now if we get things right?
[John Ioannidis]
This is a very daring prediction and I’m sure that my margin of error exceeds 100%. I don’t think that anyone knows. Science has surprised us in positive ways and sometimes in negative ways again and again. I tried to make predictions and I also tried to plan for my own work with let’s say five year horizons and I’m always wrong. I always have to revisit what I had planned because things have moved on. There’s new opportunities, there’s new technologies, there’s new tools, there’s refutations, there’s replications that suggest that we need to change course. So this is very unpredictable, but I think that science has a very long track record of surprises. So continuing to surprise us in a positive way, having things in 20 years that right now we cannot even imagine, so that I cannot even tell you one out of the top 10 things that would be the great achievements of science in 20 years, I think that would be a success story. I think a failure story would be 20 years or 30 years from now producing things that we already know, to just continue massively publishing papers that are either fraudulent or unreliable or low quality or just not really disruptive innovation and not really replicable, reproducible research, and just making another interview like this one with the same problems still being with us with really no resolution. I hope that that’s not going to be the case. I’m cautiously optimistic because science has proven again and again that it can move forward, it can make a difference, it can advance. And why not? It should do that in the future as well.
[Sandro Galea]
I like the notion that the success in 20, 30 years is that we’re tackling things that we don’t know what they’re going to be. That’s terrific. Now, let me just ask you a couple more questions, maybe a little bit more personal now, but a slightly different track. You know, in your career, you’ve sometimes found yourself at the center of some controversies. You may face pushback for things you’ve said, work you’ve done. You know, how do you respond to such moments? How can scientists in general do a better job of doing research and articulating views that may find them in such situations?
[John Ioannidis]
As I have often said, I think that critics are our best benefactors when we find ourselves immersed into some sort of debate or even more than debate, because some of these debates in the past have been even personal and have involved threats and even life threats to me and my family. I think we should try to remain calm, remember why we’re doing science, why we’re trying to serve people, why we’re trying to serve our communities, why we try to improve patients as physicians, for example, and try to understand why that debate arises, what might be the other points that have been raised.
Try to make a strong case for the arguments of our opponents. Give the best shot to our opponents to express their arguments. I think that this is the best way. If they’re right, then that would be wonderful. We realize that they’re right and we correct ourselves and our work and we move forward. If they’re wrong, the best way to show that they’re wrong is to give them the podium for a maximum time and to give them all the opportunity to show that they’re wrong. So I think that we need some tolerance, we need some mutual respect, we need some constant faith, in a way, that science will find a way to move forward and try to move beyond the personal ordeal and the personal stress and the personal, sometimes frustrating circumstances.
[Sandro Galea]
Well, that seems like a good prescription, I agree. Another question that’s perhaps also personal, because I read your bio on your official Stanford website. And these bios are, there’s a template for them. But you have a sentence which really I thought was unusual. So here’s what you wrote. You wrote, “I consider myself privileged to have learned and continue to learn from interactions with students and young scientists of all ages from all over the world. And I love to be constantly reminded that I know next to nothing.” So I read this and I thought this was really a lovely note of open-minded humility that could serve as a guiding spirit for science. Now, how do we as scientists, experts in our areas, get better and be better at remembering that, to use your words, we “know next to nothing,” and centering this humility in our work?
[John Ioannidis]
I think that’s not easy. I don’t consider myself to be humble. I think that I’m very arrogant much of the time and I just have to slap myself in the face to remind myself that arrogance does not reach very far in science or in anything. And I think that for all of us, just remembering how little we know, just comparing what we knew in the recent past and how much our world has changed, how much information, knowledge, perhaps a little bit of wisdom has accumulated through a communal effort of millions of people who do science. I think that we need that reminder again and again. The best that can happen to us is at some point to realize that we know even less than next to nothing. That means that we have opened new avenues for asking new questions that we hadn’t even imagined before, for testing new ground, for really coming up with discoveries that were unimaginable.
So that’s really the best that can happen to us in the end.
[Sandro Galea]
Humility is a strange thing, right? Once you think you have it, you’ve lost it. Last question, John. What gives you hope in the moment?
[John Ioannidis]
I think seeing a lot of young scientists who are very enthusiastic, who have brilliant ideas, who can keep me awake when I’m falling asleep in my own theories and dogmas and challenge me. I think this is really wonderful. And young scientists, I have to say, it’s not just an age thing. I think that some people who are chronologically old can still be young at heart and young at mind and willing to challenge primarily our own selves.
[Sandro Galea]
I’m Sandro Galea. I have been talking with Professor John Ioannidis about the future of science. Thank you, John, for joining today.
[John Ioannidis]
Thank you so much, Sandro.
[Sandro Galea]
Thank you to everybody who has joined us for Ideas Matter. I look forward to continuing the conversation.