
Why is adoption of AI in healthcare so slow?

In June 2024 we had a one-day workshop in Aberdeen on AI in healthcare, which included clinicians, vendors, and health managers as well as researchers. One of the paradoxes I saw was that although the research presented was exciting and showed that AI could provide real clinical benefits (BBC News), this technology is not being used.

For example, one speaker said that in 2022 the NHS in Scotland had a total of 5 AI applications deployed and in production usage; in 2024 nothing had changed: the same 5 applications were still in use, and no new AI applications had entered production. Hardly a stampede towards using AI…

So why is adoption of AI technology so slow in healthcare? I have some thoughts below, based largely on discussions I had in the workshop; this builds on comments I made in an earlier blog. I suspect much of what I say will be obvious to some of my readers, but it may be less obvious to others.

Proper evaluation

One barrier to adoption mentioned in the workshop is the lack of solid evidence of effectiveness. The AI community emphasises evaluations based on held-out test sets, but the medical community does not take these seriously. And rightly so, in my opinion. While a careful and rigorous test-set evaluation can tell us something about how well an AI system works, a huge number of published evaluations (including in “top” venues) are highly problematic. For example, they ignore data contamination issues, or they use test data more than once, perhaps by repeatedly tweaking models/systems to get better performance on a test set. Of course there are also some high-quality evals, but I can understand an outsider deciding not to trust any test-set evaluation published in the AI/NLP literature.

What the NHS and other health organisations want to see is evaluations which show real-world impact on patient health or other important outcomes, preferably measured using a randomised controlled trial (RCT). Fortunately, we are now seeing RCTs that show real-world benefits from medical AI, which is encouraging and hopefully will lead to more adoption of AI. I encourage AI researchers who want their tech to actually be used to run RCTs or otherwise show real-world impact.

Cost-benefit and business case

Another topic that came up in the workshop was that there needs to be a “business case” for using AI. In other words, the “return on investment” on building or buying an AI solution needs to be higher than that of, say, using the same money to hire more nurses, or indeed to update obsolete IT hardware and software.

For example, my student Francesco Moramarco worked on an NLP system which created a draft summary of doctor-patient consultations (which doctors had to check and edit). Francesco measured how much time doctors saved by using the system when it was deployed in real-world usage (blog), and found that doctors who used the system reduced summary-writing time by 10%, which is not a lot. I don’t have access to detailed costing data, but I suspect that it would be hard to build a good business case which justified the expense of building, running, supporting, and maintaining a complex NLP system whose main benefit was making doctors 10% more efficient at writing one type of summary.

A health manager at the workshop made the related comment that it would be easier to justify AI systems if they did more. For example, current AI models for interpreting scans and other medical images focus on detecting specific problems, which means that the NHS would need to deploy dozens (hundreds?) of AI models to get decent coverage of scan interpretation. Which is not going to happen; it would be too costly from an IT, regulatory, and support perspective. If someone could bundle the models together into a single AI system, this would be much more attractive.

Address real needs

Perhaps most fundamentally, clinicians and others at the workshop expressed concern that AI research was not focusing on developing solutions which made sense for the health service. For example, instead of building an AI tool to interpret scans (and essentially replace a radiologist), why not use AI to help human radiologists do a better job? This could provide more value to the NHS, and would be much easier to integrate into existing workflows. Indeed, one vendor commented to me that his company was focusing on using AI techniques to enhance images from their scanners in order to make it easier for radiologists to spot pathologies, because there was much more demand for this from their clients than for full-blown AI image interpretation.

A clinician at the workshop made this point very strongly to me, with several plausible suggestions about how relatively simple AI could really help him do a better job. Which raises the question of why the AI community seems fixated on replacing doctors instead of helping doctors do a better job…

A related point is that common AI/medicine use cases such as interpreting scans do not address the most critical challenges facing the NHS. Aggarwal et al. (2024) list ten “pressure points” for cancer treatment in the UK NHS, and are skeptical about whether technology can help address these. For example, with regard to widening inequality in cancer care (part of their first “pressure point”), they say:

More generally, sociodemographic inequalities require social rather than technical fixes. A common fallacy of decision-makers is that technology-based tools can reverse inequalities. The reality is that technologies deeply modify interactions between patients and systems generating additional barriers for those with poor digital or health literacy. We caution against technocentric approaches without robust evaluation from an equity perspective. Furthermore, when resources are restricted, access to optimal, timely care depends heavily on patients’ negotiating power, which is lower for socially disadvantaged people, and is reflected in lower rates of second opinions or travel to alternative, more distant centres for better or quicker care.

Final Thoughts

Talking to clinicians, health managers, and vendors at the Aberdeen workshop was an eye-opening experience for me, and brought home that if the AI community really wants to make a difference in healthcare, it needs to change what it does. Most fundamentally, it needs to understand the requirements of the health sector (including evaluation, challenges, and business cases, as well as things I have not mentioned here such as regulatory approval), and develop solutions based on these requirements. Of course there are many researchers who are already doing this. Unfortunately there are also many researchers who don’t have a clue about what the health sector needs, and are happy to push out large numbers of papers of dubious medical value which tweak models to do better on test sets for problems which have low business value and/or don’t address the sector’s fundamental challenges.

And I suspect the problem is much wider than medicine, and the same thing is happening in other application areas…
