We can build much better NLG systems if we understand what users want the systems to do! This may sound trite, but there is very little research in the academic community on understanding user needs and requirements, which is a shame and a lost opportunity.
I was very happy to win an INLG Test of Time award for my paper “An Architecture for Data-to-Text Systems”, so I thought I’d write a few comments on it.
A travelogue about a recent bike trip. After two years of being limited in my holidays by Covid, it was great to finally be able to do some cycle touring again!
Society (and most funding agencies) wants to see real-world benefits or “impact” from academic research. Of course not all research will have real-world impact, and impact may take years or decades to appear! I share some thoughts on types of impact, barriers to impact, and my personal experiences.
I am excited by the idea of using error annotation to evaluate NLG systems, where domain experts or other knowledgeable people mark up individual errors in generated texts. I think this is usually more meaningful and gives better insights than asking crowdworkers to rate or rank texts, which is how most human evaluations are currently done.
Progress in NLG requires understanding what users want, creating high quality data sets, building models and algorithms, and thoroughly evaluating systems. I remain disappointed that the research community seems fixated on building models and pays much less attention to user needs, datasets, and evaluation.
The most meaningful evaluation is when we test whether an NLG system actually achieves its communicative goal, e.g. helps people make better decisions or write documents faster. Unfortunately such “extrinsic” or “task” evaluation is rare in NLP in 2022; we need to see more such evaluations!
I’ve come to realise that there is some confusion, especially amongst newcomers to NLP/AI, about when a research paper can be presented at two venues. I try to explain the rules and principles as I understand them.
The ROUGE metric dominates evaluation of summarisation, and I do not understand why. I am not aware of good evidence that ROUGE predicts utility, and recent work by one of my students shows that character-level edit (Levenshtein) distance against a reference text is a better predictor of utility than ROUGE.
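For readers unfamiliar with the metric mentioned above, character-level edit (Levenshtein) distance counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below is purely illustrative; the function names and the normalisation step are my own, not the student's actual implementation.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of character edits (insert/delete/substitute) to turn a into b."""
    # Classic dynamic-programming formulation, keeping only one row at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # substitute (free if equal)
        prev = curr
    return prev[-1]

def edit_similarity(candidate: str, reference: str) -> float:
    """Scale distance to [0, 1]; 1.0 means the candidate matches the reference exactly."""
    if not candidate and not reference:
        return 1.0
    return 1 - levenshtein(candidate, reference) / max(len(candidate), len(reference))
```

A score like `edit_similarity(generated_text, reference_text)` can then be correlated against human utility judgements, in the same way ROUGE scores usually are.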
Some of my PhD students have recently looked at how many mistakes people (professionals, not Turkers) make when they do NLG-like tasks. The number of mistakes is considerably higher than we expected (although still much lower than the number of mistakes made by current neural NLG systems).