evaluation

Do LLMs cheat on benchmarks?

LLMs often “cheat” on benchmarks via data contamination and reward hacking. Unfortunately, this problem seems to be getting worse, perhaps because of perverse incentives. If we want to genuinely and meaningfully evaluate LLMs, we need to move beyond benchmarks and start measuring real-world impact.

AI in Healthcare

Most common uses of AI in healthcare

I review some data on usage of AI in healthcare, and conclude that the most common uses in 2025 are probably (A) giving personalised health information to patients and (B) helping clinicians write documents. We’ve worked on both of these topics at Aberdeen; most researchers, however, focus on AI for decision support, which is not widely used in practice.

evaluation

More on evaluating impact

I recently published a paper and gave a talk about evaluating real-world impact. I received some great feedback on both, and here I summarise some of the papers readers suggested (including more examples of impact eval) and the insightful comments I received (eg, about the eval “ecosystem”).