I recently gave a SIGGEN webinar on the history of NLG (YouTube link). At the end of the talk, I decided to add a slide on my vision for the future, and basically said that I hoped NLG could help people (general public as well as professionals) understand complex data and AI reasoning. “Humanising” data and AI reasoning makes them much more useful to us, and also means that we are in control.
Problem: Too much data and incomprehensible AI
One of the defining features of the modern world is the huge (and exponentially increasing) amount of data which is available. This data has the potential to transform our lives, by giving us a much better understanding of our world and the impact of our actions. But the flood of data means that we cannot pore over individual elements; we need tools to help us understand the data and draw insights from it. These include statistical tools, visualisations, and (more recently) machine learning algorithms.
But the tools that we have are not good enough, especially for ordinary people (the general public) who may not understand statistics or data visualisations; they can blindly follow instructions from an AI/ML system, but this makes them the servant, not the master, of the AI. This is not acceptable in a democratic society which is increasingly driven by data and AI, and where we have major concerns about racial and other biases in datasets and AI reasoning. Professionals are also struggling; for example, I have talked to doctors and engineers who say that it takes them much longer to make decisions (and decision quality is not much better) because they feel they have to examine all of the data sources which are available to them.
In short, we need to “humanise” and “democratise” data and AI, so that people (including the general public) can understand data and AI; this is the only way to ensure that we remain the masters, not the servants. And I think NLG can play a really important role here! We can use language to communicate insights and understanding about data and reasoning which are difficult to convey visually, especially to the general public (many of whom struggle to understand graphics more complex than a simple bar chart). And we can also use language to communicate background, context, caveats, etc, which will help people make best use of the data and insights.
Challenges for NLG
So my vision for NLG is that it will democratise and humanise data and AI reasoning, by providing accessible summaries, explanations, etc of the data and/or reasoning, including summaries which are accessible to the general public. Indeed one of the biggest commercial successes of NLG in 2021 is in business intelligence, where it is used to summarise and give insights about business data (eg, sales and profits). This is a great start, but we need to be able to summarise (etc) much more complex data sets than is possible in 2021, and to do this in a way which makes sense to the general public.
There are huge challenges in doing this, including
- Finding key insights in data
- Presenting them as a story
- Making texts sensitive to user goals, knowledge, and emotions
- Doing better science: evaluation, hypothesis testing, etc
Of course there are many other challenges as well; the above are just a few of the many we face in achieving this vision!
In NLG, as in so many other things, “content is king”. NLG summaries and explanations of data and reasoning must contain useful and accurate insights and information. If they don’t, they are not useful, no matter how well they are written.
So our data-to-text systems must include analytics and content selection modules which produce valuable insights and information which can be expressed in language (and perhaps cannot be expressed well in visualisations). Much of this logic will be domain-specific, but I believe there are general principles here as well, including about what kind of information is best expressed in words instead of graphs.
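To make this architecture concrete, here is a minimal sketch of a data-to-text pipeline with separate analytics, content-selection, and realisation modules. This is my own illustrative code, not any particular system: the sales figures, insight types, and templates are all invented for the example.

```python
from statistics import mean

# Hypothetical monthly sales figures (illustrative data only)
sales = {"Jan": 120, "Feb": 135, "Mar": 150, "Apr": 110}

def analyse(data):
    """Analytics module: derive candidate insights from raw data."""
    values = list(data.values())
    best = max(data, key=data.get)
    worst = min(data, key=data.get)
    insights = [
        ("best_month", best, data[best]),
        ("worst_month", worst, data[worst]),
    ]
    if values[-1] < mean(values):
        insights.append(("below_average_finish", list(data)[-1], values[-1]))
    return insights

def select_content(insights, max_messages=2):
    """Content selection: keep only the most useful insights
    (here, crudely, the first few; a real system would rank them)."""
    return insights[:max_messages]

def realise(insights):
    """Realisation: express the selected insights in English."""
    templates = {
        "best_month": "Sales peaked in {0} at {1} units.",
        "worst_month": "The weakest month was {0}, with {1} units.",
        "below_average_finish": "{0} ended below the period average ({1} units).",
    }
    return " ".join(templates[kind].format(*args) for kind, *args in insights)

summary = realise(select_content(analyse(sales)))
print(summary)  # → Sales peaked in Mar at 150 units. The weakest month was Apr, with 110 units.
```

The point of the sketch is the separation of concerns: the analytics and content-selection logic is where the domain-specific “valuable insight” work happens, and the realisation stage is comparatively simple.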
I would love to see more research on “articulate analytics” and “textual vs visual presentation of information”. Unfortunately, I am seeing less of this in the NLG community than I did ten years ago, in part because so many researchers are focusing on simple data sets for which content selection is not an issue.
Presenting insights as a story
People want information presented as stories or narratives; hence our NLG texts will be much more effective if they can present insights and information as a story rather than as a list of bullet points.
Unfortunately, we don’t currently have a good model or understanding of how to generate stories or narratives from data. We have some interesting ideas, but we are a long way from being able to build systems which can reliably and accurately create stories around insights. This is especially true because most of the research I have seen on narrative generation focuses on generating fictional stories, not narratives which communicate key insights about data and reasoning.
This is another topic that I would love to see more research on!
Sensitivity to users
Our NLG texts will be much more effective if they are tailored for the reader. This is especially important when communicating to the “general public”, which includes people with a huge range of abilities and backgrounds. For example, when we worked on generating reports for parents of babies in neonatal ICU many years ago, we had to try to cater to parents who ranged from medical professionals to teenagers who had left school at 16.
I am becoming increasingly convinced that we need to tailor texts to readers’ emotional states as well as their abilities and knowledge. Certainly parents of babies in neonatal ICU were usually highly stressed, and this affected how we interacted with them. One of my PhD students, Simone Balloccu, has recently started looking at stress-based tailoring, and I think we need to see more research in this area. Of course these dimensions interact; eg, stress can reduce a person’s ability to comprehend complex texts.
There is of course a lot of research on user modelling, adaptive user interfaces, and so forth, but to date this has played only a minor role in NLG. I think this needs to change.
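As a toy illustration of what such tailoring might look like, the sketch below varies both terminology and tone based on a simple reader profile. The reader model, the two dimensions (expertise and stress), and the wording are all my own assumptions for the example; this is not the neonatal ICU system mentioned above.

```python
from dataclasses import dataclass

@dataclass
class Reader:
    """Hypothetical reader model with two illustrative dimensions."""
    expert: bool      # eg, medical professional vs lay parent
    stressed: bool    # eg, parent of a baby in intensive care

def describe_temperature(temp_c: float, reader: Reader) -> str:
    """Express one insight -- a baby's temperature -- tailored to the reader."""
    if reader.expert:
        # Technical register for professionals
        text = f"Core temperature {temp_c}\u00b0C (mild pyrexia)."
    else:
        # Plain language for lay readers
        text = f"Your baby's temperature is {temp_c}\u00b0C, a little above normal."
    if reader.stressed and not reader.expert:
        # Soften the message for stressed lay readers
        text += " This is common and the nurses are keeping a close eye on it."
    return text

print(describe_temperature(38.2, Reader(expert=True, stressed=False)))
print(describe_temperature(38.2, Reader(expert=False, stressed=True)))
```

Even this trivial example shows the interaction the post describes: the stress dimension changes not just what is said but how much reassurance accompanies it.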
Last but not least, if we are going to use advanced NLG in real-world settings to improve the world, then we (the research community) need to be better scientists! In particular, we need to do careful experiments with clear hypotheses and meaningful evaluations, which can be replicated by others. Otherwise, our findings may be meaningless.
I think the AI, NLP, and NLG communities have come a long way in this regard. When I started my PhD in 1985, a lot of AI PhDs were “gee whiz” projects which demonstrated very sophisticated functionality on a handful of examples, without any evaluation other than “isn’t this amazing”. We’ve come a long way since 1985; for example, almost all NLP papers in 2021 include substantial evaluations.
However, in all honesty I think there is still a long way to go before our scientific standards (rigour, reproducibility, etc) are comparable to those in medicine, biology, physics, etc. In particular, I am frustrated that so many researchers in NLG continue to use BLEU to evaluate their systems, despite all of the evidence that BLEU evaluations are meaningless in NLG. I hope our culture evolves so that use of an evaluation metric which is known to be meaningless is no longer acceptable in our community.
Of course there are many other challenges in achieving the vision of using NLG to humanise and democratise data and AI reasoning! For example, such systems probably need to be interactive, and we don’t have a good understanding of how to make complex NLG systems interactive.
But in any case, if we can achieve this vision, then I think NLG will make a real difference to society!