One of the challenges in data-to-text NLG is creating good summaries and insights when the input is flawed (incomplete, incorrect, or inconsistent). One of my PhD students has been working on this problem, and it is a hard one! But a good solution would be hugely valuable for society. I may be able to offer a PhD studentship in this area, contact me if interested.
I’m excited by the potential of adding conversational capabilities to data-to-text systems, so that users can provide context, ask follow-up questions, etc. I think this is essential to my vision of using NLG to humanise data and AI!
I teach an MSc course on Evaluating AI, which several people have asked me about. In this blog I give an overview of what the course covers. Hopefully this will be useful to people who are interested in learning about (or teaching) evaluation.
Salesforce has announced that it is buying the NLG company Narrative Science, which will become part of Tableau, Salesforce’s business intelligence team. This highlights that NLG is being taken very seriously in the business intelligence world, and indeed BI looks like it could be a “killer app” for NLG.
The real-world usefulness of NLG systems depends on many different factors, not just the accuracy and fluency of generated texts. We should evaluate the real-world utility of our systems, and check how well existing evaluation techniques (metrics and Turker-based human evaluation) correlate with real-world utility.
The fundamental challenges of building useful data-to-text NLG systems are the same regardless of whether we build systems with rules or transformers. We need to understand where NLG is useful, choose good content to communicate, robustly deal with edge cases, allow users to configure and control the system, and evaluate properly. I’d like to see more research on these fundamental issues, regardless of technology used.
(Personal blog) I’m 61, so I’m starting to think about retirement. I’m not planning to retire until 2025 at the earliest, but that’s close enough that it’s starting to have an impact on my commitments and activities.
When I asked participants what they most liked at the recent INLG conference, people highlighted events and sessions which focused on discussion and interaction, not technical research papers. Perhaps there is a lesson here that conferences should focus more on interaction and community, and not simply be regarded as venues for presenting research papers.
One of the highlights of INLG for me was the panel on “What users want from real world NLG”. I summarise a *few* of the really interesting points made about trust, authoring, configurability, human-in-loop, and other key issues for real-world NLG users.
Anya Belz and I are looking for a research fellow to work on a new project on reproducibility of human evaluations of NLP systems. This is a great opportunity for a researcher who wants to improve the scientific quality of human evaluations in NLP!