I recently gave some internal talks at Arria about the history of NLG. Because this was for Arria, I talked about ground-breaking commercial products, especially FoG (Goldberg et al 1994), which was the first commercial NLG system to be deployed operationally, and the system of Kondadadi et al 2013, which is the first ML-based NLG system I am aware of that was operationally deployed.
Anyways, I wanted to tell my audience what happened with these systems after they were deployed. I don’t have complete information, but I believe that both of these systems fell out of operational use after 3-4 years, and this was partially because they were hard to configure, tweak, or otherwise modify. FoG generated weather forecasts for what is now called Environment Canada. Forecasters could of course edit individual forecasts before they were released, but they had very little ability to configure, control, or tweak the software to change the texts it produced; such changes could only be made by the software developers. Kondadadi et al generated financial news stories for Thomson Reuters, and I believe the situation was similar; journalists could post-edit individual stories, but they didn’t have much ability to configure or control the software to change what it produced.
I got a lot of “heads nodding” when I presented this at Arria. Most people who use NLG (especially if they are professionals rather than the general public) want to be able to customise and configure NLG systems so that the systems produce texts which they fully approve of. In some contexts post-editing is possible, but in such cases users still want to be able to adjust the software to produce something which is close to what’s needed, in order to reduce the amount of post-editing required. If users cannot do this, they may reject the NLG system and switch to simple templates, which they are able to customise and configure.
So customisation and configuration are very important in real-world NLG. However, I’m aware of very little research or even discussion about this topic in the academic community, which is a shame.
For rule-based NLG, in principle we can make systems configurable by adding lots of configuration options, and then modifying the rules to take these options into account. There are real challenges in knowing what users will want to configure, designing a good configuration “user experience”, and quality assurance (ie, making sure that high-quality texts are generated for all possible configuration options). But at least the outline of the challenges is clear.
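To make this concrete, here is a minimal sketch (all names and options invented for illustration, not taken from FoG or any real system) of what “configuration options consulted by generation rules” might look like. A real system would have far more options, and the QA challenge is exactly that every combination must still produce a high-quality text.

```python
from dataclasses import dataclass

@dataclass
class ForecastConfig:
    """Hypothetical user-settable options for a forecast generator."""
    use_metric_units: bool = True      # km/h vs mph
    include_wind_gusts: bool = False   # mention gusts or not

def wind_sentence(speed_kmh, gust_kmh, cfg):
    """A generation rule that consults the configuration."""
    unit = "km/h" if cfg.use_metric_units else "mph"
    conv = 1.0 if cfg.use_metric_units else 1 / 1.609
    text = f"Winds of {speed_kmh * conv:.0f} {unit}"
    if cfg.include_wind_gusts:
        text += f", gusting to {gust_kmh * conv:.0f} {unit}"
    return text + "."

print(wind_sentence(30, 50, ForecastConfig(include_wind_gusts=True)))
# Winds of 30 km/h, gusting to 50 km/h.
```

The point of the sketch is that the configuration lives outside the rules, so a user can change it without touching developer code; the hard part in practice is choosing which options to expose and testing every combination.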
I suspect it is harder to configure ML-based NLG systems, and I’m not aware of much work (commercial or academic) on this topic. I guess in principle configuration options could be another input in the training set, but this could drastically increase the amount of training data needed. After all, even a modest number of configuration parameters leads to an explosion of variants: ten independent binary options already allow over a thousand combinations of a text, and twenty allow over a million.
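The combinatorics behind that worry are simple: each independent binary option doubles the number of distinct text variants a system must (naively) see examples of in training.

```python
def variant_count(num_binary_options):
    """Number of distinct configurations given n independent binary options."""
    return 2 ** num_binary_options

print(variant_count(10))  # 1024 -- already over a thousand variants
print(variant_count(20))  # 1048576 -- over a million
```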
What if users want to make changes to texts which developers did not anticipate? For example, assume that a journalist wants to modify an NLG system which produces short financial stories about companies (similar to Kondadadi et al), so that these stories also say whether the company is in a sector vulnerable to Covid; for example, by adding sentences such as
Trendy Hotels Inc is in the hospitality sector, which has been hit hard by Covid.
Let’s assume that the NLG system doesn’t currently do this (because it was created in pre-Covid days). Also assume that the logic used to produce the above sentence is simple, perhaps just looking up the sector of the company, and then checking against a fixed list of vulnerable sectors. Could a journalist update the NLG system without having to involve the developers?
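The logic really is simple; a sketch of it (with made-up data structures standing in for a real company database) is just a lookup and a membership test:

```python
# Fixed list of Covid-vulnerable sectors -- this is exactly the kind of
# thing a journalist might want to edit without involving developers.
VULNERABLE_SECTORS = {"hospitality", "aviation", "retail"}

# Stand-in for a real company-data lookup.
SECTOR_OF_COMPANY = {"Trendy Hotels Inc": "hospitality"}

def covid_sentence(company):
    """Return an extra sentence if the company's sector is vulnerable, else None."""
    sector = SECTOR_OF_COMPANY.get(company)
    if sector in VULNERABLE_SECTORS:
        return (f"{company} is in the {sector} sector, "
                f"which has been hit hard by Covid.")
    return None

print(covid_sentence("Trendy Hotels Inc"))
# Trendy Hotels Inc is in the hospitality sector, which has been hit hard by Covid.
```

The challenge, of course, is not writing these ten lines — it is giving a non-developer a safe way to add this kind of rule to a deployed system.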
For rule-based NLG, I have seen attempts at the above based on either (A) special tools for letting non-developers edit rules or (B) letting users type in model sentences and then analysing the result. However, I don’t think any of the approaches I’ve seen to date (at least in deployed systems) have worked well; the tools (A) in practice often still require a developer “mindset”, and the parsers (B) struggle to incorporate business logic. But I’m sure there are a lot of things I haven’t seen, and I look forward to finding out about innovative and creative ideas!
I don’t have a clue how to even attempt this in an ML or neural NLG system. Maybe we could update the training data to include the new content? However, this could be an enormous amount of work!
Configuring and customising NLG systems is very important, and there is a lot of scope to do this better! I know people are working on this in commercial companies; I’d love to see more academics thinking about this as well.