
Using language models to improve rule/template NLG

I see growing interest in the applied/commercial data-to-text NLG community (i.e., people who are trying to build real-world data-to-text solutions) in using ML models to improve the output of a rule- or template-based NLG system. In their simplest form, the models can be used to fix grammatical problems in generated texts, e.g., change “a apple” to “an apple”. More ambitiously, models can be used to rewrite texts in a more general way to make them more fluent. For example, Kale and Rastogi (2020) train a T5 language model to rewrite templatish texts into more fluent texts. An example from their paper (Figure 5) is:

Input: Flights offer(airlines=American Airlines, outbound departure time=2:40 pm, is nonstop=True, price=$78)

Template output: Would you like to fly with American Airlines? The onward flight takes off at 2:40 pm. It is a direct flight. The ticket costs $78.

Template text improved by T5 model: How about an American Airlines flight that leaves at 2:40 pm? It’s a direct flight and costs $78.
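I can’t reproduce their fine-tuned model here, but to make the mechanics concrete, here is a minimal sketch of the rewriting step using an off-the-shelf seq2seq model from the HuggingFace transformers library. The checkpoint name, prompt format, and decoding settings are my own illustrative assumptions; Kale and Rastogi fine-tune T5 on pairs of template texts and fluent human-written texts, and a model like this would only produce useful rewrites after similar fine-tuning.

```python
# Sketch of the "rewrite template text with a seq2seq LM" step.
# "t5-base" is a placeholder; in practice you would use a checkpoint
# fine-tuned on (template text, fluent text) pairs, possibly with a
# task prefix of your own choosing.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "t5-base"  # assumption: stands in for a fine-tuned rewriter

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def rewrite(template_text: str) -> str:
    """Ask the model to turn stilted template output into fluent text."""
    inputs = tokenizer(template_text, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

template_text = ("Would you like to fly with American Airlines? The onward "
                 "flight takes off at 2:40 pm. It is a direct flight. "
                 "The ticket costs $78.")
print(rewrite(template_text))
```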

I recently wrote a blog on “Boring uses of language models”, and I imagine that most academics would regard using an LM to improve a template text as pretty boring, and much less exciting than using the language model to generate text directly from the input, without any templates. However, from the perspective of people who are trying to build real-world data-to-text NLG systems, it’s hard to use “100% pure neural” NLG because (A) neural NLG systems get content wrong (hallucinations and omissions) and (B) neural NLG systems are hard to configure and control. But using “100% rule-based” NLG also has problems; in particular, engineering the rules/templates to produce fluent narratives in all contexts (including edge cases) is a pain.

The advantage of using LMs to improve the output of a rule-based NLG system is that in this architecture all of the content decisions are made by the rules, so neural hallucination/omission is much less of a concern. Also, developers and domain experts can edit the rules, so the system is configurable. On the other hand, the people building the system don’t need to carefully engineer rules to guarantee fluency in all cases, because the language model takes care of this; this makes it much easier to write the rules. So this seems like a good way to combine neural and rule-based technologies in order to play to the strengths of both.
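To make the division of labour concrete, here is a deliberately simple sketch of the rule/template side for the flight example above. The field names are my own assumptions rather than anything from the paper; the point is that every fact in the output is placed there explicitly by a rule, so content is fully under the developer’s control.

```python
# A minimal template realizer: every fact in the output comes straight
# from the input data, so content decisions are entirely rule-driven.
# Field names here are illustrative assumptions.
def realize_flight_offer(d: dict) -> str:
    parts = [f"Would you like to fly with {d['airline']}?",
             f"The onward flight takes off at {d['departure_time']}."]
    if d.get("is_nonstop"):
        parts.append("It is a direct flight.")
    parts.append(f"The ticket costs {d['price']}.")
    return " ".join(parts)

print(realize_flight_offer({"airline": "American Airlines",
                            "departure_time": "2:40 pm",
                            "is_nonstop": True,
                            "price": "$78"}))
```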

The other nice thing about this approach is that if any potential problems are detected in the language-model version of the text, the system can simply present the output of the rule/template system to the user.
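I don’t know exactly how such problems would be detected in practice; as an illustrative (and deliberately crude) heuristic of my own, the sketch below accepts the LM rewrite only if every input data value still appears in it, and otherwise falls back to the safe template text.

```python
# Sketch of the fallback logic: serve the LM rewrite only if every input
# value still appears in it, otherwise serve the template text.
# The substring check is a crude illustrative heuristic, not a method
# from Kale and Rastogi; a real system would need something stronger.
def choose_output(data: dict, template_text: str, lm_text: str) -> str:
    values = [str(v) for v in data.values() if not isinstance(v, bool)]
    if all(v in lm_text for v in values):
        return lm_text      # rewrite kept all the content
    return template_text    # something went missing: fall back

data = {"airline": "American Airlines", "departure_time": "2:40 pm",
        "is_nonstop": True, "price": "$78"}
template_text = ("Would you like to fly with American Airlines? The onward "
                 "flight takes off at 2:40 pm. It is a direct flight. "
                 "The ticket costs $78.")
lm_text = ("How about an American Airlines flight that leaves at 2:40 pm? "
           "It's a direct flight and costs $78.")
print(choose_output(data, template_text, lm_text))  # prints the LM version
```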

Of course, there are a lot of problems and challenges with this approach! In particular, if the neural LM gets too aggressive in its rewriting, I suspect there is a danger it will start introducing hallucinations and omissions. So it may be best to start with relatively simple/straightforward rewriting. But overall I do think this approach has potential for building robust real-world NLG systems, and is a good example of a “boring but useful” way to use LMs.
