NLG vs Templates: Levels of Sophistication in Generating Text

In 1995, I published a workshop paper on “NLG vs Templates”; this went on to become one of my 10 most-cited papers, even though it was just a workshop paper. 21 years later, I still often get asked what the difference is between NLG and templates, and how NLG is better than templates.

As a scientist, my first reaction is that it’s impossible to answer the question because the terms are not well defined. That is, I feel a bit like the linguist Anthony Woodbury: when he was asked how many words Eskimos have for snow, he replied that the answer depended on how you defined “Eskimo”, “word”, and “snow”. Likewise, the difference between “NLG” and “template” depends on how these terms are defined.

So rather than talk about “NLG” vs “templates”, I will talk about different levels of sophistication in generating texts.  These levels are still of course vague and have fuzzy boundaries, but I think this is a step forward from comparing “NLG” and “templates”.

Level 1: Simple Fill-In-The-Blank Systems

Level 1 systems are basic fill-in-the-blank template systems.  A good example is MS Word mailmerge, which essentially allows Word documents to include gaps, which are filled in using data retrieved from a spreadsheet row, database table entry, etc.  There is some support for varying how the data is expressed (eg, whether numbers are spelled out) and for conditionals (eg, including extra content if the data meets a constraint), but it’s pretty limited.

Word mailmerge is absolutely fine for many applications, but it gets frustrating fast if you are trying to do something complex.
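To make Level 1 concrete, here is a minimal fill-in-the-blank sketch in Python (the template text and field names are invented for illustration):

```python
# A Level 1 system: a fixed template with named gaps, filled from one
# record of data (a spreadsheet row, a database table entry, etc.).
TEMPLATE = "Dear {name}, your balance on {date} was {balance} GBP."

def fill(record):
    # No linguistic processing: each value is inserted exactly as given.
    return TEMPLATE.format(**record)

print(fill({"name": "Ms Smith", "date": "1 May", "balance": "250.00"}))
# Dear Ms Smith, your balance on 1 May was 250.00 GBP.
```

Everything beyond slot-filling — varying how values are expressed, conditional content — has to be bolted on, which is exactly where Level 1 tools run out of road.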

Level 2: Scripts or Rules Producing Text

Level 2 systems add general-purpose programming constructs to Level 1 systems.  This is usually done either via a scripting language or by using business rules.  The scripting approach, exemplified by web templating languages, essentially embeds the template inside a general-purpose scripting language, which supports complex conditionals, loops, access to code libraries, etc.  Business rule systems, including most document composition tools, take a similar approach, but focus on writing business rules rather than scripts.

Adding general-purpose programming to a template language certainly makes it much more powerful and useful, and this is a sensible approach in many contexts.  But the lack of any linguistic capabilities makes it difficult to build systems that reliably generate complex high-quality texts.
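A Level 2 sketch might look like the following (the report wording is invented); note how loops and conditionals let the text adapt to the data:

```python
# Level 2: the template lives inside a general-purpose language,
# so loops, conditionals, and library calls are all available.
def account_summary(name, transactions):
    lines = [f"Dear {name},"]
    if not transactions:
        lines.append("You had no transactions this month.")
    else:
        lines.append(f"You had {len(transactions)} transactions this month:")
        for t in transactions:
            lines.append(f"- {t['desc']}: {t['amount']:.2f} GBP")
        total = sum(t["amount"] for t in transactions)
        lines.append(f"Your net change was {total:.2f} GBP.")
    return "\n".join(lines)
```

Note that with a single transaction this produces “You had 1 transactions” — exactly the kind of error that motivates the word-level grammatical functions of Level 3.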

Level 3: Word-Level Grammatical Functions

Level 3 systems add word-level grammatical functions to Level 2 systems, which deal with things like morphology (eg, the plural of child is children, not childs), morphophonology (eg, choosing between a or an), and orthography (eg, one “.” instead of two “..” at the end of I like Washington D.C.).  These functions make it significantly easier to generate grammatically correct texts, and ease the pain of writing complex template systems.

Functions like pluraliseNoun are certainly useful for grammatical correctness. However, they don’t help with other aspects of writing.
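Toy versions of two such functions are sketched below; real realisation toolkits cover far more morphology, and the exception tables here are purely illustrative:

```python
# Level 3 sketch: word-level grammatical functions.
def pluralise_noun(noun, count):
    # Morphology: an exception table plus a default "-s" rule.
    irregular = {"child": "children", "person": "people", "sheep": "sheep"}
    if count == 1:
        return noun
    return irregular.get(noun, noun + "s")

def indefinite_article(noun):
    # Crude morphophonology: "an" before a vowel letter. Real systems
    # key off the vowel *sound* (eg, "an hour", "a unicorn").
    return "an" if noun[0].lower() in "aeiou" else "a"

print(f"3 {pluralise_noun('child', 3)}")       # 3 children
print(f"{indefinite_article('apple')} apple")  # an apple
```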

Level 4: Dynamically Creating Sentences

Level 4 systems dynamically create sentences (and perhaps paragraphs) from representations of the meaning to be conveyed by the sentence and/or its desired linguistic structure.  Dynamically creating sentences in this fashion means the system can do sensible things in unusual (edge) cases, without needing the developer to explicitly write code for every edge (boundary) case.  It also allows the system to linguistically “optimise” sentences in a number of ways, including reference, aggregation, ordering, and connectives.  For example, producing “John was hungry, so he ate an apple.  He was also cold.” instead of “John was hungry.  John was cold.  John ate an apple.”
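A toy sketch of the pronominalisation and causal aggregation behind this example (the message representation is invented, and the connective “also” is hand-coded in the input here, whereas a real microplanner would decide where to insert it):

```python
def realise(messages):
    # Toy micro-planning: pronominalise a repeated subject, and merge
    # causally linked messages into one sentence using "so".
    out, prev_subj = [], None
    for msg in messages:
        subj = "he" if msg["subj"] == prev_subj else msg["subj"]
        clause = f"{subj} {msg['pred']}"
        if msg.get("cause_of_prev") and out:
            # Fold this clause into the previous sentence.
            out[-1] = out[-1][:-1] + f", so {clause}."
        else:
            out.append(clause[0].upper() + clause[1:] + ".")
        prev_subj = msg["subj"]
    return " ".join(out)

msgs = [
    {"subj": "John", "pred": "was hungry"},
    {"subj": "John", "pred": "ate an apple", "cause_of_prev": True},
    {"subj": "John", "pred": "was also cold"},
]
print(realise(msgs))
# John was hungry, so he ate an apple. He was also cold.
```

Even this tiny sketch shows the payoff: the aggregation and reference decisions are made once, in code, rather than being baked into every template variant by hand.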

Level 4 systems do an excellent job of producing high-quality sentences (and paragraphs), that is, “micro-level” writing.  But more is needed to do an excellent job at “macro-level” writing.

Level 5: Dynamically Creating Documents

Level 5 systems add intelligence to the “macro-writing” task, that is, to the task of producing a document which is relevant and useful to its readers, and also well-structured (for example, as a narrative).  How this is done depends on the goal of the text.  For example, a text that is intended to be persuasive may be based on models of argumentation and behaviour change; while a text that summarises data for decision support may be based on an analysis of key factors that influence the decision, plus models of narrative and human decision-making.
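As a very rough sketch of the content-selection side of macro-writing (the relevance scores, field names, and two-part plan below are all invented for illustration; real systems use much richer models of the reader and the narrative):

```python
# Hypothetical Level 5 fragment: select the messages most relevant to the
# reader's decision, then arrange them into a simple document plan.
def plan_document(messages, relevance_threshold=0.5):
    selected = [m for m in messages if m["relevance"] >= relevance_threshold]
    selected.sort(key=lambda m: m["relevance"], reverse=True)
    return {
        "overview": selected[0]["text"] if selected else "",
        "details": [m["text"] for m in selected[1:]],
    }

plan = plan_document([
    {"text": "Sales fell 10% in March", "relevance": 0.9},
    {"text": "The weather was mild", "relevance": 0.2},
    {"text": "Costs rose sharply", "relevance": 0.7},
])
# {'overview': 'Sales fell 10% in March', 'details': ['Costs rose sharply']}
```

The point is not the particular heuristic but the division of labour: the document plan decides *what* to say and in *what order*, and the lower levels then decide how to say it.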

Level 5 systems do an excellent job of producing good quality narratives.  Of course there is still room for improvement, and no doubt other levels could be added to this list!

“NLG” vs “Templates” Again

I think everyone would agree that Level 1 (simple fill-in-the-blank) is “templates” and Level 5 (dynamically creating documents) is “NLG”.  But I don’t think there is agreement beyond this.  Certainly many companies with a Level 3 offering claim to do NLG, and I suspect a few companies with Level 2 offerings also claim to do NLG.  On the other hand, I have academic colleagues who only consider Level 5 to be “real” NLG.

As for me, I would prefer to avoid talking about “NLG” vs “templates”, and instead discuss systems, algorithms, toolkits, etc in terms of which of the above levels they support.


3 thoughts on “NLG vs Templates: Levels of Sophistication in Generating Text”

  1. Thanks for your ideas. I think it is necessary to consider which subtasks the template approach tackles: you can have a simple mapping between inputs and templates and just fill slots; or you can have templates for sentences and use other solutions for the Content Determination and Sentence Aggregation subtasks.
    So, are template-based approaches those that just use templates in any way, or those that naively fill templates with input? I think that when the NLG vs template-based discussion arises, it’s mainly about templates as an end-to-end solution for automatic text generation, not about the use of templates in some of the subtasks.
    I like templates because of their simplicity and interpretability, which is important when we work with data-driven approaches.
    The NLG vs template-based discussion seems like the discussion about machine learning models and their complexity — with more complexity you lose interpretability and run more risk of overfitting.

