Skills Required to Use Different NLG Technologies

There are many ways of building NLG systems, and many discussions (including on my blog) about the relative merits of smart templating (eg, Arria Studio), rule-based programmatic approaches (eg, simplenlg), and machine learning.  Most of these discussions focus on output quality, robustness and reliability, programming effort, and availability of resources such as corpora.

Another important criterion, which I don't think has been discussed as much (it is certainly barely mentioned in the research literature), is the skills required to build systems using these different approaches.  Over the past few years, I've had a number of undergraduate and MSc students build NLG systems using the above approaches. Obviously I cannot say anything in a public forum about individual students!  But nonetheless, I think I see some patterns emerging.

Machine learning: You need a lot of specialist background to be able to build good-quality NLG systems using machine learning techniques (especially deep learning).  Of course it depends on the individual, but I suspect successful practitioners in this space tend to have several years of postgraduate training (MSc or PhD), plus excellent software development skills.  Indeed, none of my undergraduate or MSc students have had this level of training, and none of them (to date) have been able to build a high-quality ML NLG system in a project context, where they start from scratch (as opposed to “lab” settings where the students are given explicit scripts and instructions for building a system).

Note that this is different from using ML to build classifiers, which does not require this level of expertise and is routinely done by undergraduate and MSc students.  Maybe this is because there are better tools and training material for building ML classifiers than for building ML NLG systems?  Also, I personally suspect (others may disagree) that building ML NLG systems is fundamentally a more difficult task than building ML classifiers, not least because the output is much more complex (texts vs categories).

Programmatic approaches such as simplenlg: You need to be a good programmer to use simplenlg and similar tools, and you also need some understanding of linguistic concepts (eg, know what a noun phrase is) and some background in NLG.  Good undergraduate and MSc students can have these skills, especially if they have done a module on NLG (or a module on NLP which includes a section on NLG).  In other words, if a person has decent software development skills and some understanding of basic linguistic concepts, then we can fairly quickly teach this person to effectively use simplenlg and similar tools.
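To give a feel for what "programmatic" means here, below is a toy Python sketch (this is not the real simplenlg API, which is a Java library; the function and its morphology rules are simplified inventions for illustration). The idea is that the developer specifies content abstractly, and the realiser handles linguistic details such as tense and agreement:

```python
# Toy illustration of a rule-based realiser: the caller gives an
# abstract clause specification, and the "library" handles morphology.
# Real realisers like simplenlg consult a lexicon and handle irregular
# verbs, noun phrases, negation, etc.; this sketch only does regular verbs.

def realise_clause(subject, verb, obj, tense="present"):
    """Render an abstract (subject, verb, object) clause as a sentence."""
    if tense == "past":
        # Simplified regular-verb morphology: "chase" -> "chased"
        verb = verb + "d" if verb.endswith("e") else verb + "ed"
    elif tense == "present":
        verb = verb + "s"  # third-person singular agreement
    return f"{subject.capitalize()} {verb} {obj}."

print(realise_clause("mary", "chase", "the monkey", tense="past"))
# -> Mary chased the monkey.
```

The point is that the caller needs to know what a clause, subject, and tense are (the linguistic background mentioned above), but never writes out inflected surface strings by hand.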

Of course, there is a difference between “competent” use and “inspired” use, which indeed is true of all software development.  And inspired use may require more skills or training than competent use.  But still, I have often seen undergrad and MSc students build decent NLG systems using simplenlg; this is not true for ML.

Smart templates such as Arria Studio: I have seen a broad range of students, with diverse backgrounds and skill sets, create NLG systems using Studio.  The output texts are not ideal, and better students build better systems, but the key thing is that Studio is accessible and usable by a much wider variety of people than simplenlg or machine learning.  You don't need to be a good developer to use Studio; indeed, many commercial Studio users are business analysts, not developers. And you don't need to understand linguistic concepts such as noun phrases (although this helps).
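For readers unfamiliar with the approach, here is a minimal sketch of the smart-template idea in Python (this is not Arria Studio syntax; the data fields and wording are made up for illustration). The author fills slots and chooses between phrasings with simple conditions, without needing programming or linguistics expertise:

```python
# Toy smart-template: slots filled from data, with conditional
# variant selection ("rose" / "fell" / "was unchanged").

def render(data):
    """Fill a sales-summary template from a data record."""
    if data["change"] > 0:
        trend = f"rose by {data['change']}%"
    elif data["change"] < 0:
        trend = f"fell by {abs(data['change'])}%"
    else:
        trend = "was unchanged"
    return f"Sales in {data['region']} {trend} this quarter."

print(render({"region": "Scotland", "change": -3}))
# -> Sales in Scotland fell by 3% this quarter.
```

Tools like Studio wrap this kind of slot-filling and branching in a visual editor, which is why the skills barrier is so much lower.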


I am not in any way suggesting that people with excellent skills should *always* use machine learning; the right technology depends on the context!  As Puzikov and Gurevych pointed out, it can be much faster to build a system using smart templating than machine learning, even if you have the skills and background needed to build an ML NLG system.

My point is that interest is rapidly growing in using NLG in real applications (which is great!), and this means that people with very different backgrounds are trying to build NLG systems.  In 2019, there are hundreds of organisations and individuals who want to build NLG systems.  I guesstimate that maybe 2% of them (mainly R&D groups in large companies such as eBay) have the skills and background to build ML NLG systems, and maybe 25% have the skills and background to use programmatic approaches; for the rest, smart templates are the only option.

Which means that if we want to encourage the success of real-world NLG applications, we need to develop all of the above types of technology.   And we need to steer people towards technologies they can succeed with, especially if they start with the impression (fed by media hype) that deep learning is always the best way to build AI systems.

One thought on “Skills Required to Use Different NLG Technologies”

  1. Ease of use, particularly for tools that help transform data into knowledge, is a valuable attribute when we think about Business Intelligence and Data Literacy. These areas try to create an environment where people who make decisions can easily access the information captured by their organisation, so they can make more data-informed decisions. And one of the barriers is precisely the difficulty of interpreting and extracting knowledge from data represented in formats that those people are not used to. So NLG is highly relevant to those areas! Both for enabling business people to better understand their data, but mainly because the extracted information is represented as text, which in some contexts is a more accessible data representation.
    And when I think about end-to-end solutions, I also think about their lack of explainability, another highly desired attribute when we think about business.

