Texts produced by NLG systems can be evaluated in terms of accuracy (content is correct), fluency (text is readable), and utility (text is useful). I discuss these three “dimensions” of NLG evaluation.
I’ve been shocked by the fact that many neural NLG researchers don’t seem to care that their systems produce texts which contain many factual mistakes and hallucinations. NLG users expect accurate texts, and will not use systems which produce inaccurate texts, no matter how well the texts are written.
I am getting so fed up with UK politics that I will break my “no politics” rule in this blog and express my frustration with the Brexit mess and the way politicians have handled it.
I’m impressed by the diversity of NLG researchers at Capetown, from a gender, race, and disability perspective. An inspiration to the rest of us!
When we try to use ML in commercial NLG contexts, one of the challenges is that NLG developers want to be able to customise, configure, and control their systems. So we need ML approaches which do not stop devs from configuring things they are likely to want to change.
I’m beginning to think that in some ways the NLP community *encourages* researchers to use poor-quality or otherwise inappropriate data sets. Which is a truly depressing thought…
Some thoughts on key NLG challenges in explainable AI: evaluation, conceptual alignment, narrative. Comments are welcome!
Most research software does not enter everyday operational use, in part because research projects usually do not worry about issues such as maintainability, regulatory approval, and change management, which are essential to the long-term success of commercial software.
Some thoughts on the properties texts need to have in order to be good non-fictional narratives, and speculations on how we might generate such texts.
A travelogue about my recent cycling holiday in Wales and England, where I saw many places related to my wife’s family history.