Adding Narrative to a Covid Dashboard
The Tibco Covid dashboard is a nice example of how NLG narratives can “add value” to complex visualisations. Hopefully we’ll see more dashboards like this!
A colleague asked me if it was true that building neural NLG systems was faster than building rule-based NLG systems. The answer is that we don’t know, because we don’t have good data on this question. However, the weak evidence we do have suggests that building rule-based NLG is no slower, and may be faster, than building neural NLG, at least for data-to-text systems.
Accuracy errors in NLG texts go far beyond simple factual mistakes; for example, they also include misleading use of words and incorrect context/discourse inferences. All of these types of errors are unacceptable in most data-to-text NLG use cases.
The Covid crisis gives us a chance to rethink and change our conference culture. I would like to see fewer large international conferences, and I would like those that remain to focus on discussion and interaction rather than on oral presentations.
A PhD student recently complained to me that he was wasting a lot of time reading scientifically dubious papers. I give some suggestions on indicators of poor scientific quality in research papers.
I’d love to see more people using machine learning to provide insights about NLG problems and related linguistic issues. I personally think this is much more useful than tweaking models to show a 1% improvement over the state of the art in a very artificial context.
NLP technology has changed and advanced over the past two decades, but it often seems that NLG evaluation has not. Why is the 18-year-old BLEU metric still so dominant?
We’re thinking of organising a shared task on evaluating the accuracy of texts produced by NLG systems. Comments are welcome; also let me know if you might participate.
NLP in 2020 is dominated by papers which report small improvements over the state of the art. I suspect that a lot of these improvements are due to overfitting to test data, not to genuine scientific advances.
If we want to deploy AI in the real world, we need to think about “change management” issues. For example, if users think that AI threatens their jobs or adds extra hassle, then uptake will be slow. This has been a problem for AI and statistical algorithms since the 1950s.