I’m looking for a PhD student to work on Advanced Data Storytelling!
Some explanation and advice about regression to the mean, which is a statistical phenomenon that can impact NLG evaluations.
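A minimal sketch of the effect, assuming a toy setup in which each system has a fixed "true" quality and every evaluation adds independent random noise. The function name, parameters, and the normal distributions are all hypothetical choices for illustration, not taken from any real evaluation. If we pick the best-scoring systems from a first evaluation and then evaluate them again, their average score tends to drop, even though nothing about the systems has changed.

```python
import random
import statistics


def simulate_regression_to_mean(n_systems=1000, top_k=50, noise_sd=1.0, seed=0):
    """Toy simulation (hypothetical setup): score = true quality + noise."""
    rng = random.Random(seed)

    # Assumed: true quality is normally distributed across systems.
    true_quality = [rng.gauss(0.0, 1.0) for _ in range(n_systems)]

    # Two independent noisy evaluations of the same systems.
    first = [q + rng.gauss(0.0, noise_sd) for q in true_quality]
    second = [q + rng.gauss(0.0, noise_sd) for q in true_quality]

    # Select the top_k systems using only the first evaluation.
    top = sorted(range(n_systems), key=lambda i: first[i], reverse=True)[:top_k]

    first_mean = statistics.mean([first[i] for i in top])
    second_mean = statistics.mean([second[i] for i in top])
    return first_mean, second_mean


if __name__ == "__main__":
    first_mean, second_mean = simulate_regression_to_mean()
    print(f"Top systems, first evaluation:  {first_mean:.2f}")
    print(f"Same systems, second evaluation: {second_mean:.2f}")
```

The selected systems were chosen partly because they were lucky in the first evaluation, so a second evaluation with fresh noise pulls their average back toward the overall mean.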
People who use corpora to build NLG systems need to understand what is in the corpora. The widely used Weathergov corpus, for example, probably contains computer-generated texts rather than human-written texts. So learning from it is essentially reverse-engineering a rule-based NLG system.
I am really dubious about evaluations based on BLEU and other metrics. I explain why, and also give advice on best practice for people who are committed to using metrics.