Perhaps the most common reason for bad NLG output texts is low-quality input data. In other words, Garbage In, Garbage Out holds true regardless of the technology we use.
Someone recently asked me for details of an experiment I did 12 years ago, and it was not easy to get this information, because I had not properly archived it. Lesson: properly archive detailed information about experimental design, material, results, etc.
Someone recently asked me whether it is possible to easily determine if an NLP system is good enough for a specific use case. Currently this is very hard. Making it easy could be a “grand challenge” for evaluation!
I am now chair of ACL SIGGEN. I hope SIGGEN can help the NLG community by encouraging high-quality scientific research, strengthening interaction with the non-NLP world, and providing trusted, unbiased information about NLG.
People who search for me in Google see a Google-generated box which incorrectly says that I am Israeli. Google has ignored my complaints about this; they don't seem to care about the accuracy of their content-production algorithm, which is ethically pretty dubious!
In both NLG and MT contexts, deep learning approaches can result in texts which are fluent and readable but also incorrect and misleading. This is problematic if accuracy is more important than readability, as is the case in most NLG contexts.
I am very happy to be involved in a new project, PhilHumans, which is exploring how AI can help users interact with personal health apps.