NLG Evaluation 2025 vs 2015: much improved but needs to be better
How has NLG evaluation changed in the past ten years? The short answer is that the technology is much better (e.g., LLM-as-judge), but practice (e.g., experimental rigour) remains poor, and commercial interests are more prominent.