Ehud Reiter's Blog

Ehud's thoughts about Natural Language Generation. Also see my book on NLG.

Tag: corpora

Uncategorized

Lessons from 25 Years of Information Extraction

Jan 2, 2020 | ehudreiter | 1 Comment

I really liked Grishman’s recent paper on 25 years of research in information extraction, and I summarise a few of its key insights here: relative progress in different areas of NLP, the reluctance of researchers to use complex evaluation techniques, and corpus creation versus rule-writing.

Uncategorized

ML is Used More if it Does Not Limit Control

Aug 15, 2019 | ehudreiter

When we try to use ML in commercial NLG contexts, one of the challenges is that NLG developers want to be able to customise, configure, and control their systems. So we need ML approaches which do not stop developers from configuring the things they are likely to want to change.

Uncategorized

Many Papers on Machine Learning in NLP are Scientifically Dubious

Jun 6, 2018 | ehudreiter | 1 Comment

In response to a previous blog post, many people told me they were concerned about the quality of many of the papers they saw on ML in NLP. I summarise some of these concerns, which are worrying.

Uncategorized

You Need to Understand your Corpora! The Weathergov Example

May 9, 2017 | ehudreiter | 8 Comments

People who use corpora to build NLG systems need to understand what is in the corpora. The widely used Weathergov corpus, for example, probably contains computer-generated texts rather than human-written texts. So learning from it is essentially reverse-engineering a rule-based NLG system.

News: I am likely to retire in summer 2026. Looking for interesting things to do afterwards.
