Ehud Reiter's Blog

Ehud's thoughts about Natural Language Generation. Also see my book on NLG.

Tag: data quality

Amateurs focus on models; professionals focus on data

Jan 14, 2020 | ehudreiter | 2 Comments

There is a military saying that “amateurs discuss tactics, professionals discuss logistics”. Similarly, I think AI professionals should focus on data more than models. I suggest four simple initial questions to ask about your data if you want to build an ML system.

Care Needed in Analytics and Data Science!

Oct 24, 2019 | ehudreiter

It can be very exciting to apply powerful analytics and ML techniques to analyse data sets, but we need to be careful; otherwise we will make mistakes.

Do We Encourage Researchers to Use Inappropriate Data Sets?

Aug 1, 2019 | ehudreiter | 17 Comments

I’m beginning to think that in some ways the NLP community *encourages* researchers to use poor-quality or otherwise inappropriate data sets. Which is a truly depressing thought…

Grand Challenge: Helping the General Public to Make Data-Based Decisions

Apr 29, 2019 | ehudreiter

15 years ago, I said a grand challenge for CS/AI/NLG was to help the general public effectively understand and use data. Progress on this has been less than I hoped, but this remains a worthwhile and important challenge!

Bad Data Means Bad Output

Mar 20, 2019 | ehudreiter | 6 Comments

Perhaps the most common reason for bad NLG output texts is low-quality input data. That is, Garbage In, Garbage Out holds regardless of our technology.
