Amateurs focus on models; professionals focus on data

Jan 14, 2020 ehudreiter 2 Comments

There is a military saying that “amateurs discuss tactics, professionals discuss logistics”. Similarly, I think AI professionals should focus more on data than on models. I suggest four simple initial questions to ask about your data if you want to build an ML system.

Uncategorized

Care Needed in Analytics and Data Science!

Oct 24, 2019 ehudreiter

It can be very exciting to apply powerful analytics and ML techniques to analyse data sets, but we need to be careful; otherwise we will make mistakes.

Uncategorized

Do We Encourage Researchers to Use Inappropriate Data Sets?

Aug 1, 2019 ehudreiter 17 Comments

I’m beginning to think that in some ways the NLP community *encourages* researchers to use poor-quality or otherwise inappropriate data sets. Which is a truly depressing thought…

Uncategorized

Grand Challenge: Helping the General Public to Make Data-Based Decisions

Apr 29, 2019 ehudreiter

15 years ago, I said a grand challenge for CS/AI/NLG was to help the general public effectively understand and use data. Progress on this has been less than I hoped, but this remains a worthwhile and important challenge!

Uncategorized

Bad Data Means Bad Output

Mar 20, 2019 ehudreiter 6 Comments

Perhaps the most common reason for bad NLG output texts is low-quality input data. That is, Garbage In, Garbage Out holds regardless of our technology.

Ehud Reiter's Blog

Ehud's thoughts about Natural Language Generation. Also see my book on NLG.

Tag: data quality
