Lessons from 25 Years of Information Extraction

Jan 2, 2020 ehudreiter 1 Comment

I really liked Grishman’s recent paper on 25 years of research in information extraction. Here I summarise a few of its key insights: the relative progress in different areas of NLP, the reluctance of researchers to use complex evaluation techniques, and corpus creation vs rule-writing.

ML is Used More if it Does Not Limit Control

Aug 15, 2019 ehudreiter

When we try to use ML in commercial NLG contexts, one of the challenges is that NLG developers want to be able to customise, configure, and control their systems. So we need ML approaches which do not stop devs from configuring things they are likely to want to change.

Many Papers on Machine Learning in NLP are Scientifically Dubious

Jun 6, 2018 ehudreiter 1 Comment

In response to a previous blog post, many people told me they were concerned about the quality of many of the papers they saw on ML in NLP. I summarise some of these concerns, which are worrying.

You Need to Understand your Corpora! The Weathergov Example

May 9, 2017 ehudreiter 8 Comments

People who use corpora to build NLG systems need to understand what is in the corpora. The widely used Weathergov corpus, for example, probably contains computer-generated texts rather than human-written texts. So learning from it is essentially reverse-engineering a rule-based NLG system.

Ehud Reiter's Blog

Ehud's thoughts about Natural Language Generation. Also see my book on NLG.

Tag: corpora
