Note: I have added a (***) to blogs which have been viewed more than 1000 times.
Building NLG Systems
- Accuracy Errors Go Beyond Getting Facts Wrong
- AI professionals also focus on change management
- Amateurs focus on models; professionals focus on data
- Bad Data Means Bad Output
- Boring uses of language models
- Can ChatGPT do Data-to-Text? (***)
- Care Needed in Analytics and Data Science!
- Challenges are Same for Neural and Rule NLG
- Challenges of Surface Realisation
- Check Out a New Dataset Before Using It
- Dealing with Edge Cases in NLG
- Difficult Words for Neural NLG Systems
- Does Deep Learning Prefer Readability over Accuracy?
- Does Quality Matter in Training Data?
- Election results: Lessons from a real-world NLG system
- Embedding Machine Learning in a Rules-Based NLG System (***)
- Generated Texts Must Be Accurate! (***)
- Hallucination in Neural NLG (***)
- How do I Build an NLG System: Requirements and Corpora (***)
- How do I Build an NLG System: Testing and Quality Assurance
- How do I Build an NLG System: Tools? (***)
- How Should Different NLG Components Add Value?
- INLG: What real-world NLG users want
- Is building neural NLG faster than rules NLG? No one knows, but I suspect not.
- Is GPT3 Useful for NLG? (***)
- Language is diverse!
- Learning does not require evaluation metrics
- Lexical Choice Needs Machine Learning!
- ML is Used More if it Does Not Limit Control
- Natural Language Generation and Machine Learning (***)
- NLG Systems Must be Customisable
- NLG vs Templates: Levels of Sophistication in Generating Text (***)
- NLG=Task+Data+Model/Alg+Eval
- Pain Points in Health NLG: Data, Evaluation, Safety
- Pragmatic correctness is a challenge for NLG
- Real-World Neural NLG
- Should I Use Deep Learning?
- Simple vs Complex Models
- Sports NLG: Commercial vs Academic Perspective
- Skills Required to Use Different NLG Technologies
- Summarisation datasets should contain summaries!
- Testing Multiple Hypotheses
- Texts should be adapted to users
- The story of simplenlg (***)
- Use Good Engineering Methodology When Building NLG Systems!
- Varying Words In NLG Texts
- We Need Robust Ways to Select Content of NLG Texts
- We need to understand what users want!
- What are the Problems with Rule-Based NLG?
- What Makes a Good Narrative?
- You Need to Understand your Corpora! The Weathergov Example (***)
Evaluating NLG Systems
- A Consumer Perspective on Evaluation
- Accuracy, Fluency, and Utility
- BLEU-Human Correlation is Increasing: What does this Mean?
- BLEU in Different Languages: Don't use it for German
- Evaluating Accuracy
- Evaluating factual accuracy in complex data-to-text
- Evaluation Grand Challenge: Is NLP System Good Enough for a Use Case?
- Evaluation in Medicine and NLG/NLP
- Exercise: Find Problems in an Evaluation
- Future of NLG evaluation: LLMs and high quality human eval?
- How to do an NLG Evaluation: Metrics (***)
- How to do an NLG Evaluation: Human Ratings in Artificial Context (***)
- How to do an NLG Evaluation: Human Ratings in Real-World Context
- How to do an NLG Evaluation: Task-Based (Extrinsic) Performance in Real-World Context
- How to Validate Metrics (***)
- How Would I Automatically Evaluate NLG Systems?
- Humans make mistakes too
- Is BLEU valid? First observations and concerns
- Keep Good Records of Your Experiments
- Let's use error annotations to evaluate systems!
- Mistakes in Evaluating ML
- MSc Course on Evaluating AI
- My Guidelines for Evaluating AI Systems
- Objective evaluation of NLG texts
- Please Use Two-Tailed P Values!
- Real-world utility is based on many things
- Research Ethics of A/B Testing
- Regression to Mean
- Shared Task on Evaluating Accuracy?
- Small differences in BLEU are meaningless
- Study Design for Systematic Review of BLEU Validity: Comments Welcome!
- Texts can be accurate but still not appropriate
- Types of NLG Evaluation: Which is Right for Me? (***)
- Use Proper Baselines!
- We need more extrinsic (task) evaluation!
- Why do we still use 18-year-old BLEU? (***)
- Why doesn't BLEU work for NLG?
- Why is ROUGE so popular?
Academic Life
- Academic NLG should not fixate on end-to-end neural
- Academic Researchers Should be Scouts and Explorers
- Academic Teaching vs Commercial Training Courses
- ACL vs TACL Reviewing
- Apologies to my students for limited feedback!
- Best Papers I Read in 2020
- Can I present my paper twice?
- Challenging NLG datasets and tasks
- Commercial and Academic Perspectives on NLG (and AI?)
- Could there be fraud in NLP Research?
- Does chatGPT make leaderboards less meaningful?
- Doing Less
- Good Papers are Hard to Publish
- Engineering Perspective: Understand Issues, Find Simple Solution
- How can I tell if a paper is scientifically solid?
- How I Review Papers
- I don't like leaderboards
- I enjoy reviewing for TACL
- I’m Impressed by Capetown Uni’s Diversity
- Limits of pre-publication reviewing
- Managing Research Projects is Painful but Necessary
- More discussion, fewer papers at conferences?
- My PhD Students: Where Are They Now (June 2017) (***)
- Our 2022 Publications: NLG Evaluation, Requirements, Resources
- Please check the boring details in your paper!
- Publication Requirements for PhD Students
- Publish in Journals!
- Real-World Impact of Academic Research
- Reviewing has changed over the years; conferences need to change as well
- What are Benefits of Physical Conferences?
- What Should Academic NLP Researchers Focus on? (***)
- Why I do not Want to be a Co-author on Your Paper (***)
- “Will I Pass my PhD Viva” (***)
Other Topics
- Adding Narrative to a Covid Dashboard
- Bayesian vs Neural Networks (***)
- chatGPT in Health: Exciting if we ignore the hype
- chatGPT: Great science, unclear commercials, hate the hype (***)
- Come Join Us in Aberdeen!
- Conversational data-to-text
- Could NLG systems injure or even kill people?
- Do people “cheat” by overfitting test data? (***)
- Do We Encourage Researchers to Use Inappropriate Data Sets? (***)
- Exciting NLG Research Topics (June 2017)
- Farewell to Richard Kittredge, pioneer in applied NLG
- Get Your Hands Dirty!
- Google: Please Stop Telling Lies About Me (***)
- Has Neural NLG Become More Scientific?
- How accurate do chatGPT texts need to be?
- How do I Learn about NLG? (***)
- How do Users React to NLG?
- Human editing of NLG texts
- Language Grounding and Context (***)
- Lessons from 25 Years of Information Extraction
- Let's Use ML for Insights! (***)
- Lots about evaluation and methodology at INLG – Great!
- Many Papers on Machine Learning in NLP are Scientifically Dubious
- My Vision for SIGGEN
- New book on NLG?
- New project PhilHumans: Better interaction in personal health apps
- NLG and Explainable AI (***)
- Non-Experts Struggle with Information Graphics
- Notes from a Dev Conf: Sensible Attitude to Trendy AI Tech, Arria Presentations
- PhD on using AI/NLG to help cancer patients at home
- Product Descriptions
- Project and Research Fellow Position in Reproducibility of Human Evaluations
- Summarising Messy Data
- Tableau buys Narrative Science
- Text or Graphics?
- Response to Goldberg’s Blog on Deep Learning for NLG (***)
- Vision: NLG Can Help Humanise Data and AI
- Where is NLG Most Successful Commercially? (***)
- Why isn't Research Software such as BabyTalk Used?
- Why isn't there More Open-Source NLG Software? (***)
- Working in Universities vs Companies
- Writing NLG Pages for Wikipedia
Personal
- Cycling through Northern England
- Cycling through Southwest England
- Cycling Through Wales, England, and my Wife’s Family History
- Goodbye to a Synagogue
- Life is “Flat” under Lockdown
- My Father Takes Me to Mexico
- My Son Visits Home
- The Brexit Mess
Arria blogs written by me (these are intended for non-specialists)
- Chatbots are a great way to present insights!
- Choosing words to clearly describe data
- Corpus Analysis: A great way to understand what your NLG system needs to do
- Finding creative solutions to detect mistakes in neural-NLG narratives
- Humans post-editing NLG-generated narratives
- NLG drives consistent narratives
- NLG is different from other language technologies
- OpenAI GPT System: What does it do?
- The Grand Challenge for NLG: Making Data Accessible to Humans
- The power of words
- This 22-year-old book on NLG is still relevant
- Why neural language models don’t work well in NLG