Last week I attended and spoke at the AI Summit in New York. It was the first time I’ve attended and spoken at a commercially focused AI conference rather than an academic or technical one. Overall it was a worthwhile and interesting experience, but I was annoyed by one thing: the prominence, indeed worship, of deep learning, mostly by people who didn’t have a clue what it meant but were very tuned in to the latest “what’s hot” hype from AI gurus.
Now, I think deep learning is an interesting technology which has done some impressive things in computer vision and in some NLP tasks such as machine translation. But it’s hardly an AI panacea, and the quality of deep learning research in the field I know most about, natural language generation (NLG), is definitely mixed. There is some good work, especially in NLG-and-vision contexts such as image captioning. But there is also a lot of unimpressive work, which either focuses on easy NLG tasks such as generating point weather forecasts or is of poor scientific quality (or both). I am willing to be convinced that deep learning is a great way to do NLG, but I’m still waiting for scientifically solid evidence that it can help solve the hard problems in NLG which I care about, such as generating high-quality narratives.
I was also alarmed at some of the “advice” given to people keen on deep learning, such as “be sure to go to NIPS to learn about the latest innovations”. Frankly, for the majority of companies who want to use deep learning, the most important thing is to get a solid understanding of the basics (data cleansing, hyperparameter tuning, etc.). Trying to follow the latest NIPS papers is distracting at best, and at worst could derail the entire effort, especially if these papers are as scientifically weak as many of the ones I have seen on using DL in NLG. But “you need to get the basics right” is presumably a less sexy message than “you need to keep up-to-date with the latest fantastic developments in DL technology”.
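To make “the basics” concrete, hyperparameter tuning at its core is just systematically trying candidate settings and keeping the one that scores best on held-out data. Here is a minimal, self-contained sketch of that idea; the toy threshold classifier, the data points, and the candidate grid are all invented for illustration, not taken from any real system:

```python
# Toy hyperparameter tuning: exhaustively try candidate values of a single
# hyperparameter (a decision threshold) and keep the one that maximises
# accuracy on a held-out validation set. Data and grid are illustrative.

train = [(0.2, 0), (0.4, 0), (0.6, 1), (0.9, 1)]  # (feature, label) pairs
valid = [(0.3, 0), (0.7, 1), (0.8, 1)]            # held-out validation data

def accuracy(threshold, data):
    """Fraction of examples where (x >= threshold) matches the label."""
    return sum((x >= threshold) == bool(y) for x, y in data) / len(data)

# "Grid search": evaluate every candidate threshold on the validation set.
grid = [0.1, 0.3, 0.5, 0.7]
best = max(grid, key=lambda t: accuracy(t, valid))
print(best, accuracy(best, valid))
```

Real tuning jobs sweep many hyperparameters at once (learning rate, layer sizes, regularisation strength), but the workflow is the same: a clean train/validation split and a disciplined search, not the latest conference paper.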
Is Deep Learning for Me?
If someone asked me if they should use deep learning (for any AI application, not just NLG), I would ask them the following questions:
- Do you have enough training data? All machine learning requires training data, and deep learning requires more than many other ML techniques. If you have a million training examples, you are probably OK. If you have ten, don’t even consider DL. I have heard some AI vendors claim that they can learn from 5 examples; I’ll believe this when I see actual evidence that it can be done, as opposed to vague marketing claims.
- Does your system need to guarantee a minimum level of performance? Machine learning systems in general have poor “worst case” behaviour (and are difficult from a quality-assurance perspective), and again deep learning is worse than many other ML techniques in this respect. If you just need good performance in most cases and can accept occasional nonsense, this is not a problem. But if your system has to work every time, then be cautious. This is one reason why self-driving cars are much more challenging from an ML perspective than machine translation; it’s OK for an MT system to produce garbage once in a while, but it’s not acceptable for a self-driving car to crash into a lorry even once.
- Will you need to update or tweak your system’s behaviour? In medicine, for example, decision-support systems need to be updateable to take into account the latest research findings, product releases, and regulatory changes. It’s impossible to manually tweak or update a deep learning system. It is possible to retrain the system on new training data which incorporates the new regulations (etc.), but we are unlikely to have such training data for new regulations, products, and so on. This isn’t a problem for applications which are relatively static (again, MT is a good example), but it is a problem for applications that need to adjust to a fast-changing world.
- Do people need to trust the AI system? It’s difficult for people to trust recommendations and decisions which they do not understand, especially if they are expected to assume legal responsibility for these decisions (as is the case for a doctor using an AI decision-support system, for example). Deep learning is notoriously difficult from this perspective, in part because DL systems “think” in a very different way than people do.
- Do you know how to build DL systems? Building good deep learning systems is currently a craft, not a science. You aren’t going to learn how to build DL systems by watching a few 10-minute YouTube videos, or even by reading a textbook on deep learning. There are plenty of consultants who claim to be knowledgeable and will sell you their services; some of these people are excellent, but some are not.
- Do you need deep learning? Last but not least, do you even need deep learning (or indeed AI) to build your system? AI is great, but the current levels of hype in some cases encourage people to look for AI-based solutions to problems which could be solved with a database and some old-fashioned business logic.
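The last point is worth making concrete: many problems currently pitched as “AI” reduce to a lookup table plus a few business rules. A hypothetical sketch (the domain, names, and numbers below are all invented for illustration):

```python
# Hypothetical example: a shipping-cost "predictor" that needs no ML at all,
# just a lookup table and one plain business rule. All rates and thresholds
# here are invented for illustration.

BASE_RATE = {"standard": 5.00, "express": 12.50}  # cost per shipment

def shipping_cost(service: str, weight_kg: float) -> float:
    """Base rate plus a heavy-parcel surcharge -- old-fashioned business logic."""
    cost = BASE_RATE[service]
    if weight_kg > 10:  # business rule: surcharge for heavy parcels
        cost += 4.00
    return cost

print(shipping_cost("standard", 2))
print(shipping_cost("express", 12))
```

Logic like this is transparent, trivially updateable when the rules change, and guaranteed to behave the same way every time, which, given the list above, is exactly what many deployments actually need.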
I try to think positively about deep learning as an exciting new AI technology which we are still exploring and understanding. But the ridiculous level of hype around it does leave a sour taste in my mouth…