Last week I gave an invited talk at a workshop on computational pragmatics, which was mainly attended by linguists and psychologists. I wanted to explain how my perspective differed from theirs, and I ended up saying that I was an engineer, who
- tried to understand the underlying linguistic, AI, and user issues by looking at real-world language use (e.g., corpora) and talking to real-world users, and then
- looked for simple engineering solutions to these problems.
This is in contrast to the linguists and psychologists at the workshop, who mostly
- did controlled experiments with subjects, ranging from asking subjects about the appropriateness of sentences to looking at EEG data, and then
- built models and theories of language use based on these experiments.
It is also in contrast to most machine learning people, who
- gather corpora (usually real-world but sometimes synthetic), and then
- determine which ML architecture does best at replicating the corpora.
I think all of the above are scientifically worthy endeavours, provided the science is rigorously and properly done. But I also think that my emphasis really is a bit different from that of the above linguists, psychologists, and machine learning experts.
I gave a few examples at the workshop. The first occurred when we were building the SumTime weather forecast generator. I spent a lot of time both analysing corpora (manually and with machine learning) and talking to forecast readers and writers, and from this work realised that there was a lot of variation in the words used by different weather forecasters, and that readers didn't like this. For instance, the time phrase “by evening” was used by some forecasters to mean 1800 and by others to mean 0000, which made it difficult for forecast readers to figure out what time a forecast was referring to. Anyways, having done a lot of analysis to identify the problem, we then decided on a very simple engineering solution, which was to avoid inconsistently used time phrases such as “by evening” and instead use consistent time phrases such as “by midnight”, even if they occurred less often in the corpus. This strategy was very successful, and contributed to SumTime’s high-quality forecasts, which indeed were sometimes considered (by forecast readers) to be better than human-produced forecasts.
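The strategy can be sketched in a few lines of code. This is a hypothetical reconstruction, not SumTime's actual implementation: the phrase table, meanings (midnight represented as 0), and function names are all my own illustrative assumptions.

```python
# Hypothetical corpus analysis: each candidate time phrase mapped to the
# set of times (24h clock, midnight = 0) that different forecasters
# used it to mean. Data invented for illustration.
PHRASE_MEANINGS = {
    "by evening": {1800, 0},   # inconsistent: some forecasters meant 1800, others midnight
    "by midnight": {0},        # consistent across forecasters
    "by midday": {1200},       # consistent across forecasters
}

def consistent_phrases(phrase_meanings):
    """Keep only time phrases that all forecasters used to mean the same time."""
    return {phrase: next(iter(times))
            for phrase, times in phrase_meanings.items()
            if len(times) == 1}

def phrase_for_time(hour, phrase_meanings):
    """Choose a consistently used phrase for the given hour, if one exists."""
    for phrase, meaning in consistent_phrases(phrase_meanings).items():
        if meaning == hour:
            return phrase
    return None  # no consistent phrase: fall back to an explicit time

print(phrase_for_time(0, PHRASE_MEANINGS))  # "by midnight", never "by evening"
```

The point is how little machinery is needed once the analysis has been done: the "algorithm" is just a filter on the phrase lexicon, applied at generation time.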
Another example I talked about was Ross Turner’s work on choosing frames of reference in geographic descriptions. Ross did a lot of empirical analysis as above, and realised that reference frames had to make sense causally. For instance, we cannot say “Unemployment increased in areas above 100m”, even if this is an accurate description of where unemployment increased, because people look for a causal connection between unemployment and altitude. We can, however, say “Rain expected in areas above 100m”, because this makes sense causally. Similarly, we can say “Unemployment increased in rural areas” but not “Rain expected in rural areas”. But again, after doing a lot of analysis to identify an interesting and unexpected linguistic problem, Ross then decided on a very simple engineering solution, which was to only use reference frames (e.g., altitude or rural/urban) which occurred in the corpus for the phenomenon being described. So we can use altitude with rain because the corpus contains altitude-based descriptions of rain, but we cannot use rural/urban with rain because the corpus does not contain any rural/urban-based descriptions of rain.
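Again, the solution is algorithmically trivial once the insight is in hand. Here is a minimal sketch of the rule, assuming (hypothetically) that the corpus analysis has been boiled down to a set of attested (phenomenon, frame) pairs; the data and names are invented for illustration.

```python
# Hypothetical result of corpus analysis: which frames of reference were
# actually used to describe which phenomena. Data invented for illustration.
CORPUS_FRAMES = {
    ("rain", "altitude"),             # e.g. "Rain expected in areas above 100m"
    ("unemployment", "rural/urban"),  # e.g. "Unemployment increased in rural areas"
}

def allowed_frames(phenomenon, corpus=CORPUS_FRAMES):
    """Only licence frames of reference attested in the corpus for this phenomenon."""
    return {frame for (p, frame) in corpus if p == phenomenon}

print(allowed_frames("rain"))          # altitude only, not rural/urban
print(allowed_frames("unemployment"))  # rural/urban only, not altitude
```

Note that the causal reasoning lives entirely in the corpus, not in the system: the generator never models causality, it just refuses to produce frame/phenomenon combinations that human writers never produced.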
In the above examples (I could list many others) we did a lot of analysis (focusing on corpora and working with users) to identify underlying issues, such as forecasters using words inconsistently or readers expecting a causal rationale behind geographic descriptors. Having identified the issues, we looked for simple engineering solutions. We did not try to create generic models or theories of language, and we did not propose complex algorithmic/ML architectures for solving problems if simple (and algorithmically trivial) approaches worked.
I realised at the workshop that this kind of engineering perspective is perhaps a bit unusual among academic researchers. I certainly appreciate that most academics want to create broadly applicable models, theories, and architectures, and it's fantastic when this is possible! But I sometimes wonder if the desire for this leads people to extrapolate from insufficient data, and build complex linguistic theories from a small set of examples; DRT (based on so-called “donkey” sentences) perhaps suffers from this. Also, the desire of ML researchers to propose novel architectures in some cases perhaps leads them to propose complex deep-learning architectures for problems which can be solved by a simple-but-informed baseline (one example is described in Chen et al. 2016).
Anyways, I think there is value in the engineering approach described above, and that it can teach us interesting things about language (through the issues it uncovers) as well as help us build useful systems. But of course there is value in the other approaches as well!