Most people assume that language evolved as a way to exchange information, but linguists and other students of communication have long debated why it evolved. Prominent linguists, among them MIT’s Noam Chomsky, have argued that language is actually poorly designed for communication and is only a byproduct of a system that may have evolved for other reasons, perhaps for structuring our own private thoughts.

As evidence for their theory, these linguists point to the fact that language is ambiguous. They argue that in a system optimized for passing information between a speaker and a listener, each word would have only one meaning, to avoid any risk of confusion or misunderstanding. In a study published in the journal Cognition, a team of MIT cognitive scientists has now challenged that hypothesis with a new theory, which argues that ambiguity in fact makes language more efficient, because it permits the reuse of short, efficient sounds that listeners can easily disambiguate from the context.

Ted Gibson, an MIT professor of cognitive science and senior author of the study, says:

“Various people have said that ambiguity is a problem for communication. But once we understand that context disambiguates, then ambiguity is not a problem – it’s something you can take advantage of, because you can reuse easy [words] in different contexts over and over again.”

The word “mean,” for instance, is a rather ironic example of ambiguity. It can stand for indicating or signifying something, but it can also refer to an intention or purpose, as in “I meant to go to the store.” It can describe something or someone offensive or nasty, or refer to the mathematical average. Adding an “s” at the end makes the word even more versatile: “a means to an end” refers to an instrument or method, while “to live within one’s means” refers to financial resources.

Despite all these different definitions, virtually no fluent English speaker gets confused when hearing the word “mean.” The reason is that the different senses of the word occur in very different contexts, which enables listeners to interpret its meaning almost automatically.
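The disambiguating effect of context can be made concrete with a small calculation. The sketch below is an illustration only, not the researchers' method: it uses invented counts for four senses of “mean” across four hypothetical contexts, and compares the uncertainty about the intended sense (measured as entropy, in bits) before and after the context is known. The context labels, sense labels and counts are all assumptions.

```python
from collections import defaultdict
from math import log2

# Hypothetical (context, sense) counts for the word "mean".
# These numbers are invented for illustration; they come from no corpus.
COUNTS = {
    ("math class", "average"): 45,
    ("math class", "signify"): 5,
    ("dictionary", "signify"): 45,
    ("dictionary", "average"): 5,
    ("playground", "nasty"): 45,
    ("playground", "intend"): 5,
    ("making plans", "intend"): 45,
    ("making plans", "nasty"): 5,
}


def entropy(counts):
    """Shannon entropy, in bits, of the distribution given by raw counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


total = sum(COUNTS.values())

# Uncertainty about the sense when the context is ignored.
sense_totals = defaultdict(int)
for (_context, sense), n in COUNTS.items():
    sense_totals[sense] += n
h_sense = entropy(sense_totals.values())

# Average uncertainty about the sense once the context is known.
by_context = defaultdict(list)
for (context, _sense), n in COUNTS.items():
    by_context[context].append(n)
h_sense_given_context = sum(
    sum(ns) / total * entropy(ns) for ns in by_context.values()
)

print(f"H(sense)           = {h_sense:.2f} bits")
print(f"H(sense | context) = {h_sense_given_context:.2f} bits")
```

With these made-up counts, the four senses are equally likely overall, so a listener who ignored context would face two full bits of uncertainty. Knowing the context removes most of it.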

The researchers argue that, because context disambiguates so effectively, languages can afford to reuse the words that are easiest for language processing systems to handle.

Based on observations and previous studies, they predicted that words with fewer syllables, higher frequency and the simplest pronunciations should have the most meanings.

To test their theory, the researchers conducted corpus studies of Dutch, English and German. A corpus study examines language using “real life” examples stored in corpora (or corpuses), i.e. computerized databases created for linguistic research.

Comparing certain properties of words with their numbers of meanings confirmed the theory: shorter words that occur more frequently and conform to the language’s typical sound patterns tend to be more ambiguous. The trends were statistically significant in all three languages.
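The kind of comparison described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a reconstruction of the study's analysis: the toy lexicon, its syllable counts, frequencies and sense counts are all invented, and Spearman rank correlation is simply one common way to check whether two word properties rise and fall together. It does not test statistical significance.

```python
from statistics import mean

# Hypothetical toy lexicon: (word, syllables, occurrences per million words, senses).
# All values are invented for illustration; they are not data from the study.
TOY_LEXICON = [
    ("run", 1, 900, 12),
    ("set", 1, 800, 10),
    ("mean", 1, 600, 6),
    ("table", 2, 300, 4),
    ("window", 2, 250, 3),
    ("animal", 3, 200, 2),
    ("umbrella", 3, 20, 1),
    ("refrigerator", 5, 10, 1),
]


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


syllables = [row[1] for row in TOY_LEXICON]
frequency = [row[2] for row in TOY_LEXICON]
senses = [row[3] for row in TOY_LEXICON]

rho_syllables = spearman(syllables, senses)
rho_frequency = spearman(frequency, senses)
print(f"syllables vs. senses: rho = {rho_syllables:.2f}")
print(f"frequency vs. senses: rho = {rho_frequency:.2f}")
```

On this toy data the correlation between syllable count and number of senses is negative and the correlation between frequency and number of senses is positive, which is the direction of the trend the researchers report. A real analysis would draw word properties and sense counts from actual corpora and dictionaries.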

To understand why ambiguity makes a language more rather than less efficient, one has to consider the competing desires of speaker and listener. A speaker wants to convey as much as possible with as few words as possible, while the listener wants a complete and specific understanding of what the speaker is trying to say. As the researchers point out, however, it is “cognitively cheaper” for the listener to infer certain things from the context of the conversation than for the speaker to spend more time on longer and more elaborate descriptions.

The result is a system that leans toward ambiguity by reusing the “easiest” words. Piantadosi, a co-author of the study, says that once the context is taken into account, it becomes clear that “ambiguity is actually something you would want in the communication system.”

According to the researchers, the statistical nature of their paper reflects a trend in linguistics, a field that is starting to rely more heavily on information theory and quantitative methods.

“The influence of computer science in linguistics right now is very high,” Gibson says, adding that natural language processing (NLP) is a major goal of those who work at the intersection of the two fields.

Piantadosi notes that ambiguity in natural language poses enormous challenges for NLP developers, saying:

“Ambiguity is only good for us [as humans] because we have these really sophisticated cognitive mechanisms for disambiguating. It’s really difficult to work out the details of what those are, or even some sort of approximation that you could get a computer to use.”

However, as Gibson points out, computer scientists have long been aware of this problem. The new study offers a better theoretical and evolutionary explanation of why ambiguity exists, but the practical difficulty remains: “Basically, if you have any human language in your input or output, you are stuck with needing context to disambiguate,” he says.

Written by Petra Rattue