Do AI Models Actually Understand Language?

Aarne Talman
Jul 18, 2024


Photo by Aarne Talman

By now, most people are well aware of how proficient AI models like GPT-4, Claude, and Gemini are at tasks requiring various linguistic skills and knowledge. These language models can write essays, summarise complex documents, and even pass standardised tests, demonstrating impressive capabilities. But to what extent do models with such skills truly understand language, or are they merely exceptionally good at mimicking it?

I explored this question extensively in my PhD thesis, but in this blog post I aim to provide a more concise summary of the key takeaways, and hopefully nudge readers to think about these more fundamental and philosophical aspects of AI instead of the constant race to the top of the leaderboards.

What Do We Mean by “Understanding”?

Philosophers, linguists, and cognitive scientists have debated the nature of language understanding for centuries. Language understanding has traditionally been thought of as a capability that only humans possess; recently, however, it has also been studied as a capability that machine learning models and AI systems could be said to achieve. I will not cover all the different accounts of human language understanding here, but will focus on two main perspectives that emerge from the recent AI and natural language understanding (NLU) literature.

The two accounts are:

  1. The Usage-Based Definition: Understanding is the ability to use language effectively in complex tasks. This approach assumes that language understanding can be measured through various proxy tasks, like question answering (QA) or natural language inference (NLI).
  2. The Intent-Based Definition: Understanding goes beyond task performance. It involves grasping the speaker’s intent, like their underlying goals or emotions.
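To make the usage-based view concrete: "understanding" is operationalised as accuracy on a proxy task such as NLI. The sketch below is purely illustrative; the example pairs are hypothetical and `model_predict` is a stand-in for any real NLI classifier, not an actual model.

```python
# Sketch of how NLI benchmarks operationalise "understanding": a model maps
# (premise, hypothesis) pairs to one of three labels, and its accuracy on
# held-out pairs is read as a measure of language understanding.

NLI_LABELS = ("entailment", "contradiction", "neutral")

# Hypothetical evaluation examples in the usual NLI format.
examples = [
    ("A man is playing a guitar.", "A person is making music.", "entailment"),
    ("A man is playing a guitar.", "The room is empty.", "contradiction"),
    ("A man is playing a guitar.", "He is a professional musician.", "neutral"),
]

def model_predict(premise: str, hypothesis: str) -> str:
    """Placeholder for a real NLI model; here it always guesses 'neutral'."""
    return "neutral"

correct = sum(model_predict(p, h) == gold for p, h, gold in examples)
accuracy = correct / len(examples)
print(f"NLI accuracy: {accuracy:.2f}")
```

Under the usage-based definition, that single accuracy number is the evidence of understanding; nothing about *how* the model arrived at its labels enters the picture.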

The usage-based definition is what is often implicitly assumed in NLU research. When companies and researchers compare their models’ language understanding capabilities with other models, this is mostly done by comparing scores on evaluation benchmarks and tasks. Consider this example from the BERT paper:

In order to train a model that understands sentence relationships, we pre-train for a binarized next sentence prediction task that can be trivially generated from any monolingual corpus […] Despite its simplicity, we demonstrate […] that pre-training towards this task is very beneficial to both QA and NLI.

In this example the authors implicitly assume that QA and NLI tasks are examples of tasks that require understanding, but they don’t give any explicit definition of language understanding.
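The "trivially generated" part of the quote is easy to see. Below is a minimal sketch of that data-generation step, not the BERT authors' actual code: binarized next-sentence pairs drawn from an ordered list of sentences, with 50% true continuations and 50% random negatives.

```python
import random

def make_nsp_pairs(sentences, seed=0):
    """Generate binarized next-sentence-prediction pairs from an ordered
    list of sentences: ~50% true next sentences ("IsNext"), ~50% randomly
    sampled ones ("NotNext"). In the real BERT setup the negative is drawn
    from a different document; sampling from the same list is a
    simplification for illustration."""
    rng = random.Random(seed)
    pairs = []
    for i in range(len(sentences) - 1):
        if rng.random() < 0.5:
            pairs.append((sentences[i], sentences[i + 1], "IsNext"))
        else:
            pairs.append((sentences[i], rng.choice(sentences), "NotNext"))
    return pairs

corpus = [
    "The cat sat on the mat.",
    "It was warm in the sun.",
    "Later, it chased a bird.",
    "The bird flew away.",
]
for first, second, label in make_nsp_pairs(corpus):
    print(label, "|", first, "->", second)
```

That such a simple objective improves QA and NLI scores is exactly why the implicit equation of task performance with understanding deserves scrutiny.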

The usage-based account of language understanding is problematic in that it does not specify what capabilities a model should have in order to understand language. Is a model that performs well at text summarisation but fails at other tasks able to understand language, or do different tasks test different degrees of language understanding?

The intent-based definition assumes that language understanding requires intents. These intents are often taken to involve a mental representation, a model of the world, which is used in comprehending language and in communication. Successful communication, on this view, requires grasping the communicative intent of the speaker. There are various formulations of the intent-based approach. One clear example can be found in the work of psychologists Pettijohn and Radvansky:

During text comprehension, readers create mental representations of the described events, called situation models. When new information is encountered, these models must be updated or new ones created.

The main problem with the intent-based definition is that it relies on the notion of intent, which is itself a difficult term to define. In philosophy, intention is often defined as something like "the power of minds and mental states to be about, to represent, or to stand for things, properties, and states of affairs" [Setiya 2022]. Could we ascribe intention to AI models given this definition?

The Limits of Current Benchmarks

Before exploring whether AI models could be said to have intents, let’s discuss the limitations of NLU evaluation benchmarks. Most NLU models are evaluated on benchmarks such as MMLU or SuperGLUE. These benchmarks measure performance on specific tasks, such as natural language inference or question-answering. High scores on these tasks are often seen as evidence of language understanding.

However, research suggests that these benchmarks may not be reliable indicators of true understanding. Studies have shown that NLU models can perform well on these tasks even when the word order of sentences is scrambled [Pham et al. 2020] or if words are removed from sentences [Talman et al. 2021, 2022]. These results suggest that models might be relying on statistical patterns and shortcuts rather than deep language understanding.
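The perturbations used in these studies are simple to reproduce. The sketch below (my own illustration, not the cited authors' code) shows the two kinds of probe: scrambling word order and deleting a fraction of words. If a model's benchmark predictions survive such perturbations, it is likely exploiting shallow statistical cues rather than sentence structure.

```python
import random

def shuffle_words(sentence: str, seed: int = 0) -> str:
    """Scramble word order, as in the probes of Pham et al. (2020)."""
    words = sentence.split()
    random.Random(seed).shuffle(words)
    return " ".join(words)

def drop_words(sentence: str, fraction: float = 0.3, seed: int = 0) -> str:
    """Randomly remove roughly `fraction` of the words, in the spirit of
    Talman et al. (2021, 2022)."""
    words = sentence.split()
    rng = random.Random(seed)
    kept = [w for w in words if rng.random() > fraction]
    return " ".join(kept) if kept else words[0]

sentence = "A man is playing a guitar on the street."
print(shuffle_words(sentence))
print(drop_words(sentence))
```

Feeding both the original and perturbed inputs to a benchmark model and comparing its predictions is the core of this line of analysis.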

The Role of Intent

Given that the usage-based account does not really give us a definition of what language understanding is and, moreover, given that the current tasks and benchmarks don’t necessarily measure understanding, let’s focus a bit more on the intent-based account.

The intent-based definition of understanding presents a different challenge. Can AI models, which are fundamentally statistical machines, grasp the nuanced intentions behind human communication?

Some argue that they can. According to them, neural networks learn representations of the world from the data they’re trained on. These representations could be seen as a form of understanding. When a model generates a response, it’s drawing upon its learned representation of the world to predict the most appropriate answer. A clear argument in this direction is made by Ilya Sutskever in his fireside chat with the founder and CEO of NVIDIA, Jensen Huang:

[W]hat the neural net learns is some representation of the process that produced the text, and that’s a projection of the world.

Others argue that this is not true understanding. Human understanding involves a lifetime of embodied experiences and complex cognitive processes. While AI models can learn impressive representations, they might still lack the depth and richness of human understanding. Yann LeCun, for example, has argued along these lines in a recent tweet.

A New Kind of Understanding?

The rise of AI may force us to reconsider what it means to understand language.

Mitchell and Krakauer hypothesise in their recent survey that AI has created new forms of understanding. Whereas we have previously considered language understanding to be only of the kind that humans possess, there are likely other forms of language understanding that could more readily be attributed to artificial systems, such as NLU models.

It is possible that AI models possess a unique form of understanding that is different from our own. This new kind of understanding might not involve emotions or consciousness, but it could still be incredibly powerful and valuable.

What’s Next?

So where do I stand in this debate? Do I think current AI models understand language? Before I can answer this question I think we need two things:

  1. A clear definition of what language understanding (for AI) is
  2. Better evaluation benchmarks that capture and measure understanding according to the above definition

The current evaluation benchmarks for natural language understanding models are clearly limited (as discussed above). To truly gauge a model’s understanding, we need better evaluation methods that go beyond simple task performance. These methods should assess a model’s ability to reason, use common sense knowledge, and interpret the subtle nuances of human communication.

We have started to explore different approaches to evaluate language models in our ELOQUENT Lab shared task. The first results will be presented later this year at CLEF 2024 and we plan to make the ELOQUENT Lab an annual shared task.

In addition to better evaluation benchmarks, we also need more fundamental research on the nature of language understanding and AI models’ capabilities to understand language.


Written by Aarne Talman

PhD in Language Technology. Data science and machine learning consultant and language technology researcher working on NLU, LLMs and reasoning.
