Are language models really intelligent?

Let’s imagine that we have a friend named Gaspar Pérez Torres (aka GPT), a fan of Einstein with an excellent memory. At a meeting, Gaspar perfectly recites the works of this famous scholar, impressing everyone who listens. However, we know that Gaspar’s skills are limited to memorizing and reciting.

Gaspar’s case is very similar to the questions currently being asked about language models, such as GPT-4 or Gemini: Are they really smart or do they just memorize and recite? Are we facing a technology that will compete with human intelligence?

By Pablo Senosiain, Notus partner

Apparent intelligence

Our friend Gaspar could cause a stir at a scientific conference, reciting Einstein’s research and answering questions with precision. Attendees and scientific experts might think he’s a genius, but he’s just reciting memorized information. If we extend this line of thought to artificial intelligence, language models, trained on large amounts of data, can appear intelligent when they answer every question we ask them. In reality, they are only drawing on patterns and data stored during their training.

Is GPT smart?

As we know, LLMs (Large Language Models) use their memory to solve problems, much as humans use learned patterns and much as Gaspar can recite all of Einstein’s papers and leave everyone amazed. To some extent, this can be considered intelligence. However, humans do more than memorize and learn patterns: we also reason. Reasoning is the skill that lets us adapt to new situations using logic, inference and interpretation. Some studies show that LLMs, including GPT-4, are not capable of human-level reasoning or planning. That is, when they face new tasks, their performance drops significantly.

Reasoning

Human reasoning allows us to adapt to new situations and scenarios in a creative and flexible way. LLMs can give the illusion of reasoning because they have memorized many examples of reasoning, but they do not really understand or reason. The millions of paragraphs of text used to train an LLM contain enough examples of reasoning that, when faced with certain logic or planning problems, the model can fall back on a “memorized” reasoning scheme. In other words, it keeps a kind of logical recipe in memory. But “learning to reason” through thousands of examples, that is, reasoning from memory, is not the same as what humans do when we reason.

Hype, GPT-5 and the future of artificial intelligence

Does it matter so much if current LLMs only partially represent the spectrum of human intelligence? It depends on who we ask. Today we are facing a wave of extreme hype around artificial intelligence. Part of this is surely due to an overestimation of the current capabilities of LLMs, which, as we have explained, appear to be more intelligent and capable than they really are. Social networks and the virality of the topic have not helped. For investors and companies that stand to profit, high expectations and hype are good for business, attracting more investment and general user interest. Now, to be fair, a large part of the hype is also real and justifiable. The ability of LLMs to memorize and associate information is revolutionary on its own merits. The use cases already developed seem to be straight out of a science fiction movie, and they present gigantic potential for all areas of human activity. That is, the usefulness and value of LLMs are not in dispute.

However, from the scientific point of view (that of the community trying to get closer to what human intelligence is), several opinion leaders argue that it will not be possible to advance much further with the current Transformer architecture. For example, for Francois Chollet, AI researcher at Google, models like GPT-4 are reaching a limit, and it does not matter much if the new GPT-5 model keeps investing in size and/or training data. The models will not become more intelligent or more capable of reasoning because they are larger; they will only gain more memory. Yann LeCun, a leading voice in AI and researcher at Meta, advises young people not to keep researching LLMs and instead recommends seeking new research frontiers that help overcome current limitations. More controversial opinions say that the hype around LLMs has captured so much attention from companies, academia and the general public that it has “consumed all the oxygen” available, harming other branches of research that could eventually lead us to Artificial General Intelligence, or AGI.

As a final reflection, while we wait for OpenAI’s next release, GPT-5, or for a revolutionary new model architecture, it is essential that we take a closer look at our own work and activities. The fact that the models are still missing an important piece does not mean we can underestimate their transformative capacity, given what they can already offer. We could say that LLMs have a fraction of our mental abilities, but with that fraction they are capable of surpassing us in multiple areas. This means that our roles and our jobs will inevitably evolve. How much of our daily life consists of rather mechanical and repetitive tasks? What percentage of our day is dedicated to applying a “program” or learned routine stored in our memory? The answers to these questions will help us anticipate the opportunities (and risks) that we may face in the near future.