Beyond words: how the melody in speech can teach emotions to AI

Researchers have identified between 200 and 350 basic prosodic patterns that are essential to understanding the structure and meaning of spoken language. (Infobae Illustrative Image)

*This content was produced by experts from the Weizmann Institute of Science, one of the world's leading centers of multidisciplinary basic research in the natural and exact sciences, located in the city of Rehovot, Israel.

The artificial intelligence revolution, which has begun to transform our lives in the past three years, is based on a fundamental linguistic principle that underlies large language models such as ChatGPT. Words in a natural language are not combined at random; rather, there is a statistical structure that allows a model to guess the next word based on what came before.

However, these models overlook a crucial dimension of communication: the content that is not transmitted through words. In a new study published today in Proceedings of the National Academy of Sciences, USA (PNAS), researchers in the laboratory of Prof. Elisha Moses at the Weizmann Institute of Science reveal that the melody of speech in spontaneous English conversations works as a distinct language, with a “vocabulary” of hundreds of basic melodies and even syntax rules that can predict the next melody in the sequence. The study lays the groundwork for artificial intelligence that will understand language beyond words.

The melody of speech, known in linguistics as “prosody”, covers variations in pitch (intonation), volume (for example, for emphasis), rhythm and voice quality (such as a whisper or a creaky voice). This form of expression predates words in evolution: recent studies reveal that both chimpanzees and whales incorporate complex prosodic structures in their communication.

The prosody of speech, which includes rhythm, tone and intensity, plays a crucial role in human communication, functioning as a different and essential language in everyday conversations. (Infobae Illustrative Image)

In human communication, prosody adds a nuanced layer of meaning beyond words. A brief pause, like a comma, can change the meaning of a sentence (“let’s eat, grandmother”) and the rhythm of spoken text can generate suspense. Linguists specializing in prosody have traditionally studied literary texts and the ways in which prosody reflects historical changes.

This meant that, despite the critical importance of prosody for understanding human language, its study remained a marginal field, without applications and marked by contradictory ideas about the structure and meaning of prosody.

However, prosody is an inherent part of every conversation. It assigns a linguistic function to words, for example, whether they form a question or state a fact, and reveals the speaker's attitude toward what is being said.

In the new study, led by the linguist Dr. Nadav Matalon and the neuroscientist Dr. Eyal Weinreb, both of the Moses laboratory in Weizmann's Department of Physics of Complex Systems, the researchers analyzed prosody as if it were an unknown language, with the aim of offering a data-based explanation of the linguistic mystery of the structure and meaning of prosody.

A team of scientists from the Weizmann Institute uses artificial intelligence to analyze conversations and discover melodic patterns that could revolutionize human communication. (Infobae Illustrative Image)

Instead of relying on literature, they used two large collections of recordings of spontaneous conversations: one of telephone conversations between two participants and another of face-to-face conversations in various settings, such as a kitchen or a classroom.

The task for the research team was to compile a dictionary of the melodies that function as “words” in the prosody of English and to assign each a function and meaning. “To understand why there is still no prosodic dictionary, it is worth remembering that there was not even a complete dictionary of English until the nineteenth century,” says Moses.

“When the University of Oxford took charge of compiling one, it asked the public to help with the workload by sending in quotations that showed the historical changes in the meaning of words. One of the main contributors was a prisoner who spent more than 20 years reading books and sending in quotations. In our study, instead of collecting information ourselves over decades, we analyzed large collections of audio recordings using AI.”

Each person's speech melody is unique, but the AI model found several hundred basic patterns that are repeated, with slight variations, in all spontaneous conversations in English. While written words are sequences of letters, a prosodic “word” is a short melody, that is, a brief sequence of sounds with variation in pitch.

(From left) Dr. Dominik Freche, Prof. Elisha Moses, Dr. Nadav Matalon, Dr. Eyal Weinreb and Ophira Blumner. Credit: Courtesy of the Weizmann Institute of Science

To discover the meaning of these “words,” Matalon took a sample of 20 basic melodic patterns and then listened to the recordings again. “We discovered that each pattern has several linguistic functions,” he explains. “For example, depending on the context, a pattern can define whether someone is asking a question or making a statement.

However, each pattern typically conveys a specific attitude of the speaker, such as curiosity, surprise or confusion, toward what is being said. A common prosodic word is a pronounced rise in pitch followed by a quick fall. This pattern indicates enthusiasm and, depending on the context, can express strong agreement or acknowledge the receipt of important new information.”

“The first complete Oxford dictionary of English appeared in the 19th century, compiled with the help of the public to manage the workload, including a prisoner who contributed for 20 years.”

Next, the researchers tried to identify the syntactic rules that govern the order of these prosodic patterns, which could allow future language models to understand and use prosody. “We noticed that there are patterns that tend to appear together, in pairs, in spontaneous speech,” explains Weinreb.

The study also found that prosody varies according to social status and age, which shows how different populations have their own melodic patterns. (Infobae Illustrative Image)

“It is a simple statistical system, in which the correct choice of the next unit in a sequence depends solely on the previous one. This system works well for spontaneous conversation because it requires planning only a few seconds ahead, which is how long short-term memory lasts.” These pairs of patterns, the researchers discovered, function as simple sentences expressing “a new idea”, so that each pair relates to a specific topic and adds a single piece of information about it, for example, referring to a fact mentioned in the conversation and providing positive feedback.
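
A system in which the next unit depends only on the previous one is what statisticians call a first-order Markov chain. The short Python sketch below illustrates that idea only; it is not the researchers' code. The pattern labels ("rise_fall", "flat" and so on), the toy sequences and the function names are all hypothetical, invented here for illustration.

```python
from collections import Counter, defaultdict


def fit_bigram_model(sequences):
    """Estimate P(next pattern | previous pattern) from labeled sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count every adjacent (previous, next) pair in the sequence.
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    model = {}
    for prev, followers in counts.items():
        total = sum(followers.values())
        model[prev] = {nxt: n / total for nxt, n in followers.items()}
    return model


def most_likely_next(model, prev):
    """Return the most probable follower of `prev`, or None if never seen."""
    followers = model.get(prev)
    if not followers:
        return None
    return max(followers, key=followers.get)


# Hypothetical pattern labels and invented toy data (assumption, not study data).
toy_sequences = [
    ["rise_fall", "flat", "rise", "fall"],
    ["rise", "fall", "rise_fall", "flat"],
    ["rise_fall", "flat", "rise_fall", "fall"],
]

model = fit_bigram_model(toy_sequences)
print(most_likely_next(model, "rise_fall"))  # prints "flat"
print(model["rise_fall"])                    # {'flat': 0.75, 'fall': 0.25}
```

In this toy data, "rise_fall" is followed by "flat" three times out of four, so the model predicts "flat" after it. A real system would first need a way to segment audio into patterns and label them, which this sketch takes for granted.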

“This study lays the foundations for the development of an automated system that compiles a ‘dictionary’ of prosody and identifies its syntactic rules for each human language and for different populations of speakers,” says Moses.

“Prosody can vary depending on social status, historical events and the age of the speakers, and these variations can even appear in literary works that carefully reflect spontaneous speech,” adds Matalon. “We analyzed audiobooks as part of the study and discovered that the prosodic patterns are longer in speech read from written text and that the simple paired syntax of spontaneous conversation has disappeared.

Artificial intelligence, such as virtual assistants, could improve its emotional and empathic interaction if it incorporates an understanding of prosody, according to the study. (Infobae Illustrative Image)

There are also other differences. It is safe to assume that the process of aging and the acquisition of language in childhood are also accompanied by quantifiable prosodic changes. In addition, there is evidence that prosody is important in internal speech, the language of thought, and that we can deepen our understanding of the prosody present in the robotic voices produced by speech-generating devices. The model we created promises to close the gaps that have emerged over the centuries in the study of expression beyond words.”

An important future application of an automated prosody dictionary could be the development of an AI capable of understanding and conveying messages through the melody of speech rather than through words alone. “Imagine if Siri could understand from the melody of your voice how you feel about a certain topic, what is important to you or whether you think you know more than she does,” adds Weinreb, “and that she could adapt her answer so that it sounds enthusiastic or sad: a significant part of human expression that robotic systems currently lack.”

Also participating in the study were Dr. Dominik Freche of Weizmann's Department of Physics of Complex Systems; Dr. Erez Volk of Neuralight Inc., Tel Aviv; Dr. Tirza Biron of Weizmann's Department of Computer Science and Applied Mathematics; and Prof. David Biron of the University of Chicago. *Prof. Elisha Moses holds the Maurice and Ilse Katz Chair.
