How to create an AI for the different variations of Arabic – Samsung Newsroom Mexico

Stories from the Middle East about the complexity of creating AI tools for Arabic, a language with many facets

Galaxy AI now supports 16 languages, helping more people overcome language barriers with real-time, on-device translation. With these advancements, Samsung ushered in a new era of mobile Artificial Intelligence (AI), so we are visiting Samsung research centers around the world to learn how Galaxy AI came to life and what it took to overcome the challenges of AI development. While the first part of the series examined the task of determining what data is needed, this installment reflects on the complex work of taking dialects into account.

Teaching a language to an AI model is a complex process, but what if it is not a single language, but a collection of different dialects? That was the challenge faced by the Samsung Research and Development (R&D) Institute Jordan (SRJO) team. Although “Arabic” was added as a language option for Galaxy AI features like Live Translate, the team had to study the different Arabic dialects that span the Middle East and North Africa, each of which varies in pronunciation, vocabulary and grammar.

Arabic is one of the six most spoken languages in the world, used daily by more than 400 million people [1]. The language is classified into two forms: Fus’ha (Modern Standard Arabic) and Ammiya (the dialects of Arabic). Fus’ha is normally used in public and official settings, as well as in news broadcasts, while Ammiya is more common in everyday conversations. More than 20 countries use Arabic, and there are currently about 30 dialects in the region.

Unwritten rules

The SRJO team, aware of the variants that these dialects present, used a series of techniques to discern and process the unique linguistic features inherent to each of them. This approach was crucial to ensuring Galaxy AI could understand and respond in a way that accurately reflected regional nuances.

“Unlike other languages, the pronunciation of the object in Arabic varies depending on the subject and verb of the sentence,” explains Mohammad Hamdan, project manager of the Arabic language development team. “Our goal is to develop a model that understands all these dialects and can respond in standard Arabic.”

Text-to-speech (TTS) is a component of Galaxy AI’s Live Translate feature, which allows users to interact with people who speak different languages by transcribing spoken words into written text and then voicing the translation. The TTS team faced a unique challenge due to a peculiarity of working with Arabic.

Arabic uses diacritics, which are guides to the pronunciation of words in some contexts, such as religious texts, poetry, and books for language learners. Diacritics are widely understood by native speakers, but are absent in everyday writing. This makes it difficult for a machine to convert raw text into phonemes, the basic units of sound that make up speech.

“There is a lack of reliable, high-quality data sets that accurately represent the correct use of diacritics,” Haweeleh explains. “We had to design a neural model that could predict and restore those lost diacritics with high accuracy.”

Neural models function similarly to human brains. To predict diacritics, a model has to study large amounts of Arabic text, learn the rules of the language, and understand how words are used in different contexts. For example, the pronunciation of a word can vary greatly depending on the action or gender it describes. This extensive training was the key to improving the accuracy of the Arabic TTS model.
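The restoration task described above can be framed as predicting a diacritized form for each bare word in context. The toy sketch below illustrates only the shape of the problem: the lookup table stands in for a trained neural model's predictions, and all words and function names here are hypothetical examples, not Samsung's implementation.

```python
# Illustrative sketch of diacritic restoration. A production system uses a
# trained neural model whose prediction depends on sentence context; here a
# tiny lookup table stands in for that learned mapping (hypothetical data).

# Map each bare (undiacritized) word to one plausible diacritized reading.
LEXICON = {
    "كتب": "كَتَبَ",      # one reading: "he wrote"
    "الولد": "الوَلَدُ",  # "the boy"
}

def restore_diacritics(sentence: str) -> str:
    """Replace each word with its predicted diacritized form.

    Words the model has never seen are passed through unchanged.
    """
    return " ".join(LEXICON.get(word, word) for word in sentence.split())

print(restore_diacritics("كتب الولد"))
```

Once the diacritics are restored, converting text to phonemes becomes far more tractable, since the short vowels that determine pronunciation are again explicit in the text.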

Improving understanding

The SRJO team also had to collect audio recordings of the dialects from various sources, which then had to be transcribed with a focus on unique sounds, words and phrases. “We assembled a team of native speakers of the dialects who knew the nuances and variants well,” says Ayah Hasan, whose team was responsible for creating the database. “They listened to the recordings and manually converted the spoken words into text.”

This work was instrumental in improving the automatic speech recognition (ASR) process so that Galaxy AI could handle the wide variety of Arabic dialects. ASR is essential for Galaxy AI to understand and respond in real time.
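Putting the pieces together, a Live Translate-style flow chains the stages described in this article: ASR turns dialectal speech into text, translation maps it to the target language, and TTS voices the result. The sketch below shows only that pipeline shape; every function is a hypothetical stand-in with canned output, not Galaxy AI's actual on-device API.

```python
# Illustrative Live Translate-style pipeline: ASR -> translation -> TTS.
# All functions are hypothetical placeholders with canned results.

def asr(audio: bytes, dialect: str) -> str:
    """Speech-to-text: a multi-dialect model maps audio to Arabic text."""
    return "مرحبا"  # canned recognition result ("hello") for this sketch

def translate(text: str, target_lang: str) -> str:
    """Machine translation from the recognized text to the target language."""
    return "Hello" if text == "مرحبا" else text

def tts(text: str) -> bytes:
    """Text-to-speech: in Arabic, diacritics are restored before the text
    is converted to phonemes and then to audio. Placeholder waveform here."""
    return text.encode("utf-8")

def live_translate(audio: bytes, dialect: str, target_lang: str) -> bytes:
    """Chain the three stages into one speech-to-speech translation call."""
    return tts(translate(asr(audio, dialect), target_lang))

print(live_translate(b"...", "levantine", "en"))
```

In this framing, the team's dialect transcription work improves the first stage, while the diacritic-restoration model improves the last.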

“Building an ASR system that supports multiple dialects in a single model is a complex undertaking,” says Mohammad Hamdan, ASR manager on the project. “It requires a deep understanding of the intricacies of the language, careful data selection, and advanced modeling techniques.”

The culmination of innovation

After months of planning, building and testing, the team was ready to launch Arabic as a language option for Galaxy AI, allowing many more people to communicate across borders. This unique team made Galaxy AI services available to Arabic speakers, reducing linguistic and cultural barriers between them and people around the world. In doing so, it established new best practices that can be extended to other regions. This success is just the beginning: the team continues to refine its models and improve the quality of Galaxy AI’s linguistic capabilities.

In the next installment, we will travel to Vietnam to see how language data is refined and what it takes to train an effective AI model.

Arabic is one of the languages and dialects available with Galaxy AI and can be downloaded from the Settings app. Galaxy AI language features such as Live Translate and Interpreter are available on Galaxy devices with Samsung’s One UI 6.1 update.[2]

[1] UNESCO, World Arabic Language Day 2023, https://www.unesco.org/en/world-arabic-language-day

[2] One UI 6.1 was first released on Galaxy S24 series devices, with a wider rollout to other Galaxy devices including the S23, S23 FE, S22, S21, Z Fold5, Z Fold4, Z Fold3, Z Flip5, Z Flip4, Z Flip3, Tab S9 and Tab S8.
