GCC AI Research

Results for "ALLM"

ALLaM: Large Language Models for Arabic and English

arXiv

The paper introduces ALLaM, a series of large language models for Arabic and English, designed to support Arabic Language Technologies. The models are trained with language alignment and knowledge transfer in mind, using a decoder-only architecture. ALLaM achieves state-of-the-art results on Arabic benchmarks like MMLU Arabic and Arabic Exams. Why it matters: This work advances Arabic NLP by providing high-performing LLMs and demonstrating effective techniques for cross-lingual transfer learning and alignment with human preferences.

UI-Level Evaluation of ALLaM 34B: Measuring an Arabic-Centric LLM via HUMAIN Chat

arXiv

This paper presents a UI-level evaluation of ALLaM-34B, an Arabic-centric LLM developed by SDAIA and deployed in the HUMAIN Chat service. The evaluation used a prompt pack spanning various Arabic dialects, code-switching, reasoning, and safety, with outputs scored by frontier LLM judges. Results indicate strong performance in generation, code-switching, MSA handling, reasoning, and improved dialect fidelity, positioning ALLaM-34B as a robust Arabic LLM suitable for real-world use.

Introducing the Open Arabic LLM Leaderboard: Empowering the Arabic Language Modeling Community

TII

The Open Arabic LLM Leaderboard (OALL) has been launched to benchmark Arabic language models, addressing the gap in resources for non-English NLP. It incorporates datasets such as AlGhafa and ACVA, along with translated versions of MMLU and EXAMS from the AceGPT suite. The leaderboard is built around Hugging Face's LightEval framework and scores tasks using normalized log-likelihood accuracy. Why it matters: This initiative promotes research and development in Arabic NLP by giving the community a shared way to evaluate and improve Arabic LLMs, which serve more than 380 million Arabic speakers.
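
For readers unfamiliar with the metric, the following is a minimal sketch of how a normalized log-likelihood accuracy can be computed for multiple-choice tasks. It assumes normalization by the character length of each answer choice, which is one common convention; the summary above does not say which normalization the leaderboard applies. The function names, the example structure, and the log-likelihood values are hypothetical and for illustration only.

```python
from typing import Dict, List, Sequence


def pick_choice(logprobs: Sequence[float], choices: Sequence[str]) -> int:
    """Return the index of the choice with the highest length-normalized score.

    Assumption: each summed log likelihood is divided by the character
    length of its choice, so longer answers are not penalized merely for
    containing more tokens.
    """
    scores = [lp / max(len(choice), 1) for lp, choice in zip(logprobs, choices)]
    return max(range(len(scores)), key=scores.__getitem__)


def normalized_accuracy(examples: List[Dict]) -> float:
    """Fraction of examples where the normalized pick matches the gold index."""
    correct = sum(
        pick_choice(ex["logprobs"], ex["choices"]) == ex["gold"]
        for ex in examples
    )
    return correct / len(examples)


# Made-up log likelihoods: the raw scores favor the short answer,
# while the length-normalized scores favor the longer one.
example = {
    "choices": ["Cairo", "Riyadh is the capital"],
    "logprobs": [-4.0, -6.0],
    "gold": 1,
}
picked = pick_choice(example["logprobs"], example["choices"])
accuracy = normalized_accuracy([example])
```

In this toy case the raw log likelihoods would select the first choice, while dividing by length selects the second, which is why the choice of normalization matters when comparing models on the same benchmark.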

The Landscape of Arabic Large Language Models (ALLMs): A New Era for Arabic Language Technology

arXiv

This article surveys the landscape of Arabic Large Language Models (ALLMs), tracing their evolution from early text processing systems to sophisticated AI models. It highlights the unique challenges and opportunities in developing ALLMs for the 422 million Arabic speakers across 27 countries. The paper also examines the evaluation of ALLMs through benchmarks and public leaderboards. Why it matters: ALLMs can bridge technological gaps and empower Arabic-speaking communities by catering to their specific linguistic and cultural needs.

Empowering Large Language Models with Reliable Reasoning

MBZUAI

Liangming Pan from UCSB presented research on building reliable generative AI agents by integrating symbolic representations with LLMs. The neuro-symbolic strategy combines the flexibility of language models with precise knowledge representation and verifiable reasoning. The work covers Logic-LM, ProgramFC, and learning from automated feedback, aiming to address LLM limitations in complex reasoning tasks. Why it matters: Improving the reliability of LLMs is crucial for high-stakes applications in finance, medicine, and law within the region and globally.

Cultural inclusivity in AI: A new benchmark dataset on 100 languages

MBZUAI

MBZUAI researchers have released ALM Bench, a benchmark dataset for evaluating multimodal LLMs on culturally grounded visual question answering across 100 languages. The dataset includes over 22,000 question-answer pairs across 19 categories, with a focus on low-resource languages and cultural nuances, including three Arabic dialects. The researchers tested 16 open- and closed-source multimodal LLMs on it, and the results reveal a significant need for greater cultural and linguistic inclusivity. Why it matters: The benchmark aims to improve the inclusivity of multimodal AI systems by addressing the underrepresentation of low-resource languages and cultural contexts.

Palm: A Culturally Inclusive and Linguistically Diverse Dataset for Arabic LLMs

arXiv

The paper introduces Palm, a culturally inclusive and linguistically diverse dataset for Arabic LLMs. It covers 22 Arab countries and features instructions in both Modern Standard Arabic (MSA) and dialectal Arabic (DA) across 20 topics. The dataset was built through a year-long, community-driven project involving 44 researchers from across the Arab world. Evaluating frontier LLMs on the dataset reveals limitations in cultural and dialectal understanding, with some countries better represented than others.