GCC AI Research

AceGPT, Localizing Large Language Models in Arabic

arXiv · Significant research

Summary

Researchers introduce AceGPT, a large language model (LLM) localized for Arabic to address the cultural sensitivities and local values that mainstream models represent poorly. AceGPT combines further pre-training on Arabic text, supervised fine-tuning on native Arabic instructions paired with GPT-4 responses, and reinforcement learning with AI feedback (RLAIF) guided by a reward model attuned to local culture. In the reported evaluations, AceGPT achieves state-of-the-art performance among open Arabic LLMs across several benchmarks. Why it matters: This work advances culturally aware AI development for Arabic-speaking communities, providing a valuable resource and benchmark for future research.


Related

ArabianGPT: Native Arabic GPT-based Large Language Model

arXiv

The paper introduces ArabianGPT, a suite of transformer-based language models designed specifically for Arabic, with 0.1B- and 0.3B-parameter versions. A key component is the AraNizer tokenizer, tailored to Arabic script and morphology. Fine-tuning ArabianGPT-0.1B raised sentiment-analysis accuracy from 56% for the base model to 95% and improved F1 scores on summarization. Why it matters: The models address the gap in native Arabic LLMs, offering better performance on Arabic NLP tasks through tailored architecture and tokenization.

Introducing the Open Arabic LLM Leaderboard: Empowering the Arabic Language Modeling Community

TII

The Open Arabic LLM Leaderboard (OALL) has been launched to benchmark Arabic language models, addressing the gap in resources for non-English NLP. It incorporates datasets such as AlGhafa and ACVA, along with translated versions of MMLU and EXAMS from the AceGPT suite. The leaderboard is built around Hugging Face's LightEval framework and scores tasks by normalized log-likelihood accuracy. Why it matters: This initiative promotes research and development in Arabic NLP, serving over 380 million Arabic speakers by enhancing the evaluation and improvement of Arabic LLMs.
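
To make the scoring idea above concrete, here is a minimal sketch of one common form of normalized log-likelihood accuracy for multiple-choice tasks. It assumes each answer choice's summed log likelihood is divided by the choice's character length before the best choice is picked; the summary does not say which normalization the leaderboard actually uses, and the function names and sample numbers below are hypothetical, not part of LightEval.

```python
from typing import Dict, List, Sequence


def pick_normalized(choices: Sequence[str], log_likelihoods: Sequence[float]) -> int:
    """Return the index of the choice with the highest length-normalized score.

    Assumption: normalization divides the summed log likelihood by the
    character length of the choice text, so long answers are not penalized
    merely for containing more tokens.
    """
    if len(choices) != len(log_likelihoods):
        raise ValueError("each choice needs exactly one log likelihood")
    scores = [ll / max(len(text), 1) for text, ll in zip(choices, log_likelihoods)]
    return max(range(len(scores)), key=scores.__getitem__)


def normalized_accuracy(examples: List[Dict]) -> float:
    """Fraction of examples whose normalized pick matches the gold index."""
    correct = sum(
        pick_normalized(ex["choices"], ex["log_likelihoods"]) == ex["gold"]
        for ex in examples
    )
    return correct / len(examples)


# Hypothetical scores; in practice a model would supply the log likelihoods.
EXAMPLES = [
    {"choices": ["yes", "not at all"], "log_likelihoods": [-3.0, -6.0], "gold": 1},
    {"choices": ["a", "bb"], "log_likelihoods": [-1.0, -4.0], "gold": 0},
]

if __name__ == "__main__":
    print(normalized_accuracy(EXAMPLES))
```

In the first hypothetical example the raw log likelihood favors "yes" (-3.0 against -6.0), while the per-character score favors "not at all" (-0.6 against -1.0), which is the kind of length bias the normalization is meant to offset.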

AraDiCE: Benchmarks for Dialectal and Cultural Capabilities in LLMs

arXiv

Researchers introduce AraDiCE, a benchmark for Arabic Dialect and Cultural Evaluation, comprising seven synthetic datasets in various dialects and Modern Standard Arabic (MSA). The benchmark includes approximately 45,000 post-edited samples and evaluates LLMs on dialect comprehension, generation, and cultural awareness across the Gulf, Egypt, and Levant. Results show that Arabic-specific models like Jais and AceGPT outperform multilingual models on dialectal tasks, but challenges remain in dialect identification, generation, and translation. Why it matters: This benchmark and associated datasets will help improve LLMs' ability to understand and generate diverse Arabic dialects and cultural contexts, addressing a significant gap in current models.

Large Language Models and Arabic Content: A Review

arXiv

This study reviews the use of large language models (LLMs) for Arabic language processing, focusing on pre-trained models and their applications. It highlights the challenges in Arabic NLP due to the language's complexity and the relative scarcity of resources. The review also discusses how techniques like fine-tuning and prompt engineering enhance model performance on Arabic benchmarks. Why it matters: This overview helps consolidate research directions and benchmarks in Arabic NLP, guiding future development of LLMs tailored for the Arabic language and its diverse dialects.