Atlas-Chat: Adapting Large Language Models for Low-Resource Moroccan Arabic Dialect

arXiv · September 26, 2024 · Significant research

Summary

Researchers developed Atlas-Chat, a collection of LLMs for dialectal Arabic that focuses on Moroccan Arabic (Darija). They built the instruction dataset by consolidating existing Darija language resources and translating English instructions into Darija. The Atlas-Chat models (2B, 9B, and 27B parameters) outperform both state-of-the-art and Arabic-specialized LLMs such as LLaMA, Jais, and AceGPT on Darija NLP tasks.

Why it matters: This work addresses the gap in LLM support for low-resource Arabic dialects, and it provides both an instruction-tuning methodology and benchmarks for future research.

Keywords

Atlas-Chat · Darija · Moroccan Arabic · LLM · Instruction Tuning

